Understanding the human brain is among the greatest challenges facing scientists in the twenty-first century. The Hodgkin-Huxley (HH) model is one of the most successful mathematical models for biologically realistic simulation of the brain. However, simulating HH neurons is computationally expensive, which makes large-scale brain networks difficult to implement. In this paper, we propose a hardware architecture that efficiently computes large-scale networks of HH neurons. It builds on the neuron machine hardware architecture, whose speed is limited because it has only one computation node. The proposed architecture is essentially a non-von Neumann synchronous system with multiple computation nodes, called hardware neurons, to achieve linear speedup. To describe the architecture in detail, we present as an example the design of a digital circuit that computes large-scale networks of HH neurons. The design supports axonal conduction delays of spikes and synapses with short- and long-term plasticity, along with HH neurons computed in floating-point precision. Implemented on a field-programmable gate array (FPGA) chip, the design computes a network of one million HH neurons in near real time, and the implemented system can compute networks of up to 12 million HH neurons and 600 million synapses. The proposed design method can facilitate the design of systems that support complex neuron models and their flexible implementation on reconfigurable FPGA chips.