A robust and scalable neuromorphic communication system by combining synaptic time multiplexing and MIMO-OFDM.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, VOL. 25, NO. 3, MARCH 2014


A Robust and Scalable Neuromorphic Communication System by Combining Synaptic Time Multiplexing and MIMO-OFDM

Narayan Srinivasa, Senior Member, IEEE, Deying Zhang, and Beayna Grigorian

Abstract— This paper describes a novel architecture for enabling robust and efficient neuromorphic communication. The architecture combines two concepts: 1) synaptic time multiplexing (STM) that trades space for speed of processing to create an intragroup communication approach that is firing rate independent and offers more flexibility in connectivity than cross-bar architectures and 2) a wired multiple input multiple output (MIMO) communication with orthogonal frequency division multiplexing (OFDM) techniques to enable a robust and efficient intergroup communication for neuromorphic systems. The MIMO-OFDM concept for the proposed architecture was analyzed by simulating large-scale spiking neural network architecture. Analysis shows that the neuromorphic system with MIMO-OFDM exhibits robust and efficient communication while operating in real time with a high bit rate. Through combining STM with MIMO-OFDM techniques, the resulting system offers a flexible and scalable connectivity as well as a power and area efficient solution for the implementation of very large-scale spiking neural architectures in hardware.

Index Terms— Communication, multiple input multiple output (MIMO), neuromorphic systems, orthogonal frequency division multiplexing (OFDM), routing, scalable architecture, spiking neurons, synapses.

I. INTRODUCTION

THE biological brain is a complex, nonlinear system with highly efficient communication within local and between distant brain regions. An important mode of communication is made via action potentials, or spikes, that encode analog information in the interspike interval. There are two major principles by which the brain is organized to enable this communication [1]. The degree of local clustering is the first principle where each neuron is densely connected to neurons within its local neighborhood. If the brain were to communicate serially based on only such local clustering, then it would take thousands of synapses to reach from one part of the brain to another. This would be disadvantageous for the animal to avoid danger in a changing world. With the divergence of an average cortical pyramidal neuron, each neuron can transmit information to an average of $10^4$ selected peers [1], [2]. Such connections create a small-world network and form the second principle by which the degree of separation between neurons is reduced. It appears that the brain adopts both these principles to ensure efficient large-scale traffic with minimum amount of wiring [1], [3].

In electronic implementations of brainlike systems (also referred to as neuromorphic systems [4]), as the system size (measured by the number of neurons and synapses) grows, it becomes necessary to organize these neurons and synapses into multiple neuron groups that reside in multiple chips. This is due to the limits on chip die size that can be fabricated using current CMOS processes [5]. Our group developed the synaptic time-multiplexing (STM) concept [17], [23], [58] as a way to address these limitations. Traditionally, the address event representation (AER) [6]–[8] has been the protocol of choice for implementing intergroup communication. AER employs time-multiplexed encoding of spiking data from several groups of neurons into a single communication bus enabling efficient communication. In these designs, transceivers encode and decode spikes over a small set of high-speed wires by encoding each axon with a unique binary representation called an address event. To save hardware real estate, neurons are grouped together to share a common encoder and a decoder. The address packets generated during spike events are transferred and delivered by routers [9]–[11]. These routers have been used in neuromorphic systems such as SpiNNaker [12], [13], CAVIAR [14] and others [64], [65]. The packets are delivered on a neuron-by-neuron basis in the network where the packets are sequentially decoded, searched through a lookup table, delivered to the router, and eventually forwarded to the appropriate target neuron. If there are conflicts during the route assignments, an arbiter is employed to resolve the conflict.

Manuscript received November 27, 2012; revised July 21, 2013; accepted August 27, 2013. Date of current version February 14, 2014. This work was supported by the Defense Advanced Research Projects Agency SyNAPSE under Grant HR0011-09-C-001.
N. Srinivasa and D. Zhang are with the Center for Neural and Emergent Systems, Information and System Sciences Department, HRL Laboratories LLC, Malibu, CA 90265 USA.
B. Grigorian is with the Department of Computer Science, UCLA, Los Angeles, CA 90095 USA.
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TNNLS.2013.2280126
This neuron-by-neuron packet delivery approach (also referred to as point-to-point connectivity [6], [14], [24], [64]) can result in the following issues for neuromorphic intergroup communication. First, the data rate and capacity for such neuromorphic network communication can be limited [13]. Second, deadlock and livelock [13], [24], [63], [64] are common problems that can postpone packet delivery indefinitely, so that packets never reach their destination, resulting in timing errors in spikes. This affects the performance and accuracy of networks that rely on spike timing-based mechanisms such as spike timing-dependent plasticity (STDP) or other timing-dependent mechanisms [15]. Third, if the system is subject to

2162-237X © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.



Fig. 1. (a) STM concept to model large-scale neuromorphic architectures, showing the breakdown of an STM cycle into separate STM timeslots. (b) Time-domain diagram of a spike signal and an STM cycle. The STM cycle is ∼10 times shorter than a typical spike period [23]. The STM cycle is composed of multiple timeslots. The synaptic states and routing fabric are updated in each timeslot. (c) Abstracted neural fabric design with abstraction of a nodal element (circle in the center) and its associated switching fabric. (d) Simple 3 × 3 grid of abstracted neural hardware showing the neural fabric with nine nodal elements.

a traffic jam at a certain node, it could result in failure in system communication [16]. Finally, the lookup table in each node can consume substantial memory to implement the system.

Another issue in designing large-scale neuromorphic systems is connectivity: the circuit must be capable of having both short- and long-range connections between neurons. The main question is how to implement a connectivity of $10^4$ synapses/neuron using CMOS technology [17], [18]. One approach to address this question has been to integrate CMOS with nanotechnology [19], [20] to achieve the required synaptic densities. These solutions use crossbar architectures predominantly, but the connectivity challenge remains a daunting task for such solutions [21], [22], [58]. To meet this challenge, our group developed a novel STM approach along with a neural fabric design [17], [23], [58]. The focus of this paper is to describe a novel approach for communication in large-scale neuromorphic systems by combining the flexible and scalable connectivity of STM with multiple input multiple output (MIMO)-orthogonal frequency division multiplexing (OFDM) techniques that enable efficient transmission of spike trains to multiple destinations.

This paper is organized as follows. In Section II, a description of the three fundamental building blocks (STM, OFDM, and MIMO) is provided. In Section III, the hardware platform and architecture design that combines MIMO and OFDM for neuromorphic systems are described along with a discussion

of features that make this design a desirable choice for neuromorphic communication. The system analysis conducted to evaluate the performance of the proposed communication system is described in Section IV. An example design of a neuromorphic system that combines STM and MIMO-OFDM is described in Section V. In Section VI, various features and extensions of the proposed communication system are discussed and the approach is contrasted with other well-known neuromorphic systems. Concluding remarks are provided in Section VII.

II. BACKGROUND

A. Synaptic Time Multiplexing

Typical neurons in the brain have a firing rate that ranges from 0.5 to 100 Hz, with momentary excursions to higher or lower frequencies [1]. In contrast, modern electronics have advanced, as noted in [4], by exponentially increasing the clock speed (into the gigahertz range) and by increasing transistor density. The key idea in STM is to exploit this difference in operating speed between electronics and mammalian brains and trade off space for speed of processing to address the scalability and connectivity challenges as described in [23]. As illustrated by a simple example in Fig. 1(a), the set of three decoupled networks on the left, in a three-timeslot sequence, provides the same set of connections as the neural


network with high synaptic connection density shown on the right. By integrating all the synaptic inputs of a given neuron in sequence rather than in parallel, the STM concept reproduces the fully connected network while reducing the hardware requirements to only a few physical synapses/neuron and storage of the other synapse states. During this process, the sequential steps are operated at a much higher frequency than the maximum brain operating speed. This operating frequency is referred to hereinafter as the STM frequency. The set of STM timeslots needed to describe all the synapses is referred to as the STM cycle, as shown in Fig. 1(b), and its cycle time determines the STM frequency. This feature enables decoupled networks to be processed sequentially at each STM timeslot until all of them are covered. The sum of the durations of all these STM timeslots makes up the total system time or STM cycle, and all the network connections for the complete network are realized at the end of the STM cycle.

To realize the concept of STM in CMOS, we have developed a neural fabric design that is part of the analog core and amenable to supporting spiking neurons [17], [23], [58]. An example of this fabric design is shown in Fig. 1(c). It consists of a network of nodal elements and fabric switches. In each nodal element, there are the following components [17], [23], [58]: 1) a neuron; 2) synapses with STDP; 3) wires and switches for routing; 4) local analog memory; and 5) local digital memory for storing connectivity information between a local group of neurons. The neuron is not time multiplexed and it operates in continuous time. Each physical synapse is time multiplexed to implement multiple virtual synapses. The number of virtual synapses implemented by each physical synapse is equal to the number of STM timeslots in one STM cycle [17], [23], [58].
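The timeslot idea above can be illustrated with a minimal sketch. It assumes a greedy scheduler that is not taken from the paper: each connection is placed in the first timeslot in which its postsynaptic neuron still has a free physical synapse, and synaptic input is then accumulated slot by slot over one STM cycle. The function names, the three-neuron network, and the unit weights are all hypothetical and serve only to show that the union of the decoupled per-slot networks reproduces the full connection set.

```python
from collections import defaultdict

def schedule_timeslots(connections, physical_synapses_per_neuron):
    """Greedily assign each (pre, post) connection to an STM timeslot so that no
    postsynaptic neuron needs more physical synapses than it has in any one slot.
    (Illustrative scheduler; the paper does not prescribe this algorithm.)"""
    slots = []  # slots[t] holds the connections active in timeslot t
    for pre, post in connections:
        for slot in slots:
            used = sum(1 for _, p in slot if p == post)
            if used < physical_synapses_per_neuron:
                slot.append((pre, post))
                break
        else:
            # No existing slot has a free physical synapse: open a new timeslot.
            slots.append([(pre, post)])
    return slots

def integrate_stm_cycle(slots, spikes, weights):
    """Accumulate the synaptic input of every neuron sequentially over one STM cycle."""
    total = defaultdict(float)
    for slot in slots:  # one routing-fabric configuration per timeslot
        for pre, post in slot:
            if spikes.get(pre, False):
                total[post] += weights[(pre, post)]
    return dict(total)

# Hypothetical fully connected three-neuron network without self-connections.
neurons = ["a", "b", "c"]
connections = [(i, j) for i in neurons for j in neurons if i != j]
slots = schedule_timeslots(connections, physical_synapses_per_neuron=1)
weights = {c: 1.0 for c in connections}
inputs = integrate_stm_cycle(slots, {"a": True, "b": True, "c": False}, weights)
```

In this toy case each neuron has two inputs but only one physical synapse, so two timeslots are needed, which matches the statement that the number of virtual synapses per physical synapse equals the number of timeslots in one STM cycle.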
To route spikes between neurons within different nodal elements, the neural fabric consists of axonal grid lines with additional diagonal fabric switches as shown in Fig. 1(c). The common feature between AER and STM is that both use time division multiple access links to save on routing hardware. AER uses a bus architecture where processing units (neurons) use encoders and decoders to communicate with each other using a fixed communication path. STM, on the other hand, is more like a set of direct point-to-point connections that needs to be modified several times before each processing unit has received all of its inputs. Unlike AER, the STM approach is not spike event driven and the routing fabric is updated after each timeslot. Moreover, in the STM approach, an address to encode the destination of spikes is not required and routing is triggered based on a global STM clock. In STM, point-to-point connections are made to send parallel signals, whereas in AER a common bus is connected to all the nodes to send a common signal. Another important difference between AER and STM is that STM is not affected by the firing rate of the network, but AER is [24]. In AER, this can result in traffic jams and timing errors. The downside of STM, however, is the constant switching of the routing fabric, which can be power consuming, but this mechanism allows the spike transmission to be insensitive to the firing rate. It is possible to optimize the amount of switching to reduce power


consumption and we will highlight how this can be realized in Section V when STM is combined with MIMO-OFDM. For further details on STM, its hardware implementation is described in [17], [23], [37], and [58]; its comparison with other neuromorphic communication approaches with and without AER can be found in [23] and [58].

B. Orthogonal Frequency Division Multiplexing

OFDM is a broadband multicarrier modulation method based on the concept of FDM that enables the transmission of multiple data streams over a common broadband medium. OFDM offers superior performance over traditional single-carrier modulation methods and has become a widely used technique for various media including the radio spectrum [25]–[30], optical [31], [32], underwater acoustic [33], radar [34], and cable systems [35]. An OFDM transmitter divides a broadband channel into many narrow-band, low-rate, and frequency nonselective subchannels or subcarriers. These subcarriers are orthogonal to each other so as to maximally reduce intercarrier interference (ICI). Therefore, multiple data or symbols can be transmitted in parallel while maintaining a high spectral efficiency. Each subcarrier may also deliver information to a different user, resulting in a simple multiple access scheme known as OFDM access, which enables different media such as video, graphics, speech, text, or other data to be transmitted using the same link independently and in parallel. In this paper, this idea is exploited to design a novel neuromorphic intergroup communication system where each neuron group communicates with other neuron groups through OFDM-based spike train transmission and reception techniques, as will be described in Section III.

C. MIMO Systems

MIMO is effectively a radio antenna technology [28], [36] that uses the idea of space-time signal processing in which time (the natural dimension of digital communication data) is complemented with the spatial dimension inherent in the use of multiple spatially distributed antennas (i.e., the use of multiple antennas located at different points). This principle of diversity in MIMO systems provides the receiver with multiple versions of the same signal. If these multiple versions are affected in different ways by the signal path, the probability that they all will be affected at the same time is considerably reduced. Accordingly, diversity helps to stabilize a signal path and improve performance, thereby reducing the error rate. Unlike wireless systems where signals travel long distances and are affected by signal fading and interference, the neuromorphic routing systems described in this paper are designed with wired routing channels with relatively short distances over which signals are transmitted. This makes the proposed neuromorphic system simpler and obviates the need for channel estimation, error correction coding schemes, and space-time frequency coding. In the proposed approach, each neuron group transmits spike timing data to a central router. After data processing and reorganization in the central router, the central router transmits the various spike timing data to each neuron group via transmitters in the router. The interface



between central router and neuron groups forms the MIMO unit in the proposed neuromorphic system. Furthermore, each neuron group can be treated as a specific user by the central router. Combining OFDM techniques for signal modulation with a MIMO interface creates a MIMO-OFDM platform for intergroup communication.

D. Neurons, Synapses, and STDP Computations

To understand the neuromorphic system that is being considered in this paper, it is important to describe the various neural and synaptic computations that occur within each neuron group. The neuron group will be based on spiking neural networks where the neurons integrate incoming synaptic currents and fire action potentials, or spikes, when the net integrated current exceeds a threshold. An example of spiking neuronal dynamics and electronic implementations of these neurons can be found in [37]. The synapse is the junction between two interconnected neurons. One terminal of the synapse is associated with the neuron providing information (referred to hereinafter as the presynaptic neuron). The other terminal is associated with the neuron receiving information (referred to hereinafter as the postsynaptic neuron). The synaptic conductance of each synapse is internally adjusted according to STDP, a mechanism discovered in the brain [38]–[40]. STDP modulates the synaptic conductance based on the timing difference between the spikes of the presynaptic neuron and the postsynaptic neuron. As shown in Fig. 2(a), if the timing difference is positive, then the synapse undergoes depression or a reduction in conductance w. If the timing difference is negative, then the synapse undergoes potentiation or an increase in conductance. The potentiation P and depression D values at a synapse [magenta and yellow traces in Fig. 2(b)] evolve as dictated by two exponential decay functions with time constants τ+ and τ− of the STDP curve, respectively ((2) and (3) in [37]).
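The exponentially decaying P and D values can be sketched as follows. This is an illustration under stated assumptions, not the circuit model of [37]: each presynaptic spike is assumed to add A+ to P and each postsynaptic spike to add A− to D, with both values decaying exponentially afterwards. The constants are those quoted in the Fig. 2 caption (τ+ = 30 ms, τ− = 45 ms, A+ = 20 mV, A− = −10 mV); the function names are hypothetical, and the rule that converts P and D into a conductance change is deliberately left out because the text does not specify it in closed form.

```python
import math

TAU_PLUS_MS = 30.0   # potentiation time constant (value from the Fig. 2 caption)
TAU_MINUS_MS = 45.0  # depression time constant (value from the Fig. 2 caption)
A_PLUS = 20.0        # jump applied to P at each presynaptic spike, in mV
A_MINUS = -10.0      # jump applied to D at each postsynaptic spike, in mV (sign assumed)

def trace_value(spike_times_ms, t_ms, amplitude, tau_ms):
    """Sum of exponentially decaying jumps from all spikes at or before t_ms."""
    return sum(amplitude * math.exp(-(t_ms - s) / tau_ms)
               for s in spike_times_ms if s <= t_ms)

def p_trace(pre_spike_times_ms, t_ms):
    """Potentiation value P at time t_ms, driven by presynaptic spikes."""
    return trace_value(pre_spike_times_ms, t_ms, A_PLUS, TAU_PLUS_MS)

def d_trace(post_spike_times_ms, t_ms):
    """Depression value D at time t_ms, driven by postsynaptic spikes."""
    return trace_value(post_spike_times_ms, t_ms, A_MINUS, TAU_MINUS_MS)

# One presynaptic spike at 0 ms and one postsynaptic spike at 10 ms (made-up times).
p_at_30 = p_trace([0.0], 30.0)   # one time constant after the presynaptic spike
d_at_55 = d_trace([10.0], 55.0)  # one time constant after the postsynaptic spike
```

Because the decay is linear, summing one decaying exponential per spike is equivalent to incrementing the current trace value at each spike and letting it decay, which is how the text describes the updates.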
When a postsynaptic neuron fires a spike, D is decremented by an amount A− [Fig. 2(b)] relative to the current value of D. Similarly, every time a synapse receives a spike from a presynaptic neuron, P is incremented by an amount A+ relative to the current value of P as shown in Fig. 2(b). The net resultant change in the synaptic conductance w is based on the difference between P and D. This change Δw is added to the current value to produce a new synaptic conductance w, as shown in Fig. 2(b). In a large-scale neuromorphic system, neurons may reside in different neuron groups and have to communicate with each other via interconnect between or within the groups. In this case, the P value is transmitted as part of the spiking event information to the receiving neuron to facilitate STDP. The D value is computed at each neuron locally and thus does not have to be transmitted between neuron groups. Further details of a possible neuromorphic circuit implementation of synapse and STDP can be found in [37]. In addition to the spike timing information, the specific address of the postsynaptic neuron that is to receive the spike is also included as part of the spiking event information that is to be transmitted to another neuron group. The details of the neural and synaptic

computations and routing of spikes within a group can be found in [23], [37], and [58].

III. INTERGROUP COMMUNICATION USING MIMO-OFDM

The hardware design combines the features of MIMO and OFDM to construct a novel intergroup communication for neuromorphic systems. The proposed architecture for this system is inspired by the thalamocortical system found in the mammalian brain [1]. In particular, the dense local connectivity found in the laminar layers of cortex is analogous to a neuron group, while the router is analogous to the thalamus that serves to make connections between various cortical areas of the brain [Fig. 3(a)]. Fig. 3(b) shows the hardware platform and architecture of the proposed neuromorphic system. The system consists of one central router and Q neuronal groups (NGs). The connectivity within each NG in the system mimics a local region in the brain and exhibits dense local clustering. The communication between neurons within any NG (or intragroup communication) can be realized using the routing and STM method that was recently described in [23] and [58]. We will now describe the novel intergroup communication system based on MIMO-OFDM.

A. Overview of MIMO-OFDM System Architecture

For the system shown in Fig. 3(b), we assume that Q = 100. Each NG houses neurons, synapses, and STDP learning circuits (similar to those in [37]). Each NG also contains a transmitter (Tx) and a receiver (Rx) to facilitate intergroup communication. The signal transmission and reception are enabled using OFDM techniques as follows. The spike event data are transmitted from a presynaptic neuron in the ith NG to the jth NG using OFDM signal $A_{ij}$. This signal is encoded as a U-bit word composed of two subwords. The first is a Y-bit subword that encodes the P value that is required for making synaptic conductance changes at the synapse of the receiving postsynaptic neuron based on STDP.
The second is a Z-bit subword that encodes the unique address of the recipient neuron within the neuromorphic system. Each spike event is then mapped into S symbols through digital modulation via the binary phase-shift keying (BPSK) or quadrature phase-shift keying (QPSK) [36] method. The details on these modulation schemes in the context of neuromorphic system design will be provided in the next section. Each NG can transmit to M other neurons distributed within other NGs. In all, there will be a total of M ∗ Q (hereinafter referred to as MQ) such U-bit words or M ∗ S ∗ Q (hereinafter referred to as MSQ) symbols that need to be transmitted throughout the neuromorphic system. These MSQ symbols after modulation are subjected to an inverse discrete Fourier transform (IDFT) [36] creating a signal frame consisting of M ∗ S (hereinafter referred to as MS) OFDM symbols that is transmitted using the Tx in each NG to the central router using a single wire. The receiver Rx in the central router applies the DFT to recover the original M sets of S symbols. This approach is scalable and can implement very large-scale brainlike architectures for two reasons. The first



Fig. 2. STDP learning process. (a) STDP curve that shows the relationship between the change in weights (Δw) and the spike timing difference. The time constants τ+ and τ− are 30 and 45 ms, respectively. The values for A+ and A− are 20 and −10 mV, respectively. The reader is referred to [37] for more details on the electronic circuit implementation. (b) Dynamics of the internal states P (magenta trace) and D (yellow trace) and the dynamics of the total synaptic weight w, shown as a function of pre- and postsynaptic spikes. When the P trace is behind the D trace, the change in weight is negative, while it is positive for the opposite case. The w trace is a continuous function of the P and D values; its turn toward the positive direction begins at the time at which the D trace ascends to its peak value, while its turn toward the negative direction begins at the time at which the P trace ascends to its peak value. There is no change when both P and D traces are falling.

is that OFDM offers the best spectral efficiency measured in bit per second/cycle per second and results in achieving the maximum data capacity out of a channel [36]. The second is that the transmission to other NG requires only two wires (one for Tx and one for Rx) thereby minimizing the wiring space required for routing in the neuromorphic system. The central router [Fig. 4(a)] is composed of one central switching controller and one MIMO unit. The MIMO unit consists of Q transmitters and Q receivers [26], [36]. After receiving and demultiplexing the spike event data from each NG within the central router, the data are rearranged using a central switching controller [Fig. 4(b)]. This controller enables

all spike events in the form of U -bit words to be routed to Tx within the central router. Each Tx pools all spike inputs to the destination NG and then transmits the composite U -bit word to it. This process occurs in parallel for all NGs within the neuromorphic system. At the destination NG, the composite signal from the central router is received by corresponding Rx and then demultiplexed using DFT into N-bit words. These words are then routed to destination neurons within the NG. It should be noted that the simplest version of the controller [Fig. 4(b)] essentially provides wired connections to each of the Tx thereby implementing a routing table within the central router.
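The rearrangement performed by the central switching controller can be sketched as a table lookup followed by pooling per destination. This is a minimal illustration, assuming each demultiplexed spike event has already been decoded into a (destination address, P value) pair; the group names, addresses, P values, and the `route_spike_events` helper are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

def route_spike_events(received, neuron_to_group):
    """Pool decoded spike events by destination neuronal group.

    received:        {source_group: [(dest_neuron_address, p_value), ...]}
    neuron_to_group: {dest_neuron_address: destination_group}, i.e. the routing
                     table that the wired connections of the controller realize
    Returns {destination_group: [(dest_neuron_address, p_value), ...]}.
    """
    outgoing = defaultdict(list)
    for source_group in sorted(received):  # fixed order keeps the result deterministic
        for address, p_value in received[source_group]:
            outgoing[neuron_to_group[address]].append((address, p_value))
    return dict(outgoing)

# Hypothetical three-group system with four addressable neurons.
routing_table = {0: "NG1", 1: "NG1", 2: "NG2", 3: "NG3"}
received = {
    "NG1": [(2, 17), (3, 5)],
    "NG2": [(0, 9)],
    "NG3": [(1, 12), (2, 4)],
}
per_destination = route_spike_events(received, routing_table)
```

Each list in `per_destination` corresponds to the composite word that one Tx in the central router would transmit to its NG. The loop here is sequential only for readability; the hardware described in the text performs this step in parallel for all NGs.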



Fig. 3. Overall architecture of the proposed MIMO-OFDM architecture can be based on the brain architecture. (a) Each neuronal group can correspond to a macrocolumn (small circle) which has a laminar six layered architecture as shown here. (b) Architecture for a Q-neuronal group neuromorphic system will then correspond to a thalamocortical system as shown here. Each neuronal group transmits OFDM signal via single wires to a central controller and receives an OFDM signal that is decoded at the entry to the neuronal group before being routed as spikes within it. The central router is capable of composing the appropriate OFDM packets from the received transmissions from various neuronal groups.

This wired multiuser MIMO-OFDM neuromorphic hardware platform is a real-time routing system for spike events. Spike events that occur in different NGs are routed and delivered independently and in parallel. This feature makes the system tolerant to failures. The data rate and capacity can also be higher than in current node-by-node schemes. These advantages will be illustrated in Section IV via computer simulations of a biologically plausible spiking model using a GPGPU cluster simulator program called HRLSim [43].

B. Operational Details of MIMO-OFDM System Hardware

In each NG, U binary bits represent a spike event. These binary bit streams generated within an NG are transmitted by a set of subcarriers to a corresponding set of target groups. In each subchannel, the spike event bit stream is mapped into symbols through BPSK or QPSK. For BPSK, each bit in the bit stream corresponds to one symbol. Thus, there are a total of M symbols. The value of each symbol is 1,

if the value of the corresponding bit is 1. Otherwise, the value of the symbol will be −1. For QPSK, each pair of bits in the bit stream will be mapped into one symbol, the first bit for the real value of the symbol and the other for the imaginary value of the symbol. The value of the real or imaginary part of the symbol will be 1 if the corresponding bit value is 1. Otherwise, the value of the real or imaginary part of the symbol is −1. Therefore, for each U-bit spike event, there are U symbols for BPSK modulation or U/2 symbols for QPSK modulation. Separate OFDM subcarriers encode these symbols. Let N be the size of the fast Fourier transform (FFT) and inverse fast Fourier transform (IFFT) for the OFDM. Let $x^{CR}_{qm}[k]$ and $x^{RC}_{qm}[k]$ denote the mth OFDM symbol at the kth subcarrier in the qth group, where CR denotes symbols being transmitted from the group to the central router whereas RC denotes symbols being transmitted from the central router to the group, respectively. The OFDM symbols, $x^{CR}_{qm}[k]$, are transmitted from the group to the central router using the K OFDM subcarriers. These superscripts and subscripts will be used hereinafter for clarity.
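The bit-to-symbol mapping described above can be sketched in a few lines. The BPSK and QPSK rules follow the text directly; the `pack_spike_event` helper, its most-significant-bit-first ordering, and the example widths Y = 4 and Z = 4 are assumptions made only to produce a concrete U-bit word.

```python
def bpsk_modulate(bits):
    """Map each bit to one real symbol: 1 -> +1, 0 -> -1."""
    return [1.0 if b else -1.0 for b in bits]

def qpsk_modulate(bits):
    """Map each bit pair to one complex symbol: first bit real, second bit imaginary."""
    if len(bits) % 2:
        raise ValueError("QPSK needs an even number of bits")
    def level(b):
        return 1.0 if b else -1.0
    return [complex(level(bits[i]), level(bits[i + 1]))
            for i in range(0, len(bits), 2)]

def pack_spike_event(p_value, address, y_bits, z_bits):
    """Concatenate a Y-bit P value and a Z-bit address into a U-bit list.
    The bit ordering (MSB first, P value before address) is an assumption."""
    if not (0 <= p_value < 2 ** y_bits and 0 <= address < 2 ** z_bits):
        raise ValueError("value does not fit in the given subword width")
    u_bits = y_bits + z_bits
    word = (p_value << z_bits) | address
    return [(word >> (u_bits - 1 - i)) & 1 for i in range(u_bits)]

# Hypothetical spike event with Y = 4 and Z = 4, so U = 8.
bits = pack_spike_event(p_value=5, address=3, y_bits=4, z_bits=4)
bpsk_symbols = bpsk_modulate(bits)   # U symbols
qpsk_symbols = qpsk_modulate(bits)   # U/2 symbols
```

The symbol counts confirm the statement in the text: the same U-bit event occupies U symbols under BPSK and U/2 symbols under QPSK.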



Fig. 4. (a) Central router for processing and routing spike events as OFDM signals Ai j between various NGs is shown here. (b) Close up view of the central switching controller shows the wired connections between various T x and Rx that serve to route the spikes based on the connections between neurons from different NGs. CR (n) for the nth Each time domain OFDM symbol, Sqm symbol in N, is generated by IFFT in the transmitter of the qth group can be expressed as follows:

Thus, an OFDM symbol from the qth group can be represented using a column vector as CR T →Cr − CR CR S qm = Sqm (0)Sqm (1) · · · Sqm (N − 1) (2)

symbol. CP extends the OFDM symbol by padding the last L samples of the OFDM in front of the original OFDM of length N. This serves to provide a guard interval for OFDM symbols that is longer than the delay in the multipath channel. It also ensures that the samples in the subcarriers are orthogonal and thus, helps to overcome ISI. After adding CP, each OFDM symbol for the qth group can be represented as CR − →CR CR CR CR (−L) · · · Sqm (−1) Sqm (0) Sqm (1) · · · S qm = Sqm T CR Sqm (N − 1) . (3)

where T represents the transpose operator. To overcome the intersymbol interference (ISI) during transmission, the length P of cyclic prefix (CP) [36] is added in front of each OFDM

These OFDM symbols are transmitted to the central router through the MIMO interface between the NGs and the CR , received central router. The OFDM symbol vectors, Rqm

CR Sqm (n) =

K −1

CR x qm [k] · e j

2π N

k.n

.

(1)

K =0

592


Fig. 5. (a) Sequence of computational steps (represented by the four blocks) that transforms each spike event from a chip into an OFDM signal is shown on the left of the channel (pink box in the middle). The channel data is then sequentially processed (represented by the three blocks) to recover the digitally modulated signals inside the central router as shown by the three boxes to the right of the channel. The mathematical details of the computations are described in more detail in the text. (b) Sequence of computational steps that transforms each demodulated spike event in the central router into an OFDM signal is shown on the left of the channel box. The channel data are then sequentially processed to recover the digitally demodulated signals inside the chip into spike data that are sent to various neurons within the chip. The mathematical details of the computations are described in more detail in the text.

by the central router through various receivers can be expressed as

$$
\begin{bmatrix} R_{1m}^{CR} \\ R_{2m}^{CR} \\ \vdots \\ R_{Qm}^{CR} \end{bmatrix}
=
\begin{bmatrix}
h_{11}^{CR}(\tau,t) & h_{12}^{CR}(\tau,t) & \cdots & h_{1Q}^{CR}(\tau,t) \\
h_{21}^{CR}(\tau,t) & h_{22}^{CR}(\tau,t) & \cdots & h_{2Q}^{CR}(\tau,t) \\
\vdots & \vdots & \ddots & \vdots \\
h_{Q1}^{CR}(\tau,t) & h_{Q2}^{CR}(\tau,t) & \cdots & h_{QQ}^{CR}(\tau,t)
\end{bmatrix}
*
\begin{bmatrix} S_{1m}^{CR} \\ S_{2m}^{CR} \\ \vdots \\ S_{Qm}^{CR} \end{bmatrix}
+ W^{CR}
\tag{4}
$$

where '*' represents the discrete convolution operation between two vectors, $h_{qq'}^{CR}(\tau,t)$ represents the impulse response of the multipath channel between the transmitter in the $q$th group and the $q'$th receiver in the central router, and $\tau$ and $t$ are the delay and time of the channel impulse response due to multipath time and frequency fading channel effect, respectively. Major sources of corruption of channel signals are noise (thermal or due to interfering signals), multipath propagation that leads to ICI, and nonlinear distortion induced by operating the transmitter's power amplifier in the high-gain region. The additive white Gaussian noise (AWGN) channel model [36] is a Gaussian model for communication systems that is used to model various sources of white noise such as thermal noise that are common in electronic circuits. For the neuromorphic system under consideration in this paper, because different NGs are independent of each other and are connected via wired connections, nonlinear distortion is not an issue as the transmitter can transmit signals without the need to amplify at high gain. Thus, the noise in the neuromorphic system is linearly additive and AWGN serves as an appropriate model. In (4), $W^{CR}$ is the matrix for identically and independently distributed AWGN for each channel. Since each Tx is connected by wires to its corresponding Rx in the central router, (4) can be simplified as follows:

$$
\begin{bmatrix} R_{1m}^{CR} \\ R_{2m}^{CR} \\ \vdots \\ R_{Qm}^{CR} \end{bmatrix}
=
\begin{bmatrix}
h_{11}^{CR}(\tau,t) & 0 & \cdots & 0 \\
0 & h_{22}^{CR}(\tau,t) & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & h_{QQ}^{CR}(\tau,t)
\end{bmatrix}
*
\begin{bmatrix} S_{1m}^{CR} \\ S_{2m}^{CR} \\ \vdots \\ S_{Qm}^{CR} \end{bmatrix}
+
\begin{bmatrix} W_{1m}^{CR} \\ W_{2m}^{CR} \\ \vdots \\ W_{Qm}^{CR} \end{bmatrix}
\tag{5}
$$

where $W_{qm}^{CR}$ is the AWGN vector for each channel. The AWGN vector can be further expressed in terms of CP as follows:

$$
\vec{W}_{qm}^{CR} = \left[\, w_{qm}^{CR}(-L) \;\cdots\; w_{qm}^{CR}(-1) \;\; w_{qm}^{CR}(0) \;\; w_{qm}^{CR}(1) \;\cdots\; w_{qm}^{CR}(N-1) \,\right]^{T}. \tag{6}
$$
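As a reading aid, the diagonal structure of (5) can be written out row by row. The restatement below only rewrites (5) for a single group index $q$ and introduces no new symbols; it is a worked form, not an additional result from the paper.

```latex
% Row q of (5): each group sees only its own wired link to the central router.
R_{qm}^{CR} = h_{qq}^{CR}(\tau,t) * S_{qm}^{CR} + W_{qm}^{CR},
\qquad q = 1, 2, \ldots, Q .
```

Because every off-diagonal entry is zero, the $Q$ links decouple and can be analyzed one at a time, which is what the per-link expressions that follow rely on.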



The received signal vector, $\vec{R}_{qm}^{CR}$, in the central router can be expressed in the form of a column vector as

$$
\vec{R}_{qm}^{CR} = \left[\, r_{qm}^{CR}(-L) \;\cdots\; r_{qm}^{CR}(-1) \;\; r_{qm}^{CR}(0) \;\; r_{qm}^{CR}(1) \;\cdots\; r_{qm}^{CR}(N-1) \,\right]^{T}. \tag{7}
$$

Since the channel is a static channel (because of the wired connection) for this implementation of the neuromorphic system, the vector for channel impulse response can be simplified and expressed as a column vector as follows:

$$
\vec{h}_{qq}^{CR}(\tau,t) = \vec{h}_{q}^{CR}(\tau) = \left[\, h_{q}^{CR}(-L) \;\cdots\; h_{q}^{CR}(-1) \;\; h_{q}^{CR}(0) \;\; h_{q}^{CR}(1) \;\cdots\; h_{q}^{CR}(L-1) \,\right]^{T}. \tag{8}
$$

To further reduce ISI for OFDM, a square-root-raise-cosine FIR filter or square wave filter may be used for pulse shaping (PS). In either case, the channel behaves as a Nyquist filter [41], [42]. With the intrinsic property of the Nyquist filter, the impulse response of the channel can be expressed as

$$
h_{q}^{CR}(n) =
\begin{cases}
h_{q}^{CR}(0), & n = 0 \\
0, & \text{otherwise.}
\end{cases}
\tag{9}
$$

Therefore, the OFDM symbols received by the central router can be expressed in a simplified form as

$$
r_{qm}^{CR}(n) = \sum_{l=0}^{L-1} h_{q}^{CR}(l) \cdot s_{qm}^{CR}(n-l) + w_{qm}^{CR}(n)
= h_{q}^{CR}(0) \cdot s_{qm}^{CR}(n) + w_{qm}^{CR}(n) \tag{10}
$$

where $n$ can take values between $-L$ and $N-1$. The entire process of signal transmission from group to central router is shown in Fig. 5(a). After removing CP from the received OFDM symbols and performing FFT on them, the symbols, $Y_{qm}^{CR}[k]$, received by the central switch unit in the central router can be expressed as

$$
Y_{qm}^{CR}[k] = \frac{1}{N} \sum_{n=0}^{N-1} r_{qm}^{CR}(n) \cdot e^{-j\frac{2\pi}{N} k n}. \tag{11}
$$

Based on (1), (10), and (11), the following equations can be established:

$$
Y_{qm}^{CR}[k] = h_{q}^{CR}(0) \cdot x_{qm}^{CR}[k] + w_{qm}^{CR}[k] \tag{12}
$$

$$
w_{qm}^{CR}[k] = \frac{1}{N} \sum_{n=0}^{N-1} w_{qm}^{CR}(n) \cdot e^{-j\frac{2\pi}{N} k n}. \tag{13}
$$

Equation (12) shows that even if the symbols were to be degraded by AWGN, the symbols for information bits of spike event are simultaneously and independently transmitted from each group and received by the central router in parallel. After arriving at the central router, the symbols for spike events are rearranged as

$$
x_{qm}^{RC}[k] = Y_{km}^{CR}[q] \tag{14}
$$

where all the transmissions to be dispatched to a given destination group $q$ are regrouped into a single Tx in the central router based on the particular subcarrier allocated to that destination group. Let the sample period for the system be the duration $T_s$. The rearranged symbols, $x_{qm}^{RC}[k]$, are used to generate time domain OFDM symbols, $s_{qm}^{RC}(n)$, in each of its corresponding Tx in the central router through IFFT as follows:

$$
s_{qm}^{RC}(n) = \sum_{k=0}^{K-1} x_{qm}^{RC}[k] \cdot e^{\,j\frac{2\pi}{N} k n} \tag{15}
$$

where $n$ represents the $n$th sampling period. After adding CP in front of each OFDM symbol, they are transmitted to different NGs through the MIMO interface in the central router. Because the wired channel between central router and groups is a static Nyquist filter, with a similar process as above, the OFDM symbols, $r_{qm}^{RC}(n)$, received by each NG can be expressed as

$$
r_{qm}^{RC}(n) = \sum_{l=0}^{L-1} h_{q}^{RC}(l) \cdot s_{qm}^{RC}(n-l) + w_{qm}^{RC}(n)
= h_{q}^{RC}(0) \cdot s_{qm}^{RC}(n) + w_{qm}^{RC}(n) \tag{16}
$$

where $h_{q}^{RC}(l)$ is the channel impulse response for each channel between the central router and the neuron group, and $w_{qm}^{RC}(n)$ is the AWGN for each channel from central router to neuron group. Through a similar process as described above (for signal transmission), symbols for spike events, $Y_{qm}^{RC}[k]$, received by each group can be expressed as

$$
Y_{qm}^{RC}[k] = h_{q}^{RC}(0) \cdot x_{qm}^{RC}[k] + w_{qm}^{RC}[k] \tag{17}
$$

$$
w_{qm}^{RC}[k] = \frac{1}{N} \sum_{n=0}^{N-1} w_{qm}^{RC}(n) \cdot e^{-j\frac{2\pi}{N} k n} \tag{18}
$$

where the key difference compared with (12) and (13) is the direction of signal transmission (i.e., CR for symbols transmitted from the NG to central router and RC for symbols being transmitted from the central router to NG). Using (14), (17) can be further expressed as

$$
Y_{qm}^{RC}[k] = h_{q}^{RC}(0) \cdot h_{q}^{CR}(0) \cdot x_{km}^{CR}[q] + h_{q}^{RC}(0) \cdot w_{km}^{CR}[q] + w_{qm}^{RC}[k]. \tag{19}
$$
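The transmit and receive chain described by (10) through (19) can be checked numerically with a small sketch. The code below is a minimal illustration, not the authors' MATLAB model: it assumes the single-tap (Nyquist) channel of (9), uses a direct DFT instead of an FFT for clarity, and all function names (`idft`, `dft`, `ofdm_link`, and so on) are hypothetical helpers introduced here. The values N = 128, L = 32, and K = 100 are the ones the paper reports for its simulations.

```python
import cmath
import random

N = 128  # DFT size (value reported for the paper's simulations)
L = 32   # cyclic prefix length in samples
K = 100  # active subcarriers, one per neuron group


def idft(x, n_fft):
    """Time-domain samples as in (15): s(n) = sum_k x[k] * exp(+j*2*pi*k*n/N)."""
    return [
        sum(x[k] * cmath.exp(2j * cmath.pi * k * n / n_fft) for k in range(len(x)))
        for n in range(n_fft)
    ]


def dft(r, n_fft, n_sub):
    """Subcarrier symbols as in (11): Y[k] = (1/N) * sum_n r(n) * exp(-j*2*pi*k*n/N)."""
    return [
        sum(r[n] * cmath.exp(-2j * cmath.pi * k * n / n_fft) for n in range(n_fft)) / n_fft
        for k in range(n_sub)
    ]


def add_cp(s, cp_len):
    """Prepend the last cp_len samples of the symbol as a cyclic prefix."""
    return s[-cp_len:] + s


def remove_cp(r, cp_len):
    """Drop the cyclic prefix before the DFT."""
    return r[cp_len:]


def single_tap_channel(s, h0, noise_std, rng):
    """r(n) = h(0) * s(n) + w(n), the simplified form in (10) and (16)."""
    return [
        h0 * v + complex(rng.gauss(0.0, noise_std), rng.gauss(0.0, noise_std))
        for v in s
    ]


def ofdm_link(x, h0, noise_std, rng):
    """One hop (NG to router, or router to NG): IDFT, add CP, channel, remove CP, DFT."""
    s = add_cp(idft(x, N), L)
    r = single_tap_channel(s, h0, noise_std, rng)
    return dft(remove_cp(r, L), N, len(x))


rng = random.Random(0)
symbols = [complex(rng.choice([-1.0, 1.0]), 0.0) for _ in range(K)]  # BPSK symbols
received = ofdm_link(symbols, h0=2.0, noise_std=0.0, rng=rng)
worst_error = max(abs(y - 2.0 * x) for y, x in zip(received, symbols))
print("largest deviation from h(0) * x[k]:", worst_error)
```

With the noise switched off, each received subcarrier equals h(0) times the transmitted symbol, which is the noise-free form of (12) and (17). Chaining two calls to `ofdm_link` reproduces the product of the two channel gains that appears in the first term of (19).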

Equation (19) shows clearly that each NG simultaneously and independently receives information bits of spike events from different NGs through the central router in parallel. Due to the AWGN assumption, $w_{qm}^{RC}[k] = w_{km}^{CR}[q] = w$. The signal-to-noise ratio for each neuron group can be expressed as

$$
\mathrm{SNR} = \left( \frac{h_{q}^{RC}(0) \cdot h_{q}^{CR}(0)}{h_{q}^{RC}(0) + 1} \right)^{2} \cdot \left( \frac{x_{km}^{CR}[q]}{w} \right)^{2}. \tag{20}
$$

Equation (20) shows that even though the noise in each channel degrades the symbols for spike event information, the SNR can be improved by adding a scale multiplier in each channel with the value of scale >2 (i.e., $h_{q}^{RC}(0) = h_{q}^{CR}(0) = h > 2$) such that the overall scale (i.e., the first term) will be >1. The entire process of signal transmission from central router to each neuron group is shown in Fig. 5(b). For BPSK and QPSK, the probability of bit error, $P_b$, which is the expected value of bit error rate (BER), can be expressed as follows:

$$
P_b = F\!\left( \sqrt{\frac{2 E_b}{N_0}} \right) \tag{21}
$$
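The two closed-form expressions above are easy to evaluate directly. The sketch below is illustrative only: it assumes that F in (21) is the Gaussian Q-function, written with `math.erfc` as Q(x) = 0.5 * erfc(x / sqrt(2)), and that the ratio of energy per bit to noise density is supplied in decibels. The function names are hypothetical.

```python
import math


def snr_scale(h_rc, h_cr):
    """First factor of (20): (h_rc * h_cr / (h_rc + 1)) ** 2."""
    return (h_rc * h_cr / (h_rc + 1.0)) ** 2


def bit_error_probability(ebn0_db):
    """(21) with F assumed to be the Q-function: Pb = Q(sqrt(2 * Eb/N0))."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)        # convert dB to a linear ratio
    return 0.5 * math.erfc(math.sqrt(ebn0))  # Q(sqrt(2g)) = 0.5 * erfc(sqrt(g))


# Equal channel gains of 2 give a scale factor above 1; unit gains do not.
print("scale for h = 2:", snr_scale(2.0, 2.0))
print("scale for h = 1:", snr_scale(1.0, 1.0))

# The bit error probability falls quickly as Eb/N0 grows.
for ebn0_db in (0.0, 6.0, 12.0, 15.0):
    print(ebn0_db, "dB ->", bit_error_probability(ebn0_db))
```

Evaluating `snr_scale` for a few gains shows why the text asks for a multiplier above 2: with unit gains the first factor of (20) is below 1, so the relayed signal loses SNR relative to a single hop.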



Fig. 6. (a) E-I neural network model was simulated at each NG with E->E, E->I, I->E, and I->I synapses. The synapses obeyed the STDP rule while the neurons were of the integrate-and-fire neuron type. (b) RAIN-like background activity of the E-I network is shown here for a 1-ms time window for 10 000 neurons of an NG chip. The network has 80% E neurons and 20% I neurons as shown in the figure.

where $E_b$ is the energy/bit, $N_0/2$ is the noise power spectral density, and $F$ is a scaled form of the complementary Gaussian error function [36]. $E_b/N_0$ is proportional to SNR. Therefore, the BER performance improves with higher SNR.

IV. COMPUTER SIMULATIONS OF MIMO-OFDM COMMUNICATION

Computer simulations were performed using a biologically plausible neuromorphic system that satisfied a set of design constraints. These design constraints are first introduced. This is followed by computer simulation of various modes of intergroup communication using the proposed MIMO-OFDM system.

A. Establishing a Baseline Design Requirement

To establish a baseline design requirement for our intergroup communication, we impose a set of design constraints based on existing knowledge about brainlike architectures. In the simulations, it is assumed that the combination of STM and intragroup routing will be used for managing all intragroup communication. The first design constraint is that each NG models a small cortical area in the brain consisting of ∼$10^{5}$ neurons (size of a small cortical area such as a macrocolumn [1]) with a connectivity of $10^{9}$ time-multiplexed synapses. This connectivity of 1:10 000 matches that of the human brain [1]. As a second constraint on the design, it is assumed that for the baseline requirement most of the synaptic connections (∼70%) are within the NG while the remaining connections (∼30%) are sparse connections to all other NGs (brain areas) via a small-world network of connections [3]. The third

design constraint is that each neuron is assumed to be the spiking integrate-and-fire type and each synaptic connection is assumed to obey STDP as discussed above. The fourth and fifth design constraints are that the average and peak firing rates of the neurons in the neuromorphic system are 10 and 100 Hz, respectively. The sixth design constraint is that the neuromorphic system sampling time is 1 ms. Furthermore, it is assumed for simplicity that the intergroup spike transmission is uniformly distributed among all NGs (the more complex nonuniform distribution of spike transmission is discussed in Section VI). Assuming a peak firing rate of 100 Hz, there will be a total of $10^{4}$ spikes (i.e., $10^{5}$ neurons/NG × 100 spikes/s × $10^{-3}$ s/ms) that need to be transmitted by each NG once every millisecond. Through imposing the second design constraint, a baseline design requirement is that each NG transmits 7000 spikes (due to 70% connections) within the group while the remaining 3000 spikes (due to the remaining 30% connections) are transmitted to various NGs outside once every millisecond.

B. Simulation Results

To generate spike traffic for this biologically plausible neuromorphic system, the baseline design requirement was imposed on a simulation of a generic spiking neural implementation with 100 NGs in the system. Computer simulations of this system were first performed using a spiking model simulated using a GPGPU cluster simulator program called HRLSim [43]. The input in the form of spike injections was provided to a set of randomly selected neurons in each NG within the system. The resulting spike activity of the system exhibited recurrent asynchronous intermittent network



Fig. 7. (a) Effect of AWGN on BER of the proposed interchip communication is shown here for four different cases. Note that PS stands for pulse shaping and NPS stands for no-pulse shaping. (b) Effect on system BER due to peak power clipping for OFDM with a 12-dB AWGN channel is summarized for the same four cases shown in Fig. 7(a).

(or RAIN) dynamics. Such dynamics are consistent with biology [44]. The intergroup communication traffic (which is analogous to either cortico-cortical or thalamocortical spike events) was collected from the model simulations on the cluster computer. The resulting data stream of spike events for a time window of 1 s sampled every 1 ms (shown in Fig. 6) was imported into a MATLAB model of the MIMO-OFDM system (see the Appendix for code). The system was then simulated and analyzed for its performance in routing these spike events under various conditions as described next. In the MATLAB simulations performed, each spike event was encoded into U = 32 bits. During every millisecond, each NG generates roughly between 200 and 3000 spike events based on RAIN dynamics. Therefore, the maximal bit rate for

each NG is 96 Mb/s (i.e., 3000 spikes/ms × 32 bits/spike × $10^{3}$ ms/s) and, for 100 NGs, the maximal bit rate for the whole system will be 9.6 Gb/s. For the design proposed in this paper, the sampling period for each IDFT or DFT is set as $T_s$ = 5 ns (or 200-MHz sample frequency). The IDFT and DFT are implemented by an FFT algorithm with size of N = 128. For each FFT, K = 100 subcarriers (i.e., one for each NG) are used to transmit and receive spike events. To avoid ISI among different OFDM symbols, an L = 32 point CP is inserted before each OFDM symbol after IDFT and the CP is removed before each DFT [26]–[28]. Because the sampling frequency is 200 MHz, the maximal bit rate for each NG will be 200 Mb/s, and the maximal bit rate for the whole system will be 20 Gb/s.
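The arithmetic behind these figures can be laid out as a short sketch. The required bit rates and the 200-Mb/s capacity are taken from the text; the OFDM symbol duration and the number of symbols per 1-ms window are derived here from N, L, and Ts and are not stated in the paper. All variable names are illustrative.

```python
# Parameters quoted in the text.
PEAK_SPIKES_PER_MS = 3000   # peak spike events per NG per millisecond
BITS_PER_SPIKE = 32         # U = 32 bits per spike event
NUM_NGS = 100               # neuron groups in the simulated system
TS_NS = 5                   # sampling period in nanoseconds
N_FFT = 128                 # IDFT/DFT size
CP_LEN = 32                 # cyclic prefix length in samples
STATED_CAPACITY_BPS = 200_000_000  # per-NG capacity stated in the text

# Traffic the link has to carry at peak firing.
required_bps_per_ng = PEAK_SPIKES_PER_MS * BITS_PER_SPIKE * 1000
required_bps_system = required_bps_per_ng * NUM_NGS

# Timing quantities derived from the OFDM parameters (not stated in the text).
sample_rate_mhz = 1000 // TS_NS
symbol_duration_ns = (N_FFT + CP_LEN) * TS_NS
symbols_per_ms = 1_000_000 // symbol_duration_ns

# Ratio of stated capacity to peak demand for one NG.
headroom = STATED_CAPACITY_BPS / required_bps_per_ng

print("required per NG (Mb/s):", required_bps_per_ng / 1e6)
print("required system (Gb/s):", required_bps_system / 1e9)
print("sample rate (MHz):", sample_rate_mhz)
print("OFDM symbol duration incl. CP (ns):", symbol_duration_ns)
print("OFDM symbols per 1-ms window:", symbols_per_ms)
print("capacity / peak demand:", round(headroom, 2))
```

Under these numbers the stated per-NG capacity is roughly twice the peak demand, which is consistent with the claim that the link can carry the spike traffic in real time.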



Fig. 8. Simulations of the MIMO-OFDM system that show the effects of STO on BER are identical for either (a) 12-dB AWGN alone or (b) due to the combined effect of 12-dB AWGN and 3-dB of PPC.

The MIMO-OFDM is assumed to have short (0.03, which correspond to 150-ps start time offset in the neuromorphic system for a 200-MHz sampling frequency. The effect of sampling step error was also studied as shown in Fig. 9(a) through Fig. 9(d). In all these simulations, E b /N0 = 12 dB for the AWGN channel. Fig. 9(a) shows a benchmark plot without any PPC, STO, or sampling step errors but only AWGN channel effects. In Fig. 9(b), the PPC for the system was fixed at 3 dB. In Fig. 9(c), the STO of the system was fixed at 0.01, which corresponds to 50 ps for 200-MHz sampling frequency. In Fig. 9(d), the PPC and STO were 3 dB and 0.01, respectively. Comparing the results from all these test cases

clearly demonstrates that the system BER performance is not affected at all. When the sampling step error is >0.0001, the system BER begins to deteriorate causing more errors but that corresponds to 500-fs clock jitter for a 200-MHz sampling frequency. For current commercially available off-the-shelf (COTS) products, clock source can achieve four-independent 1.2-GHz LVPECL outputs with root-mean-square (rms) additive output jitter of 225 fs and four-independent 800/250-MHz LVDS/CMOS clock outputs with rms additive output jitter of 275 fs [45]. With advanced synchronization techniques [46], the STO for OFDM can be better than 1%. Therefore, the neuromorphic system described in this paper can potentially be implemented with COTS to achieve good performance. In Fig. 10(a), the simulation shows the combined effect on the BER due to AWGN channel effects, PPC of 3 dB, STO of 0.01, and sampling step error of 0.0001. This case was based on the baseline requirement. Even though, compared with Fig. 7, the system BER performance is degraded due to circuit limitation and imperfection, for E b /N0 = 15 dB, a BER of 40 GHz. Since the signal frequencies are never that high in neuromorphic systems, frequency related effects that cause problems for DSL can be safely ignored for neuromorphic systems. To evaluate the suitability of the proposed neuromorphic system for practical applications, it is important to understand the power and area costs for implementation. Since the proposed system is not affected by multipath channel



time or frequency fading effects that are common in wireless communication systems such as WiMAX [28], the hardware requirements are simpler. This is because it obviates the need for channel estimation, channel equalization, and forward error coding and decoding. The two major components that affect the power and area budget of the system are the DFT and IDFT cores. In a recent design for OFDM applications [28], the power consumption for a 256-point DFT/FFT with 25-MHz sample frequency was estimated to be 2.96 mW with an area of 397 × 388 μm² using a 0.18-μm CMOS process. Therefore, without any further improvements or optimizations, a modest estimate of power and area for the proposed 100 NG neuromorphic intergroup communication system (with a total of $10^{7}$ neurons and $10^{11}$ synapses) with a 200-MHz sample frequency, a 256-point DFT/IDFT [52], [54], and using a 0.045-μm CMOS process [53] is ∼10 W with an area of ∼3 × 3 mm². Since the system is not very sensitive to the sample rate frequency [as shown in Fig. 14(a)], it may be desirable to trade sample frequency for power and area savings. A more aggressive estimate of power and area for the same system with a 25-MHz sample frequency can be ∼1 W with an area of ∼1 × 1 mm².

The computer simulations in this paper assumed that the number of spike events transmitted to each NG is uniformly distributed. This assumption can be relaxed by adopting techniques of reconfigurable FFT/IFFT [55] or subchannelization resource allocation for mobile WiMAX [29]. In particular, it is possible to combine either one of these techniques with channel coding or spike traffic sensing to enable the system to adaptively handle nonuniform spike traffic distribution between the NGs in the system. First, a spike traffic sensor can be added before each transmitter in the NG and central router to estimate an initial subcarrier assignment.
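One way the traffic-sensing step might produce an initial assignment is sketched below. This is purely a hypothetical illustration: the paper does not specify the allocation rule, so the proportional (largest-remainder) scheme, the guarantee of one subcarrier per destination, and the function name `allocate_subcarriers` are all assumptions made here.

```python
def allocate_subcarriers(spike_counts, total_subcarriers):
    """Split subcarriers across destinations in proportion to sensed spike counts.

    Hypothetical rule: every destination keeps one subcarrier, and the rest are
    shared out by the largest-remainder method using integer arithmetic.
    """
    n = len(spike_counts)
    if n == 0 or total_subcarriers < n:
        raise ValueError("need at least one subcarrier per destination")

    spare = total_subcarriers - n
    total = sum(spike_counts)
    if total == 0:
        # No traffic sensed: fall back to equal weights.
        weights = [1] * n
        total = n
    else:
        weights = list(spike_counts)

    extra = [spare * w // total for w in weights]      # whole shares
    remainder = [spare * w % total for w in weights]   # fractional parts, scaled
    leftover = spare - sum(extra)

    # Hand the remaining subcarriers to the largest fractional parts first.
    order = sorted(range(n), key=lambda i: (-remainder[i], i))
    for i in order[:leftover]:
        extra[i] += 1

    return [1 + e for e in extra]


# Example: three destinations with uneven sensed traffic share 12 subcarriers.
print(allocate_subcarriers([600, 300, 100], 12))
```

The allocation always sums to the number of available subcarriers, so it could serve as a starting point that a later refinement stage adjusts.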
Channel coding can be used after this step for further identification and final assignment of nonuniform subcarrier allocation as shown in Fig. 14(b).

The proposed neuromorphic system can be scaled into larger neuromorphic systems by drawing inspiration from the brain [1] and then designed using the concept of cellular mobile communication [56]. The brain is organized into several loops such as thalamocortical and corticostriatal loops [1], [57] (Fig. 15) that are integrated together into one very large network architecture. In this system, routing within each group is enabled by the STM while local routing between NGs will serve as a cellular or a local area for spike event transmission. The spike event transmission among these cellular regions will be routed through one central office among them as shown in Fig. 15, where "RO" stands for spike routing from central router to central office and "OR" stands for spike routing from central office to central router.

We have designed, fabricated, and tested various components of our proposed neuromorphic system beginning with the neurons, synapses, and STDP circuits [37]. We have also subsequently integrated these circuits into a large scalable chip that includes the routing fabric to perform STM [58]. We believe that the novel approach proposed in this paper will provide the basis for future large-scale neuromorphic systems that require multichip communication. We plan to extend the

chip described in [58] into a multichip neuromorphic system in the future.

VII. CONCLUSION

This paper described a novel approach for communication in large-scale neuromorphic systems by combining local firing-rate-independent routing based on STM with MIMO-OFDM techniques to enable parallel transmission of spike trains to multiple destinations. The system has an excellent tolerance to failure since the spike event address packets from each NG can be transmitted fully independently and in parallel. As long as there are adequate time slots, spikes can be accurately transmitted to every neuron in the system. Due to OFDM techniques, spike event address packets that need to be transmitted to different destination NGs are multiplexed and demultiplexed by both TDM and FDM. Thus, the number of bits for each spike event representation is significantly reduced. Computer simulations of a 100 NG system with a total of $10^{7}$ neurons and $10^{11}$ synapses using a combination of STM and MIMO-OFDM show that the system is capable of delivering spike event address packets in real time with a data rate of up to 200 Mb/s for each NG and 20 Gb/s for the whole neuromorphic system with a maximum of 1 bit error for the entire system in a 1-ms time window. Analysis of power and area for its implementation using CMOS process technologies shows that the architecture is power and area efficient. The proposed architecture offers a novel alternative to the current standard based on AER for designing efficient neuromorphic intergroup communication.

ACKNOWLEDGMENT

The authors would like to thank J. Cruz-Albrecht and P. Petre for contributions in STM concepts.

APPENDIX A




REFERENCES

[1] G. Buzsaki, Rhythms of the Brain. Oxford, U.K.: Oxford Univ. Press, 2006.
[2] P. Erdos and A. Renyi, “On the evolution of random graphs,” Publ. Math. Inst. Hungarian Acad. Sci., vol. 5, pp. 17–61, 1960.
[3] O. Sporns and J. D. Zwi, “The small world of the cerebral cortex,” Neuroinformatics, vol. 2, no. 2, pp. 145–162, 2004.
[4] C. A. Mead, Analog VLSI and Neural Systems. Reading, MA, USA: Addison-Wesley, 1989.
[5] H. Veendrick, Nanometer CMOS ICs: From Basics to ASICs. Berlin, Germany: Springer-Verlag, 2008.
[6] K. A. Boahen, “Point-to-point connectivity between neuromorphic chips using address-events,” IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 47, no. 5, pp. 416–434, May 2000.
[7] P. A. Merolla, J. V. Arthur, B. E. Shi, and K. Boahen, “Expandable networks for neuromorphic chips,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 54, no. 2, pp. 301–311, Feb. 2007.
[8] C. Bartolozzi and G. Indiveri, “Selective attention in multi-chip address-event systems,” Sensors, vol. 9, no. 7, pp. 5076–5098, 2009.
[9] J. Aweya, “On the design of IP routers part 1: Router architectures,” J. Syst. Archit., vol. 46, no. 6, pp. 483–511, 2000.
[10] S. P. Felperin, P. Raghavan, and E. Upfal, “A theory of wormhole routing in parallel computers,” IEEE Trans. Comput., vol. 45, no. 6, pp. 704–713, 1996.
[11] S. Badrouchi, A. Zitoumi, K. Torki, and R. Tourki, “Asynchronous NoC router design,” J. Comput. Sci., vol. 1, no. 3, pp. 429–436, 2005.
[12] L. A. Plana, S. B. Furber, S. Temple, M. Khan, Y. Shi, J. Wu, and S. Yang, “A GALS infrastructure for a massively parallel multiprocessor,” IEEE Design Test Comput., vol. 24, no. 5, pp. 454–463, Sep./Oct. 2007.
[13] T. Sharp, C. Patterson, and S. Furber, “Distributed configuration of massively-parallel simulation on SpiNNaker neuromorphic hardware,” in Proc. Int. Joint Conf. Neural Netw., San Jose, CA, USA, 2011, pp. 1099–1105.
[14] R. Serrano-Gotarredona, M. Oster, P. Lichtsteiner, A. Linares-Barranco, R. Paz-Vicente, F. Gómez-Rodríguez, L. Camuñas-Mesa, R. Berner, M. Rivas-Pérez, T. Delbrück, S. Liu, R. Douglas, P. Häfliger, G. Jiménez-Moreno, A. C. Ballcels, T. Serrano-Gotarredona, A. J. Acosta-Jiménez, and B. Linares-Barranco, “CAVIAR: A 45k neuron, 5M synapse, 12G connects/s AER hardware sensory–processing–learning–actuating system for high-speed visual object recognition and tracking,” IEEE Trans. Neural Netw., vol. 20, no. 9, pp. 1417–1438, Sep. 2009.
[15] J. P. Pfister and W. Gerstner, “Triplets of spikes in a model of spike timing dependent plasticity,” J. Neurosci., vol. 26, no. 38, pp. 9673–9682, 2006.
[16] J. Wu, “A router for massively-parallel neural simulation,” Ph.D. dissertation, Dept. Comput. Sci., Univ. Manchester, Manchester, U.K., 2010.
[17] N. Srinivasa and J. Cruz-Albrecht, “Neuromorphic adaptive plastic scalable electronics: Analog learning systems,” IEEE Pulse, vol. 3, no. 1, pp. 51–56, Jan./Feb. 2012.
[18] J. Bailey and D. Hammerstrom, “Why VLSI implementation of associative VLCNs require connection multiplexing,” in Proc. IEEE Int. Conf. Neural Netw., San Diego, CA, USA, Jul. 1988, pp. 173–180.
[19] D. B. Strukov and K. K. Likharev, “Prospects for terabit-scale nanoelectronic memories,” Nanotechnology, vol. 16, no. 1, pp. 137–148, 2005.
[20] S. Jo, T. Chang, I. Ebong, B. Bhavitavya, P. Mazumder, and W. Lu, “Nanoscale memristor device as synapse in neuromorphic systems,” Nano Lett., vol. 10, no. 4, pp. 1297–1301, 2010.
[21] J. Arthur, P. Merolla, F. Akopyan, R. Alvarez, A. Cassidy, S. Chandra, S. Esser, N. Imam, W. Risk, D. Rubin, R. Manohar, and D. Modha, “Building block of a programmable neuromorphic substrate: A digital neurosynaptic core,” in Proc. Int. Joint Conf. Neural Netw., Brisbane, Australia, Jun. 2012, pp. 1–8.
[22] N. Imam, T. A. Cleland, R. Manohar, P. A. Merolla, J. V. Arthur, F. Akopyan, and D. S. Modha, “Implementation of olfactory bulb glomerular-layer computations in a digital neurosynaptic core,” Front. Neurosci., vol. 6, p. 83, Jun. 2012, doi: 10.3389/fnins.2012.00083.
[23] K. Minkovich, N. Srinivasa, J. M. Cruz-Albrecht, Y. K. Cho, and A. Nogin, “Programming time-multiplexed reconfigurable hardware using a scalable neuromorphic compiler,” IEEE Trans. Neural Netw. Learn. Syst., vol. 23, no. 6, pp. 889–901, Jun. 2012.
[24] P. A. Linares-Barranco, G. Jimenez-Moreno, B. Linares-Barranco, and A. Civit-Ballcels, “On algorithmic rate-coded AER generation,” IEEE Trans. Neural Netw., vol. 17, no. 3, pp. 771–788, May 2006.
[25] H. Rohling, T. May, K. Bruninghaus, and R. Grunheid, “Broad-band OFDM radio transmission for multimedia applications,” Proc. IEEE, vol. 87, no. 10, pp. 1778–1789, Oct. 1999.
[26] R. Van Nee and R. Prasad, OFDM for Wireless Multimedia Communication. Norwood, MA, USA: Artech House, 2000.
[27] M. Jiang and L. Hanzo, “Multiuser MIMO-OFDM for next-generation wireless systems,” Proc. IEEE, vol. 95, no. 7, pp. 1430–1469, Jul. 2007.
[28] O. Font-Bach, N. Bartzoudis, A. Pascual-Iserte, and D. Lopez Bueno, “A real-time MIMO-OFDM mobile WiMAX receiver: Architecture, design and FPGA implementation,” Comput. Netw., vol. 55, no. 16, pp. 3634–3647, 2011.
[29] J. G. Andrews, A. Ghosh, and R. Muhamed, Fundamentals of WiMAX: Understanding Broadband Wireless Networking. Englewood Cliffs, NJ, USA: Prentice-Hall, 2007.
[30] S. Nanda, R. Walton, J. Ketchum, M. Wallace, and S. Howard, “A high performance MIMO OFDM wireless LAN,” IEEE Commun. Mag., vol. 43, no. 2, pp. 101–109, Feb. 2005.
[31] J. Armstrong, “OFDM for optical communications,” J. Lightw. Technol., vol. 27, no. 3, pp. 189–204, Feb. 1, 2009.
[32] S. Chen, Q. Yang, Y. Ma, and W. Shieh, “Real-time multi-gigabit receiver for coherent optical MIMO-OFDM signals,” J. Lightw. Technol., vol. 27, no. 16, pp. 3699–3704, Aug. 15, 2009.
[33] B. Li, J. Huang, S. Zhou, K. Ball, M. Stojanovic, L. Freitag, and P. Willett, “MIMO-OFDM for high rate underwater acoustic communications,” IEEE J. Ocean. Eng., vol. 34, no. 4, pp. 634–644, Oct. 2009.
[34] C. Sturm and W. Wiesbeck, “Waveform design and signal processing aspects for fusion of wireless communications and radar sensing,” Proc. IEEE, vol. 99, no. 7, pp. 1236–1258, Jul. 2011.
[35] G. Cherubini, E. Eleftheriou, and S. Olcer, “Filtered multitone modulation for very high speed digital subscriber lines,” IEEE J. Sel. Areas Commun., vol. 20, no. 5, pp. 1016–1028, Jun. 2002.
[36] D. Tse and P. Viswanath, Fundamentals of Wireless Communication. Cambridge, U.K.: Cambridge Univ. Press, 2005.
[37] J. Cruz-Albrecht, M. Yung, and N. Srinivasa, “Energy-efficient, neuron, synapse and STDP integrated circuits,” IEEE Trans. Biomed. Circuits Syst., vol. 6, no. 3, pp. 246–256, Jun. 2012.
[38] H. Markram, J. Lubke, M. Frotscher, and B. Sakmann, “Regulation of synaptic efficacy by coincidence of post-synaptic APs and EPSPs,” Science, vol. 275, pp. 213–215, Jan. 1997.
[39] G. Q. Bi and M. Poo, “Activity-induced synaptic modifications in hippocampal culture: Dependence on spike timing, synaptic strength and cell type,” J. Neuroscience, vol. 18, no. 24, pp. 10464–10472, 1998.
[40] J. C. Magee and D. Johnston, “A synaptically controlled, associative signal for Hebbian plasticity in hippocampal neurons,” Science, vol. 275, pp. 209–213, Jan. 1997.
[41] C. Y. Yao, “The design of square-root-raise-cosine FIR filter by an iterative technique,” IEICE Trans. Fundam., vol. E90-A, no. 1, pp. 241–248, 2007.
[42] F. J. Harris, Multirate Signal Processing for Communication Systems. Englewood Cliffs, NJ, USA: Prentice-Hall, 2004.
[43] K. Minkovich, C. M. Thibeault, A. Nogin, Y. K. Cho, M. J. O’Brian, and N. Srinivasa, “HRLSim: A high performance spiking neural network simulator for GPGPU clusters,” IEEE Trans. Neural Netw. Learn. Syst., doi: 10.1109/TNNLS.2013.2276056.
[44] J. F. Mitchell, K. A. Sundberg, and J. H. Reynolds, “Spatial attention decorrelates intrinsic activity fluctuations in macaque area v4,” Neuron, vol. 63, no. 6, pp. 879–888, 2009.
[45] (2005). AD9510: 1.2 GHz Clock Distribution IC, Analog Devices Datasheet [Online]. Available: http://www.alldatasheet.com/datasheetpdf/pdf/166983/AD/AD9510.html
[46] M. Morelli, C. C. J. Kuo, and M. Pun, “Synchronization techniques for orthogonal frequency division multiple access (OFDMA): A tutorial review,” Proc. IEEE, vol. 95, no. 7, pp. 1394–1427, Jul. 2007.
[47] P. E. Hart, N. J. Nilsson, and B. Raphael, “A formal basis for the heuristic determination of minimum cost paths,” IEEE Trans. Syst. Sci. Cybern., vol. 4, no. 2, pp. 100–107, Jul. 1968.
[48] J. Kremkow, A. Kumar, S. Rotter, and A. Aertsen, “Emergence of population synchrony in a layered network of the cat visual cortex,” Neurocomputing, vol. 70, nos. 10–12, pp. 2069–2073, Jun. 2007.
[49] L. Benini and G. D. Micheli, “Networks on chips: A new SoC paradigm,” Computer, vol. 35, no. 1, pp. 70–78, 2002.
[50] D. Atienza, F. Angiolini, S. Murali, A. Pullini, L. Benini, and G. D. Micheli, “Network-on-chip design and synthesis network,” Integr., VLSI J., vol. 41, no. 3, pp. 340–359, 2008.
[51] E. A. M. Klumperink, R. Kreienkamp, T. Ellermeyer, and U. Langmann, “Transmission lines in CMOS: An explorative study,” in Proc. ProRISC, 12th Annu. Workshop Circuits, Syst., Signal Process., Veldhoven, The Netherlands, Nov. 2001 [Online]. Available: http://doc.utwente.nl/67426/
[52] S. H. Lai, S. F. Lei, C. L. Chang, C. C. Lin, and C. H. Luo, “Low computational complexity, low power, and low area design for the implementation of recursive DFT and IDFT algorithms,” IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 56, no. 12, pp. 921–925, Dec. 2009.
[53] S. Lee, B. Jagannathan, S. Narasimha, A. Chou, N. Zamdmer, J. Johnson, R. Williams, L. Wagner, K. Jonghae, J. O. Plouchart, J. Pekarik, S. Springer, and G. Freeman, “Record RF performance of 45-nm SOI CMOS technology,” in Proc. IEEE IEDM, Dec. 2007, pp. 255–258.
[54] H. Yin, Z. Wang, L. Ke, and J. Wang, “Monobit digital receivers: Design, performance, and application to impulse radio,” IEEE Trans. Commun., vol. 58, no. 6, pp. 1695–1704, Jun. 2010.
[55] A. A. Ghouwayel and Y. Louet, “FPGA implementation of a reconfigurable FFT for multi-standard systems in software radio context,” IEEE Trans. Consumer Electron., vol. 55, no. 2, pp. 950–958, May 2009.
[56] A. Goldsmith, Wireless Communications. New York, NY, USA: Cambridge Univ. Press, 2005.
[57] E. M. Izhikevich and G. M. Edelman, “Large-scale model of mammalian thalamocortical systems,” Proc. Nat. Acad. Sci., vol. 105, no. 9, pp. 3593–3598, 2008.
[58] J. M. Cruz-Albrecht, T. Derosier, and N. Srinivasa, “Scalable neural chip with synaptic electronics using CMOS-integrated memristors,” Nanotechnology, Special Issue on Synaptic Electronics, vol. 24, pp. 384011-1–384011-11, 2013, doi: 10.1088/0957-4484/24/38/384011.
[59] K. Boahen, “A burst-mode word-serial address-event link-I: Transmitter design,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 51, no. 7, pp. 1269–1280, Jul. 2004.
[60] K. Boahen, “A burst-mode word-serial address-event link-II: Receiver design,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 51, no. 7, pp. 1281–1291, Jul. 2004.
[61] M. Mahowald, An Analog VLSI System for Stereoscopic Vision. Norwell, MA, USA: Kluwer, 1994.
[62] M. Sivilotti, “Wiring considerations in analog VLSI systems, with application to field-programmable networks,” Ph.D. dissertation, Dept. Comput. Sci., California Inst. Technol., Pasadena, CA, USA, 1991.
[63] W. J. Dally and B. Towles, Principles and Practices of Interconnection Networks. San Francisco, CA, USA: Morgan Kaufmann, 2004.
[64] P. M. Merolla, J. V. Arthur, B. E. Shi, and K. A. Boahen, “Expandable networks for neuromorphic chips,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 54, no. 2, pp. 301–311, Feb. 2007.
[65] E. Ozalevli and C. M. Higgins, “Reconfigurable biologically inspired visual motion systems using modular neuromorphic VLSI chips,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 52, no. 1, pp. 79–92, Jan. 2005.

Narayan Srinivasa (M’00–SM’12) received the Ph.D. degree in mechanical engineering from the University of Florida, Gainesville, FL, USA, in 1994. He is a Principal Research Scientist and Manager for the Center for Neural and Emergent Systems, Information and System Sciences Department, HRL Laboratories LLC, Malibu, CA, USA. He was a Beckman Fellow with the Beckman Institute, University of Illinois at Urbana-Champaign, Urbana, IL, USA, from 1994 to 1997. He is currently the Program Manager and Principal Investigator for three DARPA projects. The DARPA SyNAPSE and Physical Intelligence Programs attempt to develop a theoretical foundation inspired by brain science and physics to engineer electronic systems that exhibit intelligence. The DARPA UPSIDE program seeks to develop emerging device technologies for highly energy efficient implementations for real-world applications including image processing. At HRL, he has managed several projects for GM and Boeing, solving real-world problems in the areas of sensing and control. He has published 83 technical papers and holds 32 U.S. patents. Dr. Srinivasa received numerous awards, including the HRL Distinguished Inventor Award, GM Most Valuable Colleague Award, HRL Outstanding Team Award and the HRL Chairman Award. He is a member of INNS and AAAS.

Deying Zhang received the B.S.E. and M.S.E. degrees of material science and engineering from Zhejiang University, Hangzhou, China, in 1991 and 1994, respectively, and the Ph.D. degree in physics and chemistry from the Fujian Institute of Research on the Structure of Matter, Chinese Academy of Sciences, Fuzhou, China, in 1998. He became an Assistant Professor and Associate Professor with the Department of Electronic Science and Physics, Fuzhou University, Fuzhou, in 1998 and 2000, respectively. He joined the research group in the Department of Electrical and Computer Engineering, University of California at San Diego, San Diego, CA, USA, in 2000, where he invented the fluidic zoom lens techniques for UAV in DARPA BOSS project. In 2006, he became a co-founder with Rhevision Technology Inc., San Diego, to develop fluidic zoom lens systems in camera phones. He has been a Research Staff Member with HRL Laboratories, LLC, Malibu, CA, USA, since 2008. He has authored or co-authored 42 technical papers and holds three U.S. patents. His current research interests include signal processing, control, and SoC for radar, communication, and neuromorphic systems.

Beayna Grigorian received the B.S. degree in computer science and engineering and the M.S. degree in computer science from the University of California at Los Angeles (UCLA), Los Angeles, CA, USA, in 2010 and 2012, respectively, where she is currently pursuing the Ph.D. degree. She was with the Graduate School, UCLA. She has worked on various techniques for hardware acceleration of applications in the vision, navigation, and artificial intelligence domains, and is currently researching neural-network-based accelerators. Her current research interests include real-time, low-power systems integrated with smart sensors, autonomous capabilities, and augmented reality interfaces to allow for high levels of situational awareness. Ms. Grigorian received the NSF Graduate Research Fellowship in 2011.
