IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, VOL. 23, NO. 4, APRIL 2012


Neural Learning Circuits Utilizing Nano-Crystalline Silicon Transistors and Memristors

Kurtis D. Cantley, Member, IEEE, Anand Subramaniam, Student Member, IEEE, Harvey J. Stiegler, Member, IEEE, Richard A. Chapman, Fellow, IEEE, and Eric M. Vogel, Senior Member, IEEE

Abstract—Properties of neural circuits are demonstrated via SPICE simulations and their applications are discussed. The neuron and synapse subcircuits include ambipolar nano-crystalline silicon transistor and memristor device models based on measured data. Neuron circuit characteristics and the Hebbian synaptic learning rule are shown to be similar to biology. Changes in the average firing rate learning rule depending on various circuit parameters are also presented. The subcircuits are then connected into larger neural networks that demonstrate fundamental properties, including associative learning and pulse coincidence detection. Learned extraction of a fundamental frequency component from noisy inputs is demonstrated. It is then shown that if the fundamental sinusoid of one neuron input is out of phase with the rest, its synaptic connection changes differently than the others. Such behavior indicates that the system can learn to detect which signals are important in the general population, and that there is a spike-timing-dependent component of the learning mechanism. Finally, future circuit design and considerations are discussed, including requirements for the memristive device.

Index Terms—Hebbian learning, memristor, nano-crystalline silicon, neuromorphic, SPICE, thin-film transistor.

Manuscript received October 31, 2011; revised December 13, 2011; accepted January 1, 2012. Date of publication February 3, 2012; date of current version March 6, 2012. This work was supported in part by the SRC/NRI Southwest Academy for Nanoelectronics, the Texas Analog Center of Excellence, and the Texas Emerging Technology Fund. The work of K. D. Cantley was supported by the NDSEG Fellowship.

K. D. Cantley, A. Subramaniam, H. J. Stiegler, and R. A. Chapman are with the Department of Electrical Engineering, University of Texas at Dallas, Richardson, TX 75080 USA (e-mail: [email protected]; [email protected]; [email protected]; [email protected]).

E. M. Vogel is with the School of Materials Science and Engineering, Georgia Institute of Technology, Atlanta, GA 30332 USA (e-mail: [email protected]).

Digital Object Identifier 10.1109/TNNLS.2012.2184801

I. INTRODUCTION

The use of nanoscale memristive devices as synapses in artificial neural circuits has been the subject of a great deal of recent research. Spurred by the demonstration of TiO2 resistive switches by HP Labs [1], a major focus has been on implementing a pair-based spike-timing-dependent plasticity learning mechanism [2]. To date, most solutions involve pulse width or height modulation [3], [4], require extensive circuitry, and fail to mimic neurobiological systems. A more plausible proposal uses specific shapes for bidirectional action potentials to obtain the desired temporally asymmetric characteristics [5].

Although emulation of part of the visual cortex has been reported with this method [6], it is unclear from the device-density and power-dissipation viewpoints whether a brain-scale system is achievable. In fact, designing layouts to realize the necessary metrics may prove difficult for all silicon complementary metal-oxide-semiconductor (CMOS) neuromorphic systems due to their 2-D nature.

The networks described in this paper feature a Hebbian learning mechanism [7] and operate similarly to biological networks [8]. They employ compact neuron and synapse subcircuits composed of devices based on low-temperature deposited materials. This feature would enable direct extension to dense, large-scale 3-D networks with almost arbitrary connectivity. Such a configuration could have a physical structure similar to the neocortex, with direct interconnection of a similar number of components (∼10¹⁰ neurons [9] and 10¹⁴ synapses [10]). High-speed digital circuits and address-event communication schemes [11] would not be necessary.

The properties of the neuron and synapse subcircuits and a brief description of circuit operation are presented in Section II. There, the simulated spiking frequency as a function of injection current is also shown to be similar to biological data. In Section III, the average firing rate learning rule is calculated for two different values of the synaptic weight and different values of the synaptic depression voltage Vdep. The calculation is performed by measuring the synaptic weight change when two neuron subcircuits are connected, each with a different DC offset of the noisy injection current (which results in different average spiking frequencies).

Analysis of the emergent properties of the circuits is presented in Section IV. First, the ability of the network to adapt and evolve as a function of various stimuli is demonstrated through associative learning. The experiment that is carried out is fundamental classical conditioning using two input neurons. Next, the detection of two and three coincident input action potentials is demonstrated. The ability of neurons to detect coincident events is important for various processes, including temporal pattern recognition. Finally, extraction of a common fundamental frequency from a set of noisy inputs uses the phase- and coincidence-detection capabilities of the circuits. Using a network with eight afferent neurons and one output, it is shown that the network is able to detect when an input signal is out of phase with the rest. These are the first demonstrated applications using device-level simulations without silicon CMOS.




Fig. 1. (a) Neuron circuit schematic including values of various parameters. (b) Synapse subcircuit. All transistors have channel W/L = 100 nm/100 nm except M4, M6, M8, and Minj, which have W = 300 nm. (c) Applied input current Iin with Gaussian noise produces output spikes at Vout. (d) ISI distribution of this output is shown. The inset compares spiking frequency versus input current for this circuit and measured biological data from a neuron before and after spike frequency adaptation [19].

Fig. 2. (a) Circuit diagram of two neurons connected together via the synapse. (b) Excitatory post-synaptic potential (EPSP) at the input to N2 for various synaptic weights. (c) Buildup of EPSPs causes the post-synaptic neuron to fire when the potential at Vcap reaches the M3/M4 inverter threshold.

A more detailed description of circuit operation and future design considerations is the subject of Section V. The use of other memristive device models in the synapse subcircuit is also discussed. The conclusions of this paper are then presented in Section VI.

II. SPIKING NEURON AND SYNAPSE SUBCIRCUITS

In previous work [12], [13], the original leaky integrate-and-fire (I&F) neuron circuit proposed by Mead [14] was used. An extension of that design containing several important modifications has been developed here, as shown in Fig. 1(a). The changes allow improved control over circuit behavior and synaptic learning, and compensate for the characteristics of the ambipolar nano-crystalline silicon (nc-Si) transistors. Specifically, the charge leakage from the input capacitor C1 is controlled separately by M1 and M2 during a spike and by M9 and M10 otherwise, rather than through a single leakage path in both situations. The transistor Mfeed at the input node acts to turn off all inputs during an action potential and send a voltage spike of magnitude set by Vdep to all the input synapses. This eliminates variations in the width of individual spikes that could be caused by changing excitations in the dendritic arbor.
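Although all results in this paper come from transistor-level HSPICE simulation, the integrate/leak/fire cycle of Fig. 1(a) can be summarized with a short behavioral sketch in MATLAB. This is only a conceptual abstraction: the capacitor value, leakage resistance, and reset behavior are illustrative assumptions, with the 1.6-V trip point taken from the M3/M4 inverter threshold quoted in Section V.

    % Behavioral (not transistor-level) sketch of the leaky I&F cycle.
    C1    = 30e-12;   % input capacitor (F), assumed value
    Rleak = 1e9;      % leakage resistance set by Vleak (ohm), assumed value
    Vth   = 1.6;      % M3/M4 inverter trip point (V)
    dt    = 1e-5;     % integration time step (s)

    for Idc = [3 5 7]*1e-9               % three DC injection currents (A)
        Vcap = 0; nspk = 0;
        for k = 1:round(1/dt)            % 1-s run
            Vcap = Vcap + dt*(Idc - Vcap/Rleak)/C1;  % leaky integration
            if Vcap >= Vth               % inverter trips: fire, then reset
                nspk = nspk + 1;
                Vcap = 0;                % reset stands in for the M1/M2 discharge
            end
        end
        fprintf('Iin = %.0f nA -> f = %.0f Hz\n', Idc*1e9, nspk);
    end

With these assumed element values the sketch produces a monotonically rising f-I relationship, which is the qualitative behavior discussed below for Fig. 1(d).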

Fig. 1(b) illustrates the synapse configuration [connected as in Fig. 2(a)]. This design is also distinct from previous work, in which the synapse consisted of one drive transistor in series with the memristor [13]. In that configuration, synaptic depression was more difficult to initiate because the transistor was controlled by the normally off pre-synaptic neuron. Firing in the post-synaptic neuron had to force appreciable reverse current to change the synaptic weight, represented by the memristor state variable w/D (see [1] for the definition). Here, the drive transistor Minj is moved to the output of each neuron and applies the voltage Vinj to all the output synapses during an action potential. Given that neurons in the brain form up to 10 000 connections [9], [10], [15], the drive transistor must be capable of delivering a significant current through that number of memristive devices. If necessary, this can be accomplished by increasing the device width. As before, the synapse consists of one transistor and one memristor, but here the feedback path from the post-synaptic neuron is more effective in altering the synaptic weight, and the overall system is more stable.

Results of simulating this circuit using HSPICE are shown in Fig. 1(c) and (d). The transistor and memristor device models used for the simulation have been previously described in detail in [12] and [13]. Ambipolar nc-Si TFTs and inverters with very similar characteristics to the model have been fabricated recently [16]–[18], and further discussion of the validity of the memristor model is given in Section V. To test the circuit, an input current Iin is applied to the neuron, which results in voltage spikes at the output node Vout [Fig. 1(c)]. The corresponding inter-spike interval (ISI) histogram shown in Fig. 1(d) is approximately Poisson-distributed, consistent with biological measurements [8]. The input current is composed of points spaced at 10-ms intervals, Gaussian-distributed around the desired DC average of 5 nA with a standard deviation of 0.5 nA. The same standard deviation is used throughout this paper, except for the fundamental frequency extraction in Section IV-C.

When a current ramp is applied to the neuron input, measurement of the frequency results in the f–I plot (also known as a discharge curve) shown in the inset of Fig. 1(d). The simulation result is similar in shape and magnitude to biological measurements obtained from the cat neocortex [19]. The two sets of f–I data represent the response before and after spike frequency adaptation [8] (at the first ISI after stimulus and at steady state).
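A sketch of this stimulus generation and the ISI analysis is given below. The PWL file name and the spike-time source are hypothetical; in the actual flow, the spike times would be extracted from the HSPICE Vout waveform rather than generated as a placeholder.

    % Noisy injection current: points every 10 ms, Gaussian around 5-nA DC.
    t = 0:0.01:1;                          % 1-s stimulus
    I = 5e-9 + 0.5e-9*randn(size(t));      % sigma = 0.5 nA

    fid = fopen('iin_pwl.txt', 'w');       % hypothetical PWL file for HSPICE
    fprintf(fid, '%g %g\n', [t; I]);
    fclose(fid);

    % ISI histogram from spike times (here a placeholder train; in practice
    % tspike comes from threshold crossings of the simulated Vout).
    tspike = cumsum(0.004 + 0.008*rand(1, 150));   % placeholder spike times (s)
    isi = diff(tspike)*1e3;                % inter-spike intervals (ms)
    hist(isi, 20);                         % compare with Fig. 1(d)
    xlabel('ISI (ms)'); ylabel('Count');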


Since this neuron circuit does not have spike-frequency adaptation capability, the figure simply demonstrates that its f–I response is a close approximation to the behavior of real neurons.

Following the characterization of a single neuron, the effects of connecting multiple neurons together must be considered. The connection diagram of a two-neuron configuration is shown in Fig. 2(a). An input current pulse is applied to N1 to cause the firing of a single action potential (while Isyn = 0). The EPSP that results at the N2 input is shown in Fig. 2(b) for various synaptic weights. The EPSP shape compares well with biological measurements [8]. However, due to the nc-Si device and circuit characteristics, the magnitude of the EPSP is much larger than in biological systems. The addition of the transistor Mlimit [Fig. 1(a)] was an attempt to reduce its height to more realistic (mV) levels. By biasing that transistor in subthreshold, the input resistance of the neuron is increased, and the EPSP magnitude indeed decreases. At the same time, the added resistance significantly decreases the efficacy of changing synaptic weights; in other words, differences in EPSP between w/D = 0.8 and 0.2 with small Vlimit are negligible. Thus, Mlimit is kept turned on (Vlimit = 5 V). A more detailed operational description of this portion of the circuit is given in Section V.

Integration of the EPSPs due to multiple input spikes is shown in Fig. 2(c), eventually resulting in post-synaptic firing at around 42 ms (after six spikes). The stimulus required to initiate a post-synaptic action potential is generally determined by two main parameters, the first being the synaptic weight. Stronger weights correspond to smaller instantaneous resistance values of the memristor, and thus more charge injected onto the input capacitor C1. The second is the frequency of the input action potentials, which determines how quickly the voltage across the capacitor builds up. This is closely related to the value of the leakage current set by Vleak. If several input neurons are feeding in, all of their EPSPs add up at the feed node of the output neuron. A phase-locking capability is implied by this result [20] and will be explored in Section IV.

III. SYNAPTIC LEARNING RULE

The synaptic learning rule can be characterized using the same two-neuron configuration as in the previous section [Fig. 2(a)]. Noisy excitation currents [similar to Fig. 1(c)] with various DC offsets are generated using a MATLAB program. These input signals can be applied to both Iin and Isyn (driving the pre- and post-synaptic neurons, respectively). Output spiking distributions similar to that in Fig. 1(d) are produced; however, the distribution shifts depending on the DC offset of the signal. Much larger DC current values produce higher spiking frequencies, and thus smaller ISIs.

A one-second transient HSPICE simulation is run for many different pairs of input excitation currents. The MATLAB program then loads the HSPICE output and automatically measures the desired quantities. These include the average firing rates of N1 and N2 (<fN1> and <fN2>) and the slope of w/D. The former quantities are calculated from the number of spikes that occur in the one-second simulation.


Fig. 3. Synaptic learning rule is the temporal change in synaptic weight as a function of the average firing rates of the pre- (<fN1>) and post-synaptic (<fN2>) neurons. (a) and (b) show the pseudocolor and line plots of the data for initial synaptic weight w/D = 0.5 and Vdep = 0.5 V. (c) and (d) show the same, but for initial synaptic weight w/D = 0.2. Finally, (e) and (f) are again for initial w/D = 0.5, but with stronger depression by setting Vdep = 1 V.

They are not determined by plotting the frequency versus time and averaging. The temporal change in w/D is determined via a least-squares linear fit. The resulting data are 3-D. A more detailed discussion of the calculation of these average firing rate learning rules, including plots of the neuron output and changes in synaptic weights as a function of time, can be found in [13].

Once the HSPICE simulations are complete and the data are merged, pseudocolor plots can be generated by MATLAB. Using linear interpolation, the x–y grid (corresponding to the average firing rates of N1 and N2) is refined for a smoother appearance. For initial synaptic weights w/D = 0.5 and 0.2 (Rinit = 25 MΩ and 40 MΩ), these data are presented in Fig. 3(a) and (c), respectively. Then, the value of Vdep is increased from 0.5 to 1 V, increasing the strength of synaptic depression (with initial w/D = 0.5). The resulting learning rule is shown in Fig. 3(e). Note that the color scales are the same in all three plots. The same data can be presented in a different format by plotting the synaptic weight change as a function of the difference in average firing rates. These plots are shown in Fig. 3(b), (d), and (f), and each contains four curves corresponding to specific values of <fN1>.

It is clear from the figures that the overall amount of weight change is much lower in the case with lower initial synaptic weight. However, if the color scale is altered such that its range extends from the minimum to the maximum data points, Fig. 3(c) looks very similar to Fig. 3(a), indicating that the shape of the learning rule does not change significantly for different weight values. In contrast, there is a more pronounced change in the shape of the learning rule when the strength of depression is changed by increasing Vdep.
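A sketch of this post-processing step is shown below, assuming the spike counts, the w/D(t) trace, and the accumulated (rate, rate, slope) triples have already been loaded from the HSPICE output; all variable names are hypothetical.

    % Average rates over the 1-s run (spike counts from the HSPICE output).
    fN1 = nspk_N1/1.0;                 % <fN1> in Hz
    fN2 = nspk_N2/1.0;                 % <fN2> in Hz

    % Temporal weight change: least-squares linear fit of the w/D(t) trace.
    p     = polyfit(t_wD, wD, 1);
    slope = p(1);                      % d(w/D)/dt for this input pair

    % After all pairs are simulated, refine the (fN1, fN2) grid by linear
    % interpolation and draw the pseudocolor learning-rule plot of Fig. 3.
    [F1, F2] = meshgrid(linspace(min(f1v), max(f1v), 200), ...
                        linspace(min(f2v), max(f2v), 200));
    S = griddata(f1v, f2v, slopev, F1, F2, 'linear');
    pcolor(F1, F2, S); shading interp; colorbar;
    xlabel('<f_{N1}> (Hz)'); ylabel('<f_{N2}> (Hz)');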



Fig. 4. Demonstration of associative learning that is analogous to Pavlov's experiments on classical conditioning in dogs. In the three-neuron network configuration shown at the bottom right, increased spiking activity of NA represents the sight of food, NB the ringing of a bell, and Nout the salivation response.

Fig. 5. Scatter plots of the firing times for all neurons in a network consisting of three input neurons feeding one output neuron. With properly initialized synaptic weights, the network is capable of detecting (a) two or (b) three coincident action potentials.

Thus, the stable points in the learning-rule curve, where the spiking is such that the weight does not change over time, can be adjusted by changing Vdep. For that reason, the voltage should essentially be considered an adjustable parameter that may be altered depending on the specific application.

IV. NEURAL NETWORK PROPERTIES

Given the neuron circuit properties and synaptic learning rule, it is a natural progression to explore the consequences of unsupervised learning in larger neural networks. The concepts presented in this section relate to natural neural-network behavior, but may also extend to more general signal-processing applications.

A. Associative Learning

A three-neuron (two inputs, one output) network was simulated to test the learning mechanisms in a classical conditioning experiment [21]. It is analogous to the seminal research done by Pavlov on the salivation response in dogs. As shown in Fig. 4, increased activity (frequency of approximately 100 Hz) of the input neurons NA and NB represents the unconditioned stimulus (sight of food) and the conditioned stimulus (ringing of a bell), respectively. The firing rate of these neurons is controlled by noisy input currents that switch between 0- and 7-nA DC offsets. Rapid firing of the output Nout represents the response (salivation). Initially, the excitation of NB (with a weak connection to Nout) does not result in a response, while firing of NA (with a strong connection to Nout) does. This stage is labeled 'Probing' in Fig. 4. After a period of simultaneous conditioning (labeled 'Learning'), the bell ringing alone results in the conditioned response of salivation. The value of Vdep used was 0.5 V.
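A possible construction of the conditioning currents is sketched below; the phase boundaries are illustrative choices, not the exact times used in Fig. 4.

    % Stimulus schedule: active neurons get a noisy 7-nA offset, else 0 nA.
    dt = 0.01;  t = 0:dt:6;                      % 10-ms points, 6-s run (assumed)
    actA = (t < 1) | (t >= 2 & t < 4);           % food: probing, then pairing
    actB = (t >= 1 & t < 2) | (t >= 2 & t < 4) | (t >= 5);  % bell: probe/pair/test
    IA = 7e-9*actA + 0.5e-9*randn(size(t));      % injection current for NA
    IB = 7e-9*actB + 0.5e-9*randn(size(t));      % injection current for NB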

A similar experiment was performed in [22] and [23], but with microcontroller-based neurons and synapses. The neuron and synapse interactions in that work are less realistic than here, in that the action potentials and overall connectivity do not directly mimic those in biology. Also, it should be pointed out that the association as presented in Fig. 4 can be unlearned. This will happen when the food is continuously presented without the bell (NA is active while NB is not).

B. Coincidence Detection

Another feature of the neuron circuit is its ability to detect temporal pulse coincidences from multiple input neurons. This attribute is also present in biology [24]. Fig. 5 shows spike times of a four-neuron network (three afferent inputs feeding one output) on a scatter plot. By appropriately initializing the synaptic weights, the output can detect either two [Fig. 5(a)] or three [Fig. 5(b)] coincident events. Extending this capability to many more input neurons is possible, and is in fact the basis for the results presented in the following subsection. It may also be the foundation for other functions such as sound localization in the auditory system [20], [25].
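A simple off-line check of such coincidences, given spike-time vectors tA, tB, and tC extracted from the three afferents, might look as follows; the 2-ms window is an illustrative assumption.

    % Count two- and three-way coincidences among afferent spike trains.
    win = 2e-3;                                   % coincidence window (s), assumed
    nearB   = arrayfun(@(s) any(abs(tB - s) < win), tA);
    coincAB = tA(nearB);                          % A spikes with a nearby B spike
    nearC   = arrayfun(@(s) any(abs(tC - s) < win), coincAB);
    coinc3  = coincAB(nearC);                     % ... and a nearby C spike
    fprintf('%d two-way and %d three-way coincidences\n', ...
            numel(coincAB), numel(coinc3));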



Fig. 6. Network with eight afferent neurons feeding one output is able to extract a 10-Hz fundamental frequency by filtering noisy input signals. (a) Example of one of the input signals (Iin,A) is shown along with the resulting voltage output (Vout) of the afferent neuron NA. For clarity, only the first two seconds of the 60-s simulation are shown. (b) Output voltage of the output neuron Nout for the last two seconds of the simulation. (c) and (d) ISI histograms for these same two signals. (e) Finally, the change in synaptic weight as a function of time for initial w/D = 0.5. The inset has similar features even though the initial synaptic weights were randomly distributed.

Fig. 7. (a) Synaptic weight of the out-of-phase neuron NA increases compared to the rest. ISI histograms of (b) neuron A and (c) the output neuron over the final 30 s, analogous to Fig. 6(c) and (d).

C. Fundamental Frequency Recognition

This application utilizes both the coincidence- and phase-detection capabilities of the neuron circuit. Phase detection refers to the higher firing rate when a periodic excitation current is of larger magnitude. For example, a purely sinusoidal injection current with a DC offset would result in spikes that are generally phase-locked to the sinusoid peak [20]. But if the input signals are noisy, possibly with different frequency components, many input neurons can be used to filter out the noise and extract the fundamental. This concept is similar to brain waves measured by electroencephalography (EEG), where synchrony of action potentials between large neuron populations results in coherent frequency bands [21]. It is also similar to the stochastic resonance observed in a variety of physical systems [26].

The network in this subsection consists of eight afferent neurons NA through NH, feeding one output neuron Nout [see inset of Fig. 6(e)]. The noisy input signals Iin,A through Iin,H that are applied to each of the afferent neurons are created using MATLAB commands. First, a 10-Hz sinusoid with 4-nA peak-to-peak amplitude and 2-nA offset is generated, and a small frequency modulation is added. Each signal has a frequency-modulation index of 0.2, but a random modulation frequency between 0 and 5 Hz. Then, 6 dB of white Gaussian noise is added. One example of an input signal and the resulting output voltage spikes is shown in Fig. 6(a) (for NA); its specific modulation frequency is 2.2 Hz. The total length of the generated signal and the transient HSPICE simulation is 60 s, but only the first two seconds are shown.
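The input construction just described can be sketched as follows, interpreting the "6 dB of white Gaussian noise" as a 6-dB signal-to-noise ratio relative to the sinusoidal component; the 1-kHz time grid is an assumption.

    % One frequency-modulated, noisy afferent input for Section IV-C.
    fs = 1e3;  t = (0:1/fs:60).';          % 60-s signal on an assumed 1-kHz grid
    f0 = 10;   A  = 2e-9;  Idc = 2e-9;     % 10-Hz sinusoid, 4-nA p-p, 2-nA offset
    beta = 0.2;  fm = 5*rand;              % FM index 0.2, fm drawn from [0, 5] Hz
    x  = Idc + A*sin(2*pi*f0*t + beta*sin(2*pi*fm*t));
    Ps = A^2/2;                            % power of the sinusoidal component
    n  = sqrt(Ps/10^(6/10))*randn(size(t));% white noise for a 6-dB SNR
    Iin_A = x + n;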

The output voltage of the output neuron is shown in Fig. 6(b) for the last two seconds of the simulation, after learning has occurred. ISI histograms shown in Fig. 6(c) and (d) illustrate the filtering capability of this network. Most of the afferent neurons generate relatively high-frequency spikes due to the noise in the input signals. When all the synaptic weights are initialized to w/D = 0.5, the output neuron behaves similarly at first. However, synaptic learning begins to take place, and the weights decrease rapidly and begin to saturate at a constant level (Vdep = 1.3 V). Thus, the ISI distribution of Fig. 6(d) only includes the last 30 s of the simulation. The synaptic weight change over the entire duration of the simulation for this case is shown in Fig. 6(e). The network is learning that there is a fundamental component to the inputs, and the synaptic weights saturate at approximately w/D = 0.1.

For a given number of afferent neurons, the saturation level is determined by several factors. Generally, they are the amount of current injected during a pre-synaptic action potential and the threshold of the input inverter. More specifically, the injection and depression voltages Vinj and Vdep, the resistance of the memristive device in the synapse, and the value of the input capacitor all play a role. When there are fewer afferent neurons, the saturation level increases, but the inputs must have a larger signal-to-noise ratio (SNR) in order to be detected by the network. In these cases, it was observed that when the synaptic weights are initially set low, say w/D = 0.2, they all increase to the stable saturation level. The synaptic weights can also be set randomly, as shown in the inset of Fig. 6(e); almost regardless of the initial values, the weights converge to a common stable level. A more realistic initial condition for the synaptic weights would be a Gaussian distribution around a specific mean, but that situation gives a similar result to the two cases shown in Fig. 6(e).

The addition of many more afferent neurons, or of multiple network layers, should result in the ability to detect even lower SNR. At this point, however, the design allows for a maximum of approximately 10 afferent neurons. The reason is that for larger numbers, the saturation level will decrease essentially to w/D = 0, so that Rmem = Roff, and information about the nature of the individual input signals will be lost.

A related experiment is to determine how the network responds when one or two of the input signals are different from the rest. Many options are available for altering the input, but the simplest parameter to change is the phase. Thus, the same simulation is performed again, but with the fundamental sinusoid of Iin,A 180° out of phase with the rest.



Fig. 8. Variations of the synaptic weights during 60-s transient simulations are shown, analogous to Figs. 6(e) and 7(a). The fundamental sinusoid of Iin,A is (a) 90° and (b) 270° out of phase with the population (eight afferents). (c) Iin,A is 90° and Iin,B is 270° out of phase, and the synaptic weight change is similar to (a) and (b). (d) Network also detects the doubling of the Iin,A fundamental frequency to 20 Hz.

Fig. 9. Input to the capacitor of the post-synaptic neuron circuit (driven by the injection voltage Vinj through the memristive synapse) can be represented by an R-C network. This is useful for examining what parameters affect the EPSP shape and for determining the capacitor values.

The result is that the weight of synapse 'a' increases instead of decreasing during the simulation, as shown in Fig. 7(a). Essentially, the network is detecting that there is something different about the input to that neuron. Note that the ISI histograms shown in Fig. 7(b) and (c) are analogous, and very similar, to those in Fig. 6(c) and (d). This indicates two things about the network behavior that are extremely important. First, the network can learn to detect signals that differ from the general population without supervision. This is significant because those signals contain important information that is absent in the rest of the neurons. Second, the results demonstrate that the learning rule is clearly not based on average firing rates alone. The average firing rate of NA is the same as that of NB through NH; the only difference is the phase shift, which indicates that the learning rule has a very important temporal component.

Other signal variations, such as 90° and 270° phase shifts, produced qualitatively similar results but different changes in the synaptic weight of 'a', as shown in Fig. 8(a) and (b), respectively. When these two phase shifts are applied to Iin,A and Iin,B, both synaptic weights change similarly to when only one signal is altered [Fig. 8(c)]. Increasing the input frequency to 20 Hz also resulted in increased weight, as shown in Fig. 8(d). Many more variations can be explored to further investigate the parallels with biology. For instance, the 10-Hz fundamental frequency used here lies within the α-band in EEG measurements. Variations in the oscillation magnitude may relate to short-term memory storage [27], and phase correlations with the γ-band (30+ Hz) vary with attention-related cognitive tasks [28].

V. DISCUSSION

Two important design aspects for these neural circuits are discussed in this section. First, a more detailed analysis of the connection between two neurons via the synapse is provided.

This includes explanation of how the EPSP arises in the post-synaptic neuron, and how the resistance of the memristive device and the size of the capacitors determine its behavior. Then, the EPSPs and synaptic learning rule using a different memristor model are presented. A comparison demonstrates that different circuit configurations and device properties will produce similar results.

A. Circuit Design Considerations

The main goals for eventual implementation of this system are decreased EPSP magnitude and input-capacitor size. These changes will result in more realistic circuit behavior and increased density. Both can be accomplished by reducing the amount of current injected by a synapse at each pre-synaptic spike. Equivalently, the injection voltage Vinj should be small, and the resistance of the synapse large, to avoid the use of large capacitors. A simplified version of the circuit from the output of the pre-synaptic neuron (Vinj), through the memristive synapse (Rmem), to the input of the post-synaptic neuron is shown in Fig. 9. Each device involved is modeled as a simple resistor (except the capacitor Ceq, representing the sum of C1 and C2). By straightforward analysis, when the switch (Minj) closes at the start of a pre-synaptic action potential, the capacitor is charged with time constant

\tau_{charge} = \frac{C_{eq} R_{leak} (R_{mem} + R_{limit})}{R_{leak} + R_{mem} + R_{limit}}.    (1)

During discharge, the time constant for decay is simply

\tau_{discharge} = C_{eq} R_{leak}    (2)

and the magnitude of the EPSP changes because of the τcharge dependence on Rmem. Of course, reducing the voltage Vlimit to increase Rlimit will also reduce the EPSP magnitude. At the same time, it will make changes in Rmem less effective, since the two are in series. The bias voltage Vleak determines Rleak, which generally controls how quickly the EPSP decays.
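A quick numeric check of (1) and (2) illustrates the scales involved. The memristor values follow the 25-MΩ/40-MΩ examples of Section III, while Ceq, Rleak, and Rlimit are illustrative assumptions.

    % Evaluate the charge and discharge time constants of Fig. 9.
    Ceq    = 10e-12;                  % C1 + C2 (F), assumed
    Rleak  = 1e9;                     % set by Vleak (ohm), assumed
    Rlimit = 1e6;                     % Mlimit on-resistance (ohm), assumed
    tau_d  = Ceq*Rleak;               % eq. (2): discharge time constant
    for Rmem = [25e6 40e6]            % w/D = 0.5 and 0.2 (Section III)
        tau_c = Ceq*Rleak*(Rmem + Rlimit)/(Rleak + Rmem + Rlimit);  % eq. (1)
        fprintf('Rmem = %2.0f Mohm: tau_charge = %.3f ms, tau_discharge = %.0f ms\n', ...
                Rmem/1e6, tau_c*1e3, tau_d*1e3);
    end

Since Rleak is much larger than Rmem + Rlimit under these assumptions, the EPSP rises quickly (sub-millisecond) and decays slowly (tens of milliseconds), consistent with the shape in Fig. 2(b).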


Fig. 10. (a) Ideal voltage source is inserted into the circuit of Fig. 1(a) that amplifies the signal at Vcap. With other changes in the circuit parameters, the discharge curve produced is similar to Fig. 1(d). (b) EPSP for different synaptic weights using the circuit parameters in (a) and the memristor model described in [35]. (c) Average firing rate Hebbian learning rule is similar to before, with a clear transition from potentiation to depression at <fN1> = <fN2> that can be seen in (d).

In the circuit of Fig. 1(a), reduction of the injection voltage Vinj is not a viable option, because the threshold of the M3/M4 inverter is set at approximately 1.6 V. Thus, Vinj must be greater than this value for the capacitor voltage Vcap to become large enough to switch that inverter. Some amplification of the capacitor voltage may be necessary in future implementations (see Section V-B). Regardless, if the on-resistance Ron of the memristive device is too small, a large Ceq will still be required so that the output neuron is not overstimulated. If too much current is injected during each action potential, important timing and firing-rate information will be lost at the output neuron when the number of afferents is large.

To summarize, the limiting factor that will determine the density of these circuits in the future is the size of the capacitors. Smaller capacitors can be used if the amount of injection current from the dendritic synapses is reduced. This can be accomplished by decreasing Vinj and increasing the absolute resistance of the memristive device. The on/off resistance ratio Ron/Roff is probably not as important here as in digital memory systems. Also, these design considerations are not limited to the neuron circuit presented here, but are common to many circuits that use an input capacitor to represent the membrane and integrate the incoming synaptic currents [29]–[31]. For comparison, the total membrane capacitance of a typical cortical neuron can be calculated to be approximately 11 pF, on average [32].

B. Memristor Model

Measurements of real memristive devices contain nonlinearities not incorporated into the current device model that merit investigation. Sharp switching, which may take the form of a hyperbolic-sine dependence, is one example [33], [34]. Also, asymmetry between potentiation and depression is typically observed [33]. In these circuits, however, control over the amount of potentiation or depression is obtained through the generally adjustable bias voltages Vinj and Vdep, respectively. Of course, the value of the injection voltage is not entirely adjustable, as described in the previous section, but will eventually be chosen based on the specific characteristics of the memristive device, the maximum tolerable capacitance, and possibly the properties of the M3/M4 inverter in the neuron circuit.

Determining the extent to which the properties of the memristive device matter is the subject of this section. To begin, the SPICE model of [35] was implemented, which has similar characteristics to the devices reported in [34]. Because of the very low on-resistance Ron, the neuron circuit of Fig. 1 required modification.


The input to the M3/M4 inverter (gate connections of M3 and M4) was disconnected from the node Vcap and connected to an ideal controlled voltage source. The value of the source is 10 times the voltage across the capacitor, Vcap, so it essentially acts as a voltage amplifier. Other circuit connections were kept the same, but the bias voltages and the capacitor values were altered as in Fig. 10(a), which displays the resulting discharge curve. With all these changes, the spiking characteristics are very similar to those of the previous design.

Next, two neurons were connected together [see Fig. 2(a)] to measure the EPSP (Vcap) for various synaptic weights, as shown in Fig. 10(b). Note in this figure that the EPSP magnitude is approximately 0.16 V maximum, due to the reduction of Vinj to 0.18 V. It still reaches the trip point of the M3/M4 inverter because of the amplifying voltage source. For small w/D, the EPSP magnitude is approximately 50% of the maximum (when w/D is large).

Following examination of the EPSPs that result from this configuration, the full average firing rate Hebbian learning rule was calculated. The pseudocolor plot is shown in Fig. 10(c), with the line plots in Fig. 10(d). In this case, the same protocol was used as for Fig. 3, except that the standard deviation of the noise was increased to 0.75 nA. Also, the initial synaptic weight was set to w/D = 0.5, and Vdep and Vleak were set as shown in Fig. 10(a). Although it is much noisier, the general behavior is analogous to the previous simulations in Fig. 3 and [13]. Specifically, there is strong potentiation when <fN1> is slightly greater than <fN2>, but weaker potentiation when it is much greater. As before, depression occurs in the opposite case (when <fN2> is greater than <fN1>). This result indicates that proper circuit design can overcome large variations in the memristive device characteristics to obtain the correct spiking and learning behavior.

VI. CONCLUSION

HSPICE simulations show that neural circuits composed of memristors and ambipolar nano-crystalline silicon transistors exhibit characteristics similar to biology. The subcircuit building blocks are scalable to networks with many neurons, and can perform various important functions depending only on their connectivity and the synaptic strengths. Due to area constraints, compact circuits with dense integration capability, such as those shown here, will be essential for human brain-scale networks and systems. A discussion of the device metrics necessary to enable dense circuit fabrication was also provided.

In addition, it was shown that these networks are capable of unsupervised learning tasks, much like biological systems. The clearest example is that of classical conditioning and associative learning. The networks also demonstrate the ability to detect coincident action potentials, which has many uses, including the learned extraction of common fundamental-frequency components from noisy input signals. A related task is the learned recognition of signals that are not coherent with the general population. This is especially important since the detected signal contains important information that is not available in the rest of the inputs.



This kind of synchrony has many analogies in biological systems. Finally, the specific detection of out-of-phase signals, as well as signals with different frequencies, is indicative of a spike-timing-dependent component in the synaptic learning rule.

REFERENCES

[1] D. B. Strukov, G. S. Snider, D. R. Stewart, and R. S. Williams, "The missing memristor found," Nature, vol. 453, pp. 80–83, May 2008.
[2] G. Q. Bi and M. M. Poo, "Synaptic modification by correlated activity: Hebb's postulate revisited," Annu. Rev. Neurosci., vol. 24, pp. 139–166, Mar. 2001.
[3] G. S. Snider, "Spike-timing-dependent learning in memristive nanodevices," in Proc. NANOARCH, Jun. 2008, pp. 85–92.
[4] S. H. Jo, T. Chang, I. Ebong, B. B. Bhadviya, P. Mazumder, and W. Lu, "Nanoscale memristor device as synapse in neuromorphic systems," Nano Lett., vol. 10, no. 4, pp. 1297–1301, Mar. 2010.
[5] B. Linares-Barranco and T. Serrano-Gotarredona, "Exploiting memristance in adaptive asynchronous spiking neuromorphic nanotechnology systems," in Proc. 9th IEEE Conf. Nanotechnol., Genoa, Italy, Jul. 2009, pp. 601–604.
[6] C. Zamarreño-Ramos, L. Camuñas-Mesa, J. A. Pérez-Carrasco, T. Masquelier, T. Serrano-Gotarredona, and B. Linares-Barranco, "On spike-timing-dependent-plasticity, memristive devices, and building a self-learning visual cortex," Front. Neurosci., vol. 5, no. 26, pp. 1–36, 2011.
[7] D. O. Hebb, The Organization of Behavior: A Neuropsychological Theory. New York: Wiley, 1949.
[8] C. Koch, Biophysics of Computation. New York: Oxford Univ. Press, 1999.
[9] B. Pakkenberg and H. J. G. Gundersen, "Neocortical neuron number in humans: Effect of sex and age," J. Comparat. Neurol., vol. 384, no. 2, pp. 312–320, Jul. 1997.
[10] Y. Tang, J. R. Nyengaard, D. M. G. DeGroot, and H. J. G. Gundersen, "Total regional and global number of synapses in the human brain neocortex," Synapse, vol. 41, no. 3, pp. 258–273, Sep. 2001.
[11] K. A. Boahen, "Point-to-point connectivity between neuromorphic chips using address events," IEEE Trans. Circuits Syst. II: Analog Digital Signal Process., vol. 47, no. 5, pp. 416–434, May 2000.
[12] K. D. Cantley, A. Subramaniam, H. J. Stiegler, R. A. Chapman, and E. M. Vogel, "SPICE simulation of nanoscale non-crystalline silicon TFTs in spiking neuron circuits," in Proc. 53rd IEEE Int. Midwest Symp. Circuits Syst., Seattle, WA, Aug. 2010, pp. 1202–1205.
[13] K. D. Cantley, A. Subramaniam, H. Stiegler, R. Chapman, and E. Vogel, "Hebbian learning in spiking neural networks with nanocrystalline silicon TFTs and memristive synapses," IEEE Trans. Nanotechnol., vol. 10, no. 5, pp. 1066–1073, Sep. 2011.
[14] C. Mead, Analog VLSI and Neural Systems. Reading, MA: Addison-Wesley, 1989.
[15] J. Hawkins and S. Blakeslee, On Intelligence. New York: Henry Holt, 2004.
[16] A. Subramaniam, K. D. Cantley, R. A. Chapman, B. Chakrabarti, and E. M. Vogel, "Ambipolar nano-crystalline-silicon TFTs with submicron dimensions and reduced threshold voltage shift," in Proc. 69th Annu. Device Res. Conf. Dig., Santa Barbara, CA, Jun. 2011, pp. 99–100.
[17] A. Subramaniam, K. D. Cantley, R. A. Chapman, H. J. Stiegler, and E. M. Vogel, "Submicron ambipolar nanocrystalline-silicon TFTs with high-κ gate dielectrics," in Proc. Int. Semicond. Device Res. Symp., College Park, MD, 2011, pp. 1–2.
[18] A. Subramaniam, K. D. Cantley, H. J. Stiegler, R. A. Chapman, and E. M. Vogel, "Submicron ambipolar nanocrystalline silicon thin-film transistors and inverters," IEEE Trans. Electron Devices, vol. 59, no. 2, pp. 359–366, Dec. 2011.
[19] C. E. Stafstrom, P. C. Schwindt, and W. E. Crill, "Repetitive firing in layer V neurons from cat neocortex in vitro," J. Neurophysiol., vol. 52, no. 2, pp. 264–277, Aug. 1984.
[20] W. Maass and C. M. Bishop, Pulsed Neural Networks. Cambridge, MA: MIT Press, 1999.
[21] E. R. Kandel, J. H. Schwartz, and T. M. Jessell, Principles of Neural Science, 4th ed. New York: McGraw-Hill, 2000.
[22] Y. V. Pershin and M. Di Ventra, "Experimental demonstration of associative memory with memristive neural networks," Neural Netw., vol. 23, no. 7, pp. 881–886, Sep. 2010.
[23] Y. V. Pershin and M. Di Ventra, "Neuromorphic, digital and quantum computation with memory circuit elements," Proc. IEEE, 2011, DOI: 10.1109/JPROC.2011.2166369.
[24] P. König, A. K. Engel, and W. Singer, "Integrator or coincidence detector? The role of the cortical neuron revisited," Trends Neurosci., vol. 19, no. 4, pp. 130–137, Apr. 1996.
[25] P. X. Joris, P. H. Smith, and T. C. T. Yin, "Coincidence detection in the auditory system: 50 years after Jeffress," Neuron, vol. 21, no. 6, pp. 1235–1238, 1998.
[26] K. Wiesenfeld and F. Moss, "Stochastic resonance and the benefits of noise: From ice ages to crayfish and SQUIDs," Nature, vol. 373, no. 6509, pp. 33–36, 1995.
[27] O. Jensen, J. Gelfand, J. Kounios, and J. E. Lisman, "Oscillations in the alpha band (9–12 Hz) increase with memory load during retention in a short-term memory task," Cerebral Cortex, vol. 12, no. 8, pp. 877–882, Aug. 2002.
[28] J. M. Palva, S. Palva, and K. Kaila, "Phase synchrony among neuronal oscillations in the human cortex," J. Neurosci., vol. 25, no. 15, pp. 3962–3972, Apr. 2005.
[29] G. Indiveri, E. Chicca, and R. Douglas, "A VLSI array of low-power spiking neurons and bistable synapses with spike-timing dependent plasticity," IEEE Trans. Neural Netw., vol. 17, no. 1, pp. 211–221, Jan. 2006.
[30] A. van Schaik, "Building blocks for electronic spiking neural networks," Neural Netw., vol. 14, nos. 6–7, pp. 617–628, 2001.
[31] E. Chicca, D. Badoni, V. Dante, M. D'Andreagiovanni, G. Salina, L. Carota, S. Fusi, and P. Del Giudice, "A VLSI recurrent network of integrate-and-fire neurons connected by plastic synapses with long-term memory," IEEE Trans. Neural Netw., vol. 14, no. 5, pp. 1297–1307, Sep. 2003.
[32] L. J. Gentet, G. J. Stuart, and J. D. Clements, "Direct measurement of specific membrane capacitance in neurons," Biophys. J., vol. 79, no. 1, pp. 314–320, 2000.
[33] Y. V. Pershin and M. Di Ventra, "Memory effects in complex materials and nanoscale systems," Adv. Phys., vol. 60, no. 2, pp. 145–227, 2011.
[34] J. J. Yang, M. D. Pickett, X. Li, D. A. A. Ohlberg, D. R. Stewart, and R. S. Williams, "Memristive switching mechanism for metal/oxide/metal nanodevices," Nature Nanotechnol., vol. 3, no. 7, pp. 429–433, 2008.
[35] E. Lehtonen and M. Laiho, "CNN using memristors for neighborhood connections," in Proc. 12th Int. Workshop Cellular Nanoscale Netw. Appl., Feb. 2010, pp. 1–4.

Kurtis D. Cantley (S'02–M'07) received the B.S.E.E. degree from Washington State University, Pullman, in 2005, where he worked in the National Security Internship Program at Pacific Northwest National Laboratory. He received the M.S.E.E. degree from Purdue University, West Lafayette, IN, in 2007, where his research involved simulation of III–V materials in nanoscale transistors, and the Ph.D. degree in electrical engineering from the University of Texas at Dallas in 2011, with funding from the National Defense Science and Engineering Graduate Fellowship. His current research interests include nanoscale devices and materials for implementing artificial neural circuits.

Anand Subramaniam (S’05) received the B.Tech. degree in electronics and communications from the National Institute of Technology, Calicut, India, in 2007. He has been pursuing the Ph.D. degree in electrical engineering with the University of Texas at Dallas, Dallas, since August 2009. His current research interests include neuromorphic circuit applications, spike-timing based plasticity, and synaptic learning rules.


Harvey J. Stiegler (S'70–M'80) received the B.S.E.E. degree from Texas Tech University, Lubbock, and the M.S. and Ph.D. degrees from Rice University, Houston, TX, in 1973, 1985, and 1989, respectively. He served in the U.S. Air Force until joining Texas Instruments (TI) Inc., Dallas, in 1979. He was involved in the development of nonvolatile memory products and dynamic random access memories, as well as analog and mixed-signal products, as a Design Engineer and Design Manager at TI. In 2009, he joined the University of Texas at Dallas as a Research Scientist in materials science and engineering. Dr. Stiegler was elected a Senior Member of the Technical Staff at TI in 1993. He is a member of the American Physical Society and the American Association for the Advancement of Science.

Richard A. Chapman (M'78–SM'93–F'98) received the B.A., M.A., and Ph.D. degrees in physics from Rice University, Houston, TX, in 1954, 1955, and 1957, respectively. He was with General Electric Corporation, Fairfield, CT, for two years. He joined Texas Instruments (TI) Inc., Dallas, in 1959, specializing in the scaling of CMOS transistors. From 1999 to 2007, he was a Consultant at TI. Since 2008, he has been a Research Scientist with the University of Texas at Dallas. Dr. Chapman was the co-recipient of the IEEE Jack Morton Award in 1987 for his earlier work on HgCdTe charge-transfer IR sensor arrays. He is a fellow of the American Physical Society. He was the General Chairman of the IEEE Symposium on VLSI Technology, after having served as Program Chairman in 1996, Secretary, and Local Arrangements Chairman.


Eric M. Vogel (M'93–SM'03) received the B.S. degree in electrical engineering from Penn State University, University Park, and the Ph.D. degree in electrical engineering from North Carolina State University, Raleigh, in 1994 and 1998, respectively. He joined the National Institute of Standards and Technology, Gaithersburg, MD, becoming a Leader of the CMOS and Novel Devices Group in 2001 and the Founding Director of the NIST NanoFab in 2003. He was an Associate Professor of materials science and engineering and electrical engineering with the University of Texas at Dallas from August 2006 to August 2011. He is currently a Professor of materials science and engineering with the Georgia Institute of Technology, Atlanta. He has published over 110 archival publications and five book chapters, and has given over 50 invited talks and tutorials. His current research interests include materials and devices for future electronics.
