LETTER

Communicated by Dan Goodman

Toward Unified Hybrid Simulation Techniques for Spiking Neural Networks

Michiel D’Haene [email protected]

Michiel Hermans [email protected]

Benjamin Schrauwen [email protected]
ELIS Department, Ghent University, 9000 Ghent, Belgium

In the field of neural network simulation techniques, the common conception is that spiking neural network simulators can be divided into two categories: time-step-based and event-driven methods. In this letter, we look at state-of-the-art simulation techniques in both categories and show that a clear distinction between the two is increasingly difficult to define. In an attempt to improve the weak points of each simulation method, ideas from the alternative method are, sometimes unknowingly, incorporated into the simulation engine. Clearly, the ideal simulation method is a mix of both. We formulate the key properties of such an efficient and generally applicable hybrid approach.

1 Introduction

A biological neural network is a densely connected, massively parallel computing system, where neurons can be connected to each other with tens of thousands of physical “wires” in a three-dimensional fashion. We are not able to reproduce such a dense and rich physical computing system. Instead, we simulate these networks using more conventional architectures. A large range of hardware architectures can be used to simulate spiking neural networks, from general-purpose computing platforms to specialized analog or digital hardware implementations. In this letter, we concentrate on simulation techniques on general-purpose computing platforms, and to some extent graphical processing units (GPUs), which are by far the most often used computing platforms due to their low cost, wide availability, and flexibility. However, from an architectural point of view, these are also among the least suited platforms: spiking neural networks are in essence massively parallel computing systems, while general-purpose computing platforms consist of a relatively small number of fast but single-threaded processing units.

As such, the choice and efficiency of the simulation algorithm have a huge impact on the simulation speed.

A wide range of simulation tools for spiking neural networks (SNNs) on general-purpose computers is available today. An overview of some of them is given in Brette et al. (2007), but many others exist. All of these implementations can be divided into two simulation schemes: time-step-based and event-driven simulation. In the next section, we take a closer look at both and discuss their shortcomings regarding the simulation of SNNs. One of the remarkable observations when comparing both simulation methods is that they are largely complementary: a weak point of one method is often a strong point of the other. Several improvements have been proposed and implemented to improve the overall efficiency and accuracy of the simulation tools; we elaborate on these improvements in the next section. Because of the complementarity noted above, most of these improvements tend to (sometimes unknowingly) incorporate techniques from the other method.

The purpose of this work is twofold. First, we provide a review of both event-driven and time-step-based simulation strategies. Next, we use this overview to synthesize a road map for a generally applicable hybrid simulation strategy that is both time-step based and event driven and can fully exploit the strong points of both methods. Finally, we discuss how this hybrid method can benefit from the computing power of large computing clusters and GPUs, two architectures becoming increasingly popular due to their performance benefit compared to single general-purpose computers.

2 Overview of Advancements in Simulation Methods

In order to emulate a spiking neuron, a model of the neuron has to be created that describes how the state variables of the neuron, such as the membrane voltage or synaptic conductances, evolve. This model takes the form of a set of differential equations, which can be evaluated analytically or numerically. Typically the model tries to mimic the behavior of a real biological neuron up to a certain modeling accuracy.

The complexity of the neuron model varies depending on the purpose of the simulation. For example, detailed studies of the brain require models that mimic a biological neuron as closely as possible (Markram, 2006). These models can be complex and require a lot of computational resources for their evaluation. In the field of neurocomputing, neurons are used for their computational possibilities. Detailed, biologically realistic neuron models tend to be much more difficult to understand on a circuit or network level, which makes it hard to take computational advantage of the rich dynamics these models create. As a result, simple and understandable models with a lower computational load are often preferred. A wide range of neuron models has already been described and studied for this purpose. We refer to some of these models later, but a detailed overview is beyond the scope of this letter.

SNNs can be simulated using two possible simulation schemes: time-step-based simulation and event-driven simulation. A time-step-based or clock-driven simulator performs an evaluation of the neuron model at regular, predetermined time intervals. This simulation method is also referred to as a synchronous method because all neurons in the system are evaluated at the same time stamp. It offers a generally applicable approach that can be used for arbitrary neuron models.

The first simulators developed for SNN simulations were simple time-grid-constrained time-step-based simulators. With these simulators, time steps are fixed in size, the spikes and interconnection delays are discretized according to the time resolution, and all neurons in the system are updated synchronously at each time step. Due to their simplicity and wide applicability, they are still widely used, and in many studies such an approach suffices, for instance, when the goal is to assess high-level dynamical properties of SNNs. Nevertheless, when detailed biological properties are modeled, highly accurate spike times may be required, and time-step-based methods may suffer from computational inefficiency in this case:

- There is an increasingly large body of evidence that in biological neurons, the exact spike timing matters and carries information (Reich, Mechler, & Victor, 2001; de Ruyter van Steveninck & Laughlin, 1996). It has been shown that neurons can produce spike trains with submillisecond accuracy in vitro (Mainen & Sejnowski, 1995). Also, learning rules such as synaptic plasticity depend on the relative timing of presynaptic and postsynaptic spikes (Abbott & Nelson, 2000). As such, since the time resolution depends on the time step size, small time steps are required in order to reach adequate accuracy (Softky, 1995). Recently the question has been raised as to how fine the resolution of the spike timing should actually be. Cessac and Viéville (2008) and Kirst and Timme (2009) suggested that the main issue is not how fine the resolution actually is in models or data, but whether there is a time discretization at all.

- SNNs are usually characterized by low activity, much lower than the rate at which the neurons are updated. As a result, most of the time the simulator is updating every neuron in the system at a very fine timescale (in order to achieve adequately accurate spike timings) while no perceivable output is generated by the neuron.

The frequent updates of each neuron model, which are not related to the actual dynamics that occur in the model, imply a heavy and inefficient load on the computational resources. Moreover, the time resolution is often not adequate for studies that require very high accuracy unless excessively small time steps are used. As a result, new simulation strategies have been developed with the aim of improving both the efficiency and the accuracy of the simulation.

In event-driven simulation schemes, updates of each neuron in the system are triggered only when neurons receive or emit a spike. This method is also called an asynchronous method, because each neuron is updated independently of the other neurons. At first sight, it offers a very efficient simulation methodology, especially in networks with sparse activity, which can be combined with accurate simulation results. However, the nature of SNNs poses some specific problems that need to be solved before this method becomes useful for large-scale SNNs with a wide range of neuron models.

2.1 Advances in Event-Driven Simulation Strategies.

A large improvement of the simulation efficiency was reached with the introduction of event-driven simulators for SNN simulations, which seem to have the ideal properties: each neuron in the system is asynchronously processed only when activity occurs, exact simulation results can be obtained, and update operations are performed only when necessary, that is, when abrupt changes in the neuron dynamics are induced. In contrast to a time-step-based simulator, the time between two neuron model updates directly depends on the activity in the model. As such, it is not fixed or known beforehand. For each update, the state variables of the neuron are updated from the previous update time to the current time. This is quite simple when an analytical expression of the neuron model is available.

The first event-driven simulators for SNNs supported only very simple, analytically solvable neuron models, such as leaky integrate-and-fire (LIF) models with direct interconnections between the neurons. The LIF model is given by the following equation:

τ_m du(t)/dt = −(u(t) − u_rest) + R I(t),   (2.1)

with u(t) the membrane voltage, R the membrane resistance, τ_m the membrane time constant (equal to RC, with C the membrane capacitance), u_rest the rest potential, and I(t) the input current. Spikes are represented by Dirac pulses. In this simple network, the Dirac pulses are fed directly into the receiving neurons, causing instantaneous jumps in the membrane potential. Between two spikes, the membrane equation can be analytically solved for u(t):

u(t) = u_rest + (u(0) − u_rest) e^(−t/τ_m).   (2.2)

A spike is emitted when u(t) reaches a certain threshold. Since u(t) is a monotonically decreasing function between two incoming spikes, this can happen only synchronously with the arrival of an incoming spike. Highly efficient event-driven implementations for these models have been presented (e.g., Watts, 1993; Delorme, 2001), which work well for small to medium-sized networks.
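
To make the event-driven update concrete, the following sketch (ours, not taken from the implementations cited above; the class and attribute names are illustrative) advances a LIF neuron exactly from spike to spike using equation 2.2 and checks the threshold only when an input arrives:

```python
import math

class LIFNeuron:
    """Event-driven LIF neuron with Dirac-pulse inputs (equations 2.1-2.2)."""

    def __init__(self, tau_m=0.02, u_rest=0.0, threshold=1.0):
        self.tau_m = tau_m          # membrane time constant (s)
        self.u_rest = u_rest        # rest potential
        self.threshold = threshold  # firing threshold
        self.u = u_rest             # membrane potential at time t_last
        self.t_last = 0.0           # time of the last update

    def receive_spike(self, t, weight):
        """Advance the state exactly to time t, then apply the input jump.

        Returns True if the neuron fires as a result of this spike.
        """
        # Exact analytical decay between the two events (equation 2.2).
        dt = t - self.t_last
        self.u = self.u_rest + (self.u - self.u_rest) * math.exp(-dt / self.tau_m)
        self.t_last = t
        # Dirac input: instantaneous jump of the membrane potential.
        self.u += weight
        if self.u >= self.threshold:
            self.u = self.u_rest  # reset after the emitted spike
            return True
        return False
```

Because the decay is evaluated analytically, the neuron is touched only when a spike arrives, regardless of how much simulated time has passed in between.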

However, when event-driven simulators are scaled to larger networks, one of the issues that has to be tackled is the event queue, which must deliver, each time an event is added or processed, the outstanding event with the smallest time stamp. Usually event-driven simulators implement a data structure that keeps the event list sorted and allows efficient insertion and deletion. An often used data structure satisfying these conditions is the heap queue (Sedgewick, 1992), which requires O(log(E)) sorting time when an event is inserted or deleted, with E the number of elements in the queue. For large networks with a lot of activity, the queue can be very large, generating an unacceptable overhead for the simulation. As a result, these simple event-driven simulators obtain high efficiency only when the queuing overhead is small in comparison to the event processing time, which is the case only for relatively small networks.

2.1.1 Event Queueing Strategies.

In large networks, each neuron is connected to a large number of destination neurons (more than 1000). In a naive queueing approach, each time a neuron fires, spikes are inserted in the event queue for each of the destination neurons. For an event-driven simulation of a neural network with N neurons, the average number of events in the queue is then given by

E = p · F · N · l,   (2.3)

with p the average number of outgoing connections per neuron, F the average firing rate of each neuron, and l the average life span of each event. The latter is simply the average delay of the interconnections.

A large improvement was proposed in Mattia and Del Giudice (2000) by introducing a restriction on the number of possible interconnection delays in the network. For each possible delay d_i, a separate event list is created. Since events are processed in their correct time order and all interconnection delays within each list are the same, each list can be implemented as a simple first-in, first-out (FIFO) list, which has O(1) sorting time. The event-processing unit now only has to determine the smallest event across the top elements of the lists. For a network with D different interconnection delays, the maximum size of the event queue is thus limited to E = D, independent of the number of events in the system.

For many simulations, restricting the number of possible interconnection delays is not acceptable, and thus other methods have to be used. A significant improvement to the naive approach, implemented in the simulator MVASpike from Rochel (2004), takes advantage of the fact that only the event with the smallest time stamp needs to be known by the system. In this approach, when the network is created, a local list is stored for each neuron containing its outgoing connections sorted by their interconnection delay.

When a neuron emits a spike, only a single event is inserted in the global event queue, corresponding to the destination neuron with the smallest interconnection delay. When this event is processed, it is replaced by the event for the next destination neuron in the local list. This is repeated until all outgoing interconnections are processed. Since the local sorted list is computed at creation time, it introduces no computational overhead during the simulation. The average number of events in the event queue is now reduced to

E = F · N · l_max.   (2.4)

Here, l_max denotes the average maximum life span of an event, that is, the longest interconnection delay. In the case of uniformly distributed interconnection delays, l_max = 2l.
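
The following sketch is our illustration of this mechanism (the function and field names are ours): a single queue entry per emitted spike stands in for all of a neuron's outgoing spikes by walking through the delay-sorted local list.

```python
import heapq

# Each neuron stores its outgoing connections sorted by delay at build time:
# connections[src] = [(delay_1, dst_1), (delay_2, dst_2), ...] with
# delay_1 <= delay_2 <= ...

def emit_spike(queue, connections, src, t_fire):
    """Insert one queue entry for the earliest delivery of this spike."""
    if connections[src]:
        delay, dst = connections[src][0]
        # (delivery time, source neuron, index into the source's local list)
        heapq.heappush(queue, (t_fire + delay, src, 0))

def process_next(queue, connections):
    """Pop the earliest delivery and re-insert the next one from the local list."""
    t, src, idx = heapq.heappop(queue)
    delay, dst = connections[src][idx]
    if idx + 1 < len(connections[src]):
        next_delay, _ = connections[src][idx + 1]
        t_fire = t - delay  # recover the emission time of the spike
        heapq.heappush(queue, (t_fire + next_delay, src, idx + 1))
    return t, dst  # deliver the spike to dst at time t
```

The global queue thus holds at most one pending entry per recently fired neuron, giving the reduction expressed in equation 2.4.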

The already reduced queueing time can be further reduced using alternative sorting methods. Marian, Reilly, and Mackey (2002) proposed a method based on time windows. New events are simply dropped (unsorted) in a pool. At regular time intervals, all events in the pool smaller than a certain time window T_window are selected and sorted (using, e.g., quicksort). Since only a small number n ≪ E of events has to be sorted, this significantly reduces the sorting time. Nevertheless, a disadvantage of this pool-based technique is the need to cycle regularly through the complete list of events in order to select the next set of events between t and t + T_window.

A refinement on this method is to divide the single pool into days, each having the size of the window T_window. Newly generated events are directly placed into the correct day, and selecting the next time window just means switching to the next day. This sorting mechanism is called the unsorted bucket calendar queue (UCQ). It offers O(1) insertion time but O(N_sublist) dequeueing time. A more popular approach is the sorted bucket calendar queue (SCQ), which stores events immediately in order within each day. It thus has O(N_sublist) insertion time and O(1) dequeue time, but since sorting starts from N_sublist = 1, it requires on average a lower number of operations. This type of calendar queue was described by Brown (1988). In Brown's algorithm, the size of the days also adapts at runtime to the number of elements that has to be sorted. The best performance is obtained when the day size is chosen such that each day contains on average only three to four elements. Experimental data showed that the SCQ has true O(1) insertion and dequeueing time for events, independent of the number of events in the queue, even when the queue becomes very large (Brown, 1988).
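
A minimal sorted bucket calendar queue in the spirit of Brown (1988) is sketched below; the sketch is ours and omits his runtime adaptation of the day size, which the full algorithm needs for its O(1) behavior.

```python
import bisect

class SortedCalendarQueue:
    """Sorted bucket calendar queue with a fixed day size (resizing omitted).

    Assumes causal use: events are only inserted at or after the day
    currently being drained, and payloads are comparable (to break ties).
    """

    def __init__(self, day_size):
        self.day_size = day_size
        self.days = {}        # day index -> sorted list of (time, payload)
        self.current_day = 0  # day currently being drained

    def insert(self, time, payload):
        # Place the event directly into its day, keeping the day sorted.
        day = int(time // self.day_size)
        bisect.insort(self.days.setdefault(day, []), (time, payload))

    def pop(self):
        # Advance through (mostly empty) days until one contains an event.
        while not self.days.get(self.current_day):
            if not any(self.days.values()):
                raise IndexError("pop from empty queue")
            self.current_day += 1
        return self.days[self.current_day].pop(0)
```

With three to four events per day, both the insertion sort and the front removal touch only a handful of elements, which is what makes the average cost effectively constant.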

One possible issue with this calendar queue algorithm is that the size-based resizing trigger is not very efficient, and resizing can be computationally costly, especially when the number of events in the queue frequently shows large fluctuations. The reason is that all events need to be reinserted each time the day size is changed. Also, the queueing process is less efficient if the event distribution is heavily skewed, so that some days contain many more events than other days. An algorithm that copes with these problems is the ladder queue, presented in Tang, Goh, and Thng (2005). It differs from the SCQ in the sense that events are sorted only when absolutely necessary, resulting in O(1) insertion time, and resizing operations are triggered more efficiently while the resizing itself affects only a part of the event queue, significantly reducing the cost of resizing. As a result, it outperforms an SCQ in cases with a heavily skewed event distribution and when the number of events in the queue shows large fluctuations.

An important remark on these day-based scheduling mechanisms is that they introduce time-step-based aspects in the event-driven environment. Indeed, each day can be interpreted as a global time step. The moment of switching from one day to the next is determined by the size of each day, which does not directly depend on the events in the system.

2.1.2 Firing Time Prediction Events.

From the above, it is clear that event-driven simulation can combine high efficiency with high accuracy, even in large-scale neural networks. However, we used simple neuron models with the assumption that firing occurs only instantaneously, that is, the firing threshold is crossed at the arrival of an incoming spike. This is the case for the above model, as outgoing spikes are modeled as Dirac pulses and cause abrupt changes in the membrane potential of the destination neurons. In more realistic neural (network) models, the shape of a spike changes while it travels along the dendrite and especially when it passes the synaptic cleft. As a result, a spike arriving at a destination neuron is no longer a Dirac-like pulse but rather a continuous function. Instead of causing an instantaneous jump in the membrane potential upon its arrival, a spike now has a rise time until it reaches a maximum. The addition of synaptic dynamics to the network model not only improves the biological realism; it was also shown in Maass (1996) and Maass and Markram (2002) that synaptic models significantly increase the computational expressive power of neurons.

As a simple example, we consider the same LIF neuron model but add synaptic dynamics modeled as an exponentially decaying function. Between two spikes, the output of synapse i evolves according to the following differential equation:

τ_{s,i} dI_i(t)/dt = −I_i(t),   (2.5)

where τ_{s,i} represents the time constant of input connection i. The arrival of a new spike causes an instantaneous jump in the synaptic current I_i(t). The LIF neuron model with exponentially decaying synaptic currents (described by equations 2.1 and 2.5) is a linear differential system and has an analytical solution in intervals with no spikes, governed by

u(t) = u_rest + (u(0) − u_rest) e^(−t/τ_m) + R Σ_i I_i(0) [τ_{s,i}/(τ_{s,i} − τ_m)] (e^(−t/τ_{s,i}) − e^(−t/τ_m)).   (2.6)
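
For concreteness, equation 2.6 can be evaluated directly; the sketch below (ours) computes u(t) from the state captured at the last update time:

```python
import math

def membrane_potential(t, u0, u_rest, R, tau_m, synapses):
    """Evaluate equation 2.6 at time t after the last update.

    synapses is a list of (I0, tau_s) pairs: the synaptic current and time
    constant of each input connection at the last update (t = 0).
    Assumes tau_s != tau_m for every synapse.
    """
    u = u_rest + (u0 - u_rest) * math.exp(-t / tau_m)
    for I0, tau_s in synapses:
        u += R * I0 * (tau_s / (tau_s - tau_m)) * (
            math.exp(-t / tau_s) - math.exp(-t / tau_m))
    return u
```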

An important property of this system is the delayed firing that now occurs: the firing threshold of a neuron will be reached some time after the arrival of input spikes. This property has important consequences for the event-driven simulation of these models. First, the firing events need to be executed in their correct causal order and thus need to be sorted in time. These firing events have different properties from the incoming spike events: their number is fixed (one per neuron), and they can change in time (i.e., a new incoming spike can change the predicted firing time). Therefore, they are often stored in a separate event queue. This double-queue mechanism was described in Lee and Farhat (2001). Furthermore, the sorting overhead of this queue can be minimized by applying the optimizations discussed above.

The second issue is more problematic. Since an event-driven simulator evaluates a neuron only on an event, such as when reaching the firing threshold, it has to know beforehand when to schedule this event. This requires the membrane equation to be solved for t, given the threshold voltage as an input. In the case of the neuron model described in equation 2.6, an analytical solution does not exist. Until recently, there was no good solution for this firing time prediction issue. Instead, special neuron models have been introduced for event-driven simulations that have an analytical solution for the firing time or use a customized solution. For example, a solution for a LIF neuron model with synaptic conductances using only a single synaptic time constant was presented in Brette (2006), for a quadratic LIF model in Tonnelier, Belmabrouk, and Martinez (2007), and for a LIF model with multiple synaptic time constants in Brette et al. (2007). Rudolph and Destexhe (2006) introduced the gIF model, which mimics most of the properties of the LIF model with synaptic conductances and also has an analytical solution. Carrillo, Ros, Ortigosa, Barbour, and Agís (2005) and Ros, Carrillo, Ortigosa, Barbour, and Agís (2006) introduced an event-driven scheme based completely on lookup tables for both the evaluation of the state variables and the firing time prediction. Arbitrary neuron models can be used with this scheme, but the number of state variables determines the dimension of the lookup table and should be kept as small as possible in order to prevent an excessively large lookup table, which leads to poor cache behavior. Moreover, each neuron in the network with different neuron parameters requires a separate lookup table, which impedes heterogeneous network simulations.

D’Haene, Schrauwen, Van Campenhout, and Stroobandt (2009) introduced a novel solution to the firing time prediction problem. An adapted Newton-Raphson (NR) scheme allows iteratively finding the exact firing time for general LIF neuron models with an arbitrary number of exponentially decaying synaptic currents, significantly improving the simulation efficiency. D’Haene and Schrauwen (2010) showed that this efficient event-driven methodology can be extended to a wider range of neuron models: any neuron model that can be described as a sum of exponentials. A broad range of complex neuron models can be described or approximated by such a linear system. Some examples are the double exponential synaptic current model used in Carnevale and Hines (2002) and its generalization described in van Elburg and van Ooyen (2009), the LIF neuron model with voltage-dependent threshold dynamics from Mihalaş and Niebur (2009), and the recently introduced gIF4 model (Rudolph-Lilith, Dubois, & Destexhe, 2012).
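
As an illustration of threshold-crossing search on equation 2.6, the sketch below uses a plain Newton-Raphson iteration (ours; the published adapted scheme adds safeguards and laziness that we omit here, so this is not the authors' algorithm, only the core idea):

```python
import math

def predicted_firing_time(theta, u0, u_rest, R, tau_m, synapses,
                          t0=1e-4, tol=1e-9, max_iter=50):
    """Newton-Raphson search for u(t) = theta under equation 2.6.

    Returns a firing time t > 0, or None if the iteration does not
    converge to a crossing (e.g., the threshold is never reached).
    Assumes tau_s != tau_m for every (I0, tau_s) pair in synapses.
    """
    def u_and_du(t):
        u = u_rest + (u0 - u_rest) * math.exp(-t / tau_m)
        du = -(u0 - u_rest) / tau_m * math.exp(-t / tau_m)
        for I0, tau_s in synapses:
            a = R * I0 * tau_s / (tau_s - tau_m)
            u += a * (math.exp(-t / tau_s) - math.exp(-t / tau_m))
            du += a * (-math.exp(-t / tau_s) / tau_s
                       + math.exp(-t / tau_m) / tau_m)
        return u, du

    t = t0
    for _ in range(max_iter):
        u, du = u_and_du(t)
        if abs(u - theta) < tol:
            return t
        if du == 0.0:
            return None
        t_new = t - (u - theta) / du  # standard Newton-Raphson step
        if t_new <= 0.0:
            return None  # no crossing ahead of the current time
        t = t_new
    return None
```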

2.1.3 Removing the Requirement for an Analytical Solution for Event-Driven Simulations.

So far, we have shown that event-driven simulation can be applied to large-scale networks and a wide range of neuron models. Moreover, when the neuron model can be described as a sum of exponentials, a methodology can be used that allows efficiently simulating models with a high number of state variables. Such models can be used to mimic the behavior of complex low-order nonlinear models.

Despite the wide applicability of linear models, their expressive power is often not adequate for many physiological modeling studies. For example, the simulation of neuron states with intense activity similar to cortical activity in vivo, such as high-conductance states with fast synaptically driven transient changes in the membrane conductance (Destexhe, Rudolph, & Paré, 2003), requires a neuron model that includes conductance-based synaptic interactions with the membrane. Typically the system of differential equations describing such a neuron model does not have an analytical solution for the membrane potential. In order to obtain the value of the membrane potential, a numerical solver can be used, which takes fixed or adaptive time steps, where the size of the steps determines the accuracy of the solution. At first sight, such a scheme does not seem to fit into the idea of an event-driven simulator, where a membrane potential is updated immediately from time t_{i−1} to t_i upon the arrival of a new event. Instead, an analytical solution seems required, which allows performing this state update in an exact way using a single function evaluation.

However, there is no reason that a numerical solver cannot be used in an event-driven simulation environment. Assume, for example, a neuron that was last updated at time t_0 and whose next event is a firing event, occurring at time t_f. When the simulator reaches time t_f, the neuron needs to be updated from t_0 to t_f. A numerical solver with fixed time step size D can perform this update using a number of consecutive steps t_0, t_0 + D, t_0 + 2D, . . . , t_0 + iD until t_f − (t_0 + iD) < D. The last step spans a smaller step size, such that the membrane is updated to exactly t_f. The step size D determines the accuracy of the membrane solution.

A cleaner way to implement this idea in an event-driven environment is to perform these time steps automatically as the simulation time evolves. Using the previous example, instead of scheduling t_f as the next update event in the queue, we schedule t_0 + D in the queue. Thus, the neuron is updated as soon as the simulator reaches simulation time t_0 + D. An important advantage of this method is that the simulator does not need to know the exact firing time of the neuron at time t_0. Instead, it only needs to check whether t_f > t_0 + D, which can be done using a quick and rough approximation of the firing time. This is exactly what the adaptive NR method described in D’Haene et al. (2009) does in order to minimize the computational overhead of a firing time prediction: minimizing the number of computations that may be rendered obsolete by future input activity.
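
The stepping-to-an-event idea can be sketched as follows (our illustration; `derivative` stands for the neuron model's right-hand side, and forward Euler is used for brevity where a Runge-Kutta scheme would normally be preferred):

```python
def advance_to(state, t0, t_target, D, derivative):
    """Advance a neuron state from t0 to t_target with fixed internal steps.

    Takes steps of size D and one final, smaller step so that the state
    lands exactly on t_target (e.g., the time of an incoming spike or a
    provisionally scheduled firing event).
    """
    t = t0
    while t_target - t > D:
        state = [s + D * ds for s, ds in zip(state, derivative(t, state))]
        t += D
    remaining = t_target - t  # final partial step, 0 <= remaining <= D
    state = [s + remaining * ds for s, ds in zip(state, derivative(t, state))]
    return state
```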

An interesting alternative for the time-step-driven simulation of nonlinear neuron models has been presented in Zheng, Tonnelier, and Martinez (2009), where instead of discretizing time, the membrane potential is subdivided into different regions (which is known as voltage stepping). Within each range of the membrane potential, the nonlinear behavior of the true model is locally approximated by a linear model. As long as the membrane potential remains between two voltage steps, it has an analytical solution, and the transitions between different voltage steps can be incorporated into a fully event-driven simulation strategy. This strategy has proven to be more accurate and faster than time-step-driven simulations for moderately sized networks (100 neurons) (Kaabi, Tonnelier, & Martinez, 2011). Further research is warranted to find out how well this strategy scales to even larger networks.

2.2 Advances in Time-Step-Based Simulation Strategies.

Event-driven simulation is not the only way to improve the simulation efficiency and accuracy of SNNs compared to the first time-step-driven simulators. The realization that event-driven simulation might not be able to handle complex neuron models, and the initial difficulties with event queue handling and firing time prediction, have also motivated research on better time-step-based simulation methods. In a simple time-grid-constrained approach, accuracy requires the use of small time steps, while simulation speed requires large time steps. The main issue is thus to improve the accuracy of the simulation while allowing the use of larger time steps.

A first improvement was proposed in Hansel, Mato, Meunier, and Neltner (1998) and Shelley and Tao (2001), with the goal of improving the accuracy of the approximation without requiring smaller time steps. Instead of aligning firing time events on the simulation time grid, a linear interpolation of the membrane potential is used between two consecutive time steps in order to better approximate the exact moment when the membrane crosses the threshold. The reset of the neuron dynamics can now be performed at this more accurately estimated “off-grid” firing time, effectively improving the simulation accuracy or allowing the use of larger time steps (a sketch of this interpolation appears after the list below). Recently it was shown that exact spike times can also be incorporated in a time-step-driven simulation (Hanuschkin, Kunkel, Helias, Morrison, & Diesmann, 2010).

A situation that becomes more apparent when using larger time steps is that multiple spikes may arrive at each neuron within a single time step. In biologically realistic circumstances, the average total rate of spikes arriving at a neuron model may exceed 10 kHz (e.g., 10,000 inputs, each with an average activity of 1 Hz) (Morrison, Straube, Plesser, & Diesmann, 2007); thus time steps larger than 0.1 ms will contain on average more than one spike per time step. This is a problem in a time-grid-constrained approach, because incoming spikes are aligned to the time grid, and the order of arrival of spikes within the same time grid cannot be determined. This introduces two types of errors:

- Spikes arriving between two time steps are assumed to arrive at the start of the time step, which introduces time-shift errors. These become worse as the time steps become larger.

- If excitatory and inhibitory spikes occur within the same time grid, the order of arrival of the spikes can have a large impact on the spike probability. For example, a neuron can miss a threshold crossing if an excitatory spike happens after an inhibitory spike. This is shown in Figure 1a. A solution for this situation could be to prioritize excitatory spikes. This, however, leads to the opposite error in the case that an inhibitory spike occurs after an excitatory spike, as shown in Figure 1b. Without keeping track of the order of appearance of spikes, it is thus not possible to prevent such errors.

These errors can be important in, for example, studies of coincidence detection by neurons (Softky, 1994). Some learning algorithms such as STDP also need information on which spike arrived last at the neuron before it reached the threshold. Since the timing of spikes within the same time grid is lost, this cannot be determined and may cause quick divergence of the simulation results compared to the exact solution (Brette et al., 2007). The errors imposed on STDP by a fixed time grid were confirmed in Henker, Partzsch, and Schüffny (2012), who showed that variable time step solutions provide a great increase in accuracy over fixed time step simulations.
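
As promised above, a minimal sketch (ours) of the off-grid firing time estimate used by Hansel et al. (1998) and Shelley and Tao (2001): the membrane potential is linearly interpolated between two grid points to locate the threshold crossing.

```python
def offgrid_firing_time(t_prev, u_prev, t_next, u_next, theta):
    """Estimate the threshold crossing time by linear interpolation.

    u_prev is below the threshold theta at grid time t_prev, and u_next is
    at or above it at t_next; returns the interpolated crossing time.
    """
    assert u_prev < theta <= u_next
    frac = (theta - u_prev) / (u_next - u_prev)  # fraction of the step, in (0, 1]
    return t_prev + frac * (t_next - t_prev)
```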

In order to eliminate these errors, the simulator should keep track of the order of arrival of the spikes and apply them at their exact times of arrival. Morrison et al. (2007) presented such a method for linear neuron models, such as LIF models with multiple synaptic time constants.

Figure 1: Spike errors introduced by time-grid-constrained simulations. (a) Elimination of a spike event caused by the simultaneous occurrence of spikes 3 and 4 in the time-grid approach. (b) Creation of a nonexistent spike by giving priority to excitatory spikes.

Two implementations were presented: the canonical implementation and the prescient implementation. The prescient implementation solves only the first type of error, by changing the influence of an incoming spike such that it has the same impact on the membrane state at the beginning of the update interval as it would have had if applied at the exact time. The canonical implementation also solves the second type of error by keeping track of the order of arrival of the spikes using a local event queue per neuron for the spikes arriving within the imposed time interval. Within each time step, the neuron's state variables are now updated from spike to spike and, finally, from the last spike to the end of the time step. After each update, the state of the membrane can be checked for a threshold crossing.
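
A sketch of the canonical idea (ours, not the published implementation; `advance` stands for the model's exact or numerical subthreshold update over an interval, and `apply_spike` for the synaptic jump):

```python
def canonical_step(neuron, t_start, t_end, spikes, advance, apply_spike):
    """Update one neuron over [t_start, t_end], applying spikes in order.

    spikes is a list of (arrival_time, weight) with arrival times inside
    the interval; they are sorted so the order of arrival is respected.
    Returns the times at which the threshold was crossed.
    """
    fired = []
    t = t_start
    for t_spike, weight in sorted(spikes):
        advance(neuron, t, t_spike)        # subthreshold update to the spike
        apply_spike(neuron, weight)        # apply the input at its exact time
        if neuron.u >= neuron.threshold:   # check for a crossing after each update
            fired.append(t_spike)
            neuron.u = neuron.u_rest
        t = t_spike
    advance(neuron, t, t_end)              # finish the interval
    if neuron.u >= neuron.threshold:
        fired.append(t_end)
        neuron.u = neuron.u_rest
    return fired
```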

A similar approach to the canonical implementation was presented in Rangan and Cai (2007) for a LIF neuron with synaptic conductances. It describes the same idea of sorted lists with updates from spike to spike within each time step, but since an analytical solution does not exist, an approximation method based on Gaussian quadrature is used to estimate the dynamics of the membrane potential. Note that the state variable updates are now event driven and no longer time-step based. The size of the global time step in these “exact” methods is now limited only by the following considerations:

- In periods of low spiking activity, the only membrane updates are those triggered by the global time-step generator. The firing time is estimated using an interpolation method between two consecutive state updates. Since no iterative scheme such as NR is applied, the approximation becomes less accurate as the interpolation interval increases.

- A firing time between two updates is detected by checking the membrane potential at the second update. Clearly, a firing event has happened whenever the potential is higher than the threshold at this second update. Conversely, if the membrane potential is lower than the threshold at this second update, it is assumed that it stayed below the firing threshold during the complete interval between the two updates. Unfortunately, this assumption is prone to missing spikes. Since the probability that a spike is missed due to this assumption increases with the size of the update interval, the time-step sizes should be kept sufficiently small.

- The larger the time step, the larger the number of spikes that will occur within each time step, causing an increase in the size of the local event queue and thus in the sorting time for this queue.

If an exact firing time detection scheme were used (e.g., based on an iterative NR scheme), exact results could be reached independent of the time step size and without missing spikes. In this case, simulation accuracy depends only on the accuracy of the state variable updates of the neuron model. Furthermore, if an analytical solution exists (which is the case for the LIF neuron model used in Morrison et al., 2007), the global time-step-based updates even become completely unnecessary. But then the sorting of the events would become a major problem for the efficiency of the simulation. Note the similarity with the day-based queueing algorithms of the previous section. The only difference from the event-driven approach can be found in the behavior on reaching the next global time step or day. In the event-driven approach, this is just a trigger to switch to the next pool of events, while in the time-step-based approach, the global time steps are also used to perform a synchronous update of all neuron models in the network. The latter can be omitted if a better firing time prediction mechanism is used.

2.2.1 Asynchronous Time-Step-Based Simulation.

A last optimization for time-step-based approaches starts from the observation that network activity in many biologically inspired simulations is not constant.

In periods of low network activity, larger time steps can be used than in periods of high network activity. A simulator that takes advantage of this is the global variable time-step integrator presented in Hines and Carnevale (2001). Nevertheless, the advantage of this approach is limited when applied to large neural networks, since the maximum time-step size depends on the fastest-changing state variable in the complete network.

To overcome the poor performance of the global variable time-step method in these circumstances, each neuron in the system should ideally be updated at its own optimal rate. Such a simulator, lvardt, was presented in Lytton and Hines (2005). Instead of triggering network updates synchronously for every neuron in the network, each neuron now has its own independent variable time-step integrator. An update of a neuron is triggered by incoming spikes or by the time to the next update determined by the adaptive integration method, whichever comes first. Since there is no global trigger for neuron updates, and each neuron in the system is instead updated independently of the other neurons, the simulator has to determine which neuron has to be updated next. This is done by an integration coordinator, which in fact is an event queue. Once again, despite the algorithm being presented as an optimization of the time-step-based approach, it is in fact a basic event-driven simulation method with the inclusion of a variable step-size integration method, allowing the simulation of arbitrary neuron models.

3 Hybrid Simulation Method

If we compare the latest advancements in both event-driven methods and time-step-based methods, we can only conclude that the difference between the two has largely vanished. Whether a method is called event driven or time-step based is mainly determined by the background of what the researchers used as a starting point for their simulator. The fact is that these methods contain elements of both. Therefore, we believe it is more appropriate to call them hybrid methods.

Based on the optimizations discussed in this letter, we can define the key properties of a highly efficient and generally applicable hybrid method (a sketch of a simulation loop with these properties follows the list):

- The use of global time steps becomes merely a tool to divide the simulation into small simulation intervals in order to facilitate quick ordering of the spikes in the system. In contrast to simple grid-constrained simulators, they no longer serve as the trigger to update the state variables, because the optimal update interval may vary for each neuron and may also vary in time. The size of the global steps should be chosen according to the spiking activity in the network in order to minimize the sorting overhead of the spikes.

- State variables that do not depend on other state variables can be updated independently of the other state variables. This is, for example, the case for the synaptic state variables in the LIF neuron model. The large speed-up reached for linear models in D’Haene et al. (2009) was based on this principle: the arrival of a spike at one synaptic input does not require updating all synapse models of this neuron, while exact simulation results can still be guaranteed.

- Firing time predictions are prone to change on the arrival of new events at the neuron model before the predicted firing time is reached. Therefore, an efficient method should perform as few computations as possible for a prediction when it is likely that new events will arrive before the firing event. A generally applicable method that has this property is the safe Newton-Raphson method presented in D’Haene et al. (2009).

- State variables that can be evaluated analytically can be updated in a purely event-driven fashion, that is, from event to event. State variables that have no analytical solution can be updated using a local time-stepping scheme. An efficient scheme is, for example, the fourth-order Runge-Kutta scheme (Press, Teukolsky, Vetterling, & Flannery, 2009). Additional updates may be required if a variable's value has to be known at a certain time stamp. This is, for example, the case when independent synapse state variables have not been updated for some time but a membrane state update is triggered, requiring their value at that time.
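
A minimal sketch (entirely ours, with hypothetical interfaces) of a simulation loop with these properties: spikes are bucketed into global intervals for cheap ordering, neuron state is updated lazily per neuron, and firing predictions remain provisional until no earlier input can invalidate them.

```python
import heapq

def run_hybrid(neurons, spike_buckets, t_end, interval):
    """Toy hybrid loop: global intervals order spikes; updates stay per neuron.

    spike_buckets maps an interval index to a time-sorted list of
    (t, neuron_id, weight) spike deliveries. Each neuron object provides
    advance_to(t) (lazy state update; at a firing time it also emits the
    spike and resets), apply_spike(w), and predict_firing(), a provisional
    estimate of the next firing time, or None.
    """
    t = 0.0
    while t < t_end:
        horizon = t + interval
        pending = []  # provisional firing events within this interval

        def push_prediction(nid):
            tf = neurons[nid].predict_firing()
            if tf is not None and tf <= horizon:
                heapq.heappush(pending, (tf, nid))

        def flush_firings(until):
            # Fire neurons whose predictions are still valid and due.
            while pending and pending[0][0] <= until:
                tf, nid = heapq.heappop(pending)
                if neurons[nid].predict_firing() == tf:  # skip stale entries
                    neurons[nid].advance_to(tf)          # fires and resets
                    push_prediction(nid)

        for nid in neurons:
            push_prediction(nid)
        for t_spk, nid, w in spike_buckets.get(int(t // interval), []):
            flush_firings(t_spk)                 # keep causal order
            neurons[nid].advance_to(t_spk)       # lazy: only this neuron
            neurons[nid].apply_spike(w)
            push_prediction(nid)                 # input may change prediction
        flush_firings(horizon)
        t = horizon
```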

Many researchers are already accustomed to a particular simulator, which they have probably extended with their own neuron models, hacks, interfaces, and so on. Therefore, it is not our goal to present an implementation of yet another simulator. Instead, available simulation tools can be improved using the guidelines we have noted. Some modern simulation tools already incorporate some of these ideas and can easily be extended to a fully hybrid simulation tool.

4 Parallelization

With the increasing importance of cluster-based computations and GPU acceleration, we cannot introduce a new simulation scheme without discussing parallelization possibilities. In this section, we take a brief look at the possibilities and challenges of computing clusters and GPUs for accelerating the simulation of SNNs based on this hybrid scheme.

4.1 Cluster-Based Parallel Simulation.

A neural network simulation can be accelerated using multiple general-purpose computers in parallel, where each computer contains only a part of the SNN. If the SNN contains N neurons and the computer cluster consists of M identical computing nodes, the SNN can be divided equally over the nodes, each containing N/M neurons. Since the number of neuron updates per node is reduced proportionally to the number of computing nodes, this in principle allows a linear acceleration with the number of nodes.

Today's computer clusters often contain many thousands of computing nodes, which can give rise to a significant speed-up of large-scale neural network simulations.

There is, however, a pitfall in this linear speed-up assumption. The neurons in the network have (possibly many) connections to other neurons, which are not necessarily located on the same node. Therefore, each node in the cluster (which we will call a computing node) needs to communicate spikes to the other nodes. In order to guarantee that these spikes arrive in time at each computing node, a synchronization mechanism is required. An important issue here is the network latency of the interconnect infrastructure. Typically this latency is larger than the duration of the neural network update computations between two synchronization phases, causing an important drop in the simulation speed-up.

4.1.1 Parallel Time-Step-Based Simulation.

A traditional time-step-based simulation typically executes in two phases: a neuron update phase and a communication phase. After each update phase, spikes that were generated in the current time step are communicated to the destination neurons. In order for this mechanism to work, the minimum interconnection delay between neurons has to be at least the length of the time step. Otherwise, newly generated spikes could arrive at destination neurons within the same time step.

When small time steps are used in biologically inspired SNNs, the minimum interconnection delay is often much larger than the time-step size. If the minimum delay between any two neurons in the network is d_min > L · h, with h the time-step size and L an integer, a spike emitted by a neuron can reach the destination neurons only after L simulation time steps. In a computing cluster, it is therefore sufficient to communicate these spikes at intervals of L time steps, which significantly reduces the overhead caused by the network latency. This mechanism was described in Morrison, Mehring, Geisel, Aertsen, and Diesmann (2005) and also used in the simulator PCSIM (Pecevski, Natschläger, & Schuch, 2009). When L is sufficiently large and the number of neurons on each computer is sufficiently large (such that updating all neurons accounts for the largest part of the computational cost), supralinear speed-up was even reported (Morrison et al., 2005; Pecevski et al., 2009). This is due to the smaller number of neurons located on each node when the network is distributed over multiple computers, resulting in more efficient cache use.
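
A sketch of the L-step barrier scheme (ours, using mpi4py for illustration; `simulate_steps` and `queue_remote_spikes` are hypothetical placeholders for the local network's bookkeeping):

```python
from mpi4py import MPI

def run_distributed(local_net, n_steps, L):
    """Barrier-synchronized simulation: communicate spikes every L steps.

    Correct as long as the minimum delay between neurons on different
    nodes spans at least L time steps, so spikes exchanged after a block
    of L steps still arrive before they are due.
    """
    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    for block_start in range(0, n_steps, L):
        # Update all local neurons for L time steps without communication.
        outgoing = local_net.simulate_steps(block_start, L)
        # Exchange spike lists with all other nodes (the barrier phase).
        all_spikes = comm.allgather(outgoing)
        for src, spikes in enumerate(all_spikes):
            if src != rank:  # our own spikes are already queued locally
                local_net.queue_remote_spikes(spikes)
```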

The discussed method is a simple barrier synchronization method, where the cluster is synchronized at a time interval equal to the minimum interconnection delay. Note that it is not the minimum interconnection delay between any two neurons that matters, but only the minimum interconnection delay between any two neurons located on different computing nodes. As such, cluster performance can be optimized by dividing the network into subnetworks such that neurons with small interconnection delays are located on the same computing node. The problem of dividing the network in an optimal way is a classical graph partitioning problem, which is NP-complete (Shmoys, 1995). Several heuristic methods that work well in practice have been developed; an example is metis.1

4.1.2 Parallel Hybrid Simulation Techniques.

In an event-driven or hybrid environment, a similar mechanism can be used. In this case, each node processes events within the time window [t, t + d_min], followed by a synchronization phase in which all nodes communicate the spikes that have occurred during this execution window. However, this simple method has some additional disadvantages when used in an event-driven or hybrid environment:

- In contrast to the time-step-based approach, neurons are now updated only when necessary. As such, nodes with little neuronal activity will execute much faster than nodes with a lot of neuronal activity. At the end of the time window, the faster nodes need to wait for the slowest node before proceeding to the next time window.

- We already explained that the communication phase is pure overhead for the simulation, since no neuron update operations are performed during this phase. Since neuron update operations are performed more efficiently than in the time-step-based approach, this overhead has a larger impact on the overall efficiency.

- Since all communication is performed in bursts during the communication phase, the network is not used during the neuron update phase. This results in poor utilization of the available bandwidth.

Instead of a barrier synchronization mechanism, a wide range of asynchronous mechanisms exists that are better suited to the asynchronous nature of event-driven simulation. These mechanisms do not require a global communication phase during which no update operations can be performed, and they overcome the problems noted above.

A simple-to-implement asynchronous simulation mechanism is conservative synchronization (Chandy & Misra, 1979; Fujimoto, 1990). Each computing node updates its neuron models independently of the other nodes, but only when it is guaranteed safe to do so, that is, when no events can arrive at the node with a time stamp smaller than the local simulation time. When this condition is not fulfilled, the node blocks and waits until the other nodes advance in time. A key concept for conservative algorithms is the look-ahead: if a node is at simulation time T and guarantees that it will transmit no event with a time stamp smaller than T + L, the node is said to have a look-ahead of L. In the case of SNNs, this can be the minimum interconnection delay between neurons, similar to the time window in the case of a barrier synchronization method.
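
A sketch of the conservative advance rule (ours; the node and queue interfaces are hypothetical):

```python
def safe_horizon(guarantees):
    """Smallest time stamp any sender may still emit.

    guarantees maps each sending node to its latest promise (the time
    stamp of its last event or announcement); events below this horizon
    can be processed without risking a causality error.
    """
    return min(guarantees.values())

def try_advance(node, guarantees):
    """Process pending events only while it is provably safe."""
    horizon = safe_horizon(guarantees)
    while node.pending and node.pending[0].time <= horizon:
        node.process(node.pending.pop(0))
    # When the head of the queue exceeds the horizon, the node must block
    # until some sender raises its guarantee (e.g., by a later message).
```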

1 http://glaros.dtc.umn.edu/gkhome/metis/metis/overview.

One of the main issues with conservative synchronization is keeping up-to-date information on this minimum time stamp of future events at each node. An easy solution was proposed in Chandy and Misra (1979): besides the spike events, each node sends null messages at regular time intervals. These messages do not contain events but only provide the receiving nodes with information on the smallest time stamp of future events. The interval at which the null messages are generated should be smaller than the look-ahead of the node. This approach has, for example, been implemented in the distributed event-driven SNN simulator presented in Lobb, Chao, Fujimoto, and Potter (2005), and a variant of it in Mouraud, Puzenat, and Paugam-Moisy (2006). A disadvantage of null messages is that they can make up most of the messages being sent between nodes and thus may cause a significant communication overhead, especially when the look-ahead is relatively small. Many variants of this conservative scheme have been introduced that minimize the number of null messages in the cluster. One example is the receiver-based approach, in which null messages are transmitted only when the receiver asks for them, that is, when the receiver is getting blocked (Baker, Mahmood, & Carlson, 1996; Fujimoto, 1990).

Instead of maintaining a strictly causal execution order of events, there also exist optimistic synchronization mechanisms that allow causality errors to occur. In this mechanism, each processing node simulates as fast as possible, but when it detects a causality error (an event arrives at the node with a time stamp smaller than the local simulation time), some mechanism is provided to repair this error. These methods can better exploit the available parallelism because they allow parallel execution of events that may possibly influence each other. Lipton and Mizell (1990) showed that they can arbitrarily outperform conservative synchronization mechanisms. The best-known form of optimistic synchronization is time warp (Jefferson & Sowizral, 1985). When a causality error is detected, the simulator rolls back in time. However, if events have since been transmitted that are, due to this error, no longer correct, antimessages have to be transmitted. These will in turn trigger roll-backs in the remote nodes. As such, a roll-back cycle can propagate through the simulation system, and this may become computationally very expensive.

A causality error does not necessarily imply that the simulation results are no longer correct. For example, in the case of SNN simulations, assume that a node receives a spike with a time stamp smaller than the current simulation time. If this spike is received by a neuron that has not been updated between the erroneous time stamp and the current simulation time, it can simply be processed.

Since neurons have a low probability of generating output, this will likely be the case. But if this spike causes the neuron to reach the spiking threshold before the current simulation time, a spike is emitted locally, causing a new causality error.

Several optimizations of this optimistic scheme have been proposed. Most of the techniques restrict the amount of optimism in the system, minimizing the probability of causality errors or minimizing the cost of a roll-back, for example, by using optimistic execution only locally, combined with a more conservative approach for messages transmitted between nodes (Reynolds, 1988).

Optimistic synchronization mechanisms have yet to be investigated in the context of parallel simulation of large SNNs. Nevertheless, there are indications that they may be very well suited for this task (a sketch of such a straggler-handling policy follows the list):

- The occurrence of a causality error does not necessarily mean that time has to be rolled back. This is the case if the affected neuron was last updated before the erroneous spike.

- If the affected neuron was last updated after the time stamp of the erroneous spike, the simulator can decide to simply apply the spike at the current time stamp. This causes a small time-shift error, but the influence of this error on the network dynamics may be negligible. This kind of error actually occurs all the time in time-step-based simulators. By limiting the optimism of each computing node, this error can be kept within an acceptable range.

- For linear neuron models, the time-shift error can be compensated by adjusting the weight of the incoming spike. This technique was also applied in Morrison et al. (2007) in order to align spikes on the time grid of a time-step-based simulator. In this case, the dynamics of the neuron model remain exact, except if the affected neuron has reached the threshold between the time stamp of the erroneous spike and the current simulation time. However, the probability that this situation occurs is low, given the typically low firing rate of neurons in large SNNs.
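
A sketch of this policy (ours; `weight_correction` is a hypothetical model-specific factor along the lines of the compensation used in Morrison et al., 2007):

```python
def handle_straggler(neuron, spike_time, weight, local_time, weight_correction):
    """Handle a spike that arrives with a time stamp in the local past."""
    if neuron.t_last <= spike_time:
        # The neuron has not advanced past the spike: no causality violation,
        # so the event can simply be processed in order.
        neuron.advance_to(spike_time)
        neuron.apply_spike(weight)
    else:
        # The neuron is already ahead: apply the spike now, with a
        # model-specific correction for the time shift (exact for linear
        # models that did not cross the threshold in between).
        shift = local_time - spike_time
        neuron.apply_spike(weight * weight_correction(neuron, shift))
```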

It is currently still an open question which of the existing parallel event-driven mechanisms is best suited to large SNNs and how much it can improve their simulation.

4.2 Graphical Processing Units.

A recently very popular architecture for the simulation of SNNs is the GPU, a specialized processor designed to render 2D and 3D graphics, which requires very memory-intensive operations. Thanks to this specialized design, computer graphics can be manipulated efficiently, and the parallel structure of a GPU allows doing this much faster than on a general-purpose processor. Current GPUs offer gigaFLOPS performance, more than tenfold the processing speed of state-of-the-art CPUs. Moreover, GPU performance is increasing at a higher rate than CPU performance.

Regarding power consumption, a GPU can deliver performance equivalent to that of a general-purpose processor at much lower power consumption (Huang, Xiao, & Feng, 2009). The cost per FLOP is also significantly lower on a GPU, typically one-tenth that of a general-purpose processor.2

Due to the inherent parallelism of neural networks and the large memory bandwidth typically required by efficient simulation methods, GPUs seem to be an interesting platform for fast neural network implementations. Nevertheless, their limited flexibility makes it difficult to design algorithms that make optimal use of the hardware. The main limitations of GPUs are the following:

- A GPU is mainly a single instruction, multiple data (SIMD) processor. It has a very high data throughput if the same instruction can be applied to multiple data sets at the same time.

- The memory model is restricted. The cache sizes are an order of magnitude smaller than those of a CPU (Govindaraju, Larsen, Gray, & Manocha, 2006). Moreover, high computing performance is possible only if the data are preloaded into the caches; performance is much worse when data have to be read from global memory.

As a result, most implementations currently reported are special neural network designs with a highly regular network structure that are well suited for implementation on a GPU (e.g., Dolan & DeSouza, 2009; Nasse, Thurau, & Fink, 2009; Raina, Madhavan, & Ng, 2009). A GPU implementation has been reported for Brian, a time-step-based simulator written in Python (Goodman & Brette, 2008). In order to circumvent the difficulties of converting conventional programs to software able to run on a GPU, programming frameworks are also being developed that provide an abstraction layer for special network topologies (Mutch, Knoblich, & Poggio, 2010).

Current implementations of SNNs on GPUs have in common that they use a time-step-driven approach (Nageswaran et al., 2009; Richert, Nageswaran, Dutt, & Krichmar, 2011). In this approach, neurons are updated one by one during the update phase. Since the order of neuron updates is fixed, the caching behavior can easily be optimized. In an event-driven or hybrid approach, computations are more optimized, and neurons are updated only when required. As a result, there is no fixed order in which neurons get updated, making it more challenging to fully exploit the SIMD architecture with good caching behavior. In related research fields such as event-based logic gate network simulation, it has already been shown that GPUs can significantly accelerate event-driven simulation (Chatterjee, Deorio, & Bertacco, 2011; Wang, Zhu, & Deng, 2010), but further investigation is needed for SNN-specific simulations.
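
To illustrate why the time-step-driven approach maps so naturally onto SIMD hardware, the following NumPy sketch (ours) updates a whole LIF population in lockstep; on a GPU the same pattern becomes one thread per neuron with identical instructions.

```python
import numpy as np

def lif_step(u, I, dt, tau_m, u_rest, R, theta):
    """One synchronous time step for a whole LIF population (vectorized).

    u and I are arrays with one entry per neuron; every neuron executes
    the same instructions on its own data, which is the SIMD-friendly
    access pattern exploited by time-step-driven GPU simulators.
    """
    u += dt / tau_m * (-(u - u_rest) + R * I)  # equation 2.1, forward Euler
    fired = u >= theta
    u[fired] = u_rest                          # reset the neurons that fired
    return fired
```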

2 www.nvidia.com.

Similar challenges will likely appear when optimizing event-based simulation for multicore systems. If the cache memory cannot be used efficiently due to the random order in which events are processed, this may become an important bottleneck, meaning that the parallel processing power cannot be used efficiently. Solutions for these kinds of problems warrant further study.

5 Conclusion

Both time-step-based and event-driven simulators for spiking neural networks have evolved considerably compared to the original basic simulation ideas. The first event-driven methods were very promising, showing large speed-ups for some well-chosen neuron models. However, they were hampered by limited support for neuron models and a scaling problem when applied to larger networks. Introducing the concept of days solved the scaling problem. Local integration schemes, using time-stepped and voltage-stepped concepts, solved the problem of limited neuron model support. Conversely, although the first time-step-based simulators could handle arbitrary neuron models, they suffered from low simulation efficiency or accuracy. This was improved by introducing updates between the time steps and taking care of the correct order of spike arrivals within each time step; the latter is accomplished by introducing sorted event lists. The result of this process is that both methods have integrated ideas from the competing method, ending up with two methods that have basically converged to a similar solution.

In this letter, we have defined the key properties of a hybrid method uniting both strategies. Based on these properties, existing simulators can be further improved. Finally, we discussed parallelization issues for these hybrid simulation methods on both computing clusters and GPUs. Barrier synchronization methods are commonly used in time-step-driven methods and can also be adopted for hybrid methods. Nevertheless, in order to fully exploit the inherent parallelism of neural networks, conservative and optimistic synchronization methods are likely better suited; this is a field that has yet to be investigated.

References

Abbott, L., & Nelson, S. (2000). Synaptic plasticity: Taming the beast. Nature Neuroscience, 3, 1178–1183.
Baker, W., Mahmood, A., & Carlson, B. (1996). Parallel event-driven logic simulation algorithms: Tutorial and comparative evaluation. In IEE Proceedings: Circuits, Devices and Systems (pp. 177–185). IEE.
Brette, R. (2006). Exact simulation of integrate-and-fire models with synaptic conductances. Neural Computation, 18(8), 2004–2027.

References

Abbott, L., & Nelson, S. (2000). Synaptic plasticity: Taming the beast. Nature Neuroscience, 3, 1178–1183.
Baker, W., Mahmood, A., & Carlson, B. (1996). Parallel event-driven logic simulation algorithms: Tutorial and comparative evaluation. In IEE Proc.-Circuits Devices Syst. (pp. 177–185). IEE.
Brette, R. (2006). Exact simulation of integrate-and-fire models with synaptic conductances. Neural Computation, 18(8), 2004–2027.
Brette, R., Rudolph, M., Carnevale, T., Hines, M., Beeman, D., Bower, J. M., ... Harris, F., Jr. (2007). Simulation of networks of spiking neurons: A review of tools and strategies. Journal of Computational Neuroscience, 23, 349–398.
Brown, R. (1988). Calendar queues: A fast O(1) priority queue implementation for the simulation event set problem. Communications of the ACM, 31, 1220–1227.
Carnevale, N., & Hines, M. (2002). Efficient discrete event simulation of spiking neurons in NEURON. SfN abstract 312.10. 2002 SfN Meeting, Yale University, New Haven, CT.
Carrillo, R., Ros, E., Ortigosa, E., Barbour, B., & Agís, R. (2005). Lookup table powered neural event-driven simulator. In Proceedings of the 8th International Work-Conference on Artificial Neural Networks (pp. 168–175). New York: Springer.
Cessac, B., & Viéville, T. (2008). On dynamics of integrate-and-fire neural networks with conductance based synapses. Frontiers in Computational Neuroscience, 2(2), 1–20.
Chandy, K., & Misra, J. (1979). Distributed simulation: A case study in design and verification of distributed programs. IEEE Transactions on Software Engineering, 5(5), 440–452.
Chatterjee, D., Deorio, A., & Bertacco, V. (2011). Gate-level simulation with GPU computing. ACM Transactions on Design Automation of Electronic Systems, 16, 30:1–26.
Delorme, A. (2001). SpikeNET: A simulator for modeling large networks of integrate and fire neurons. Neurocomputing, 38, 539–545.
de Ruyter van Steveninck, R., & Laughlin, S. (1996). The rate of information transfer at graded-potential synapses. Nature, 379, 642–645.
Destexhe, A., Rudolph, M., & Paré, D. (2003). The high-conductance state of neocortical neurons in vivo. Nature Reviews Neuroscience, 4, 739–751.
D'Haene, M., & Schrauwen, B. (2010). Fast and exact simulation methods applied on a broad range of neuron models. Neural Computation, 22(6), 1468–1472.
D'Haene, M., Schrauwen, B., Van Campenhout, J., & Stroobandt, D. (2009). Accelerating event-driven simulation of spiking neurons with multiple synaptic time constants. Neural Computation, 21(4), 1068–1099.
Dolan, R., & DeSouza, G. (2009). GPU-based simulation of cellular neural networks for image processing. In Proceedings of the 2009 International Joint Conference on Neural Networks. Piscataway, NJ: IEEE Press.
Fujimoto, R. (1990). Parallel discrete event simulation. Communications of the ACM, 33(10), 30–53.
Goodman, D., & Brette, R. (2008). Brian: A simulator for spiking neural networks in Python. Frontiers in Neuroinformatics, 2(5).
Govindaraju, N., Larsen, S., Gray, J., & Manocha, D. (2006). A memory model for scientific algorithms on graphics processors. In Proceedings of the 2006 ACM/IEEE Conference on Supercomputing. Piscataway, NJ: IEEE Press.
Hansel, D., Mato, G., Meunier, C., & Neltner, L. (1998). On numerical simulations of integrate-and-fire neural networks. Neural Computation, 10, 467–483.
Hanuschkin, A., Kunkel, S., Helias, M., Morrison, A., & Diesmann, M. (2010). A general and efficient method for incorporating precise spike times in globally time-driven simulations. Frontiers in Neuroinformatics, 4.

Henker, S., Partzsch, J., & Schüffny, R. (2012). Accuracy evaluation of numerical methods used in state-of-the-art simulators for spiking neural networks. Journal of Computational Neuroscience, 32(2), 309–326.
Hines, M., & Carnevale, N. (2001). NEURON: A tool for neuroscientists. Neuroscientist, 7, 123–135.
Huang, S., Xiao, S., & Feng, W. (2009). On the energy efficiency of graphics processing units for scientific computing. In Proceedings of the 2009 IEEE International Symposium on Parallel & Distributed Processing (pp. 1–8). Washington, DC: IEEE Computer Society.
Jefferson, D., & Sowizral, H. (1985). Fast concurrent simulation using the time warp mechanism. In SCS Conf. Distributed Simulation (pp. 63–69). Santa Monica, CA: RAND.
Kaabi, M. G., Tonnelier, A., & Martinez, D. (2011). On the performance of voltage stepping for the simulation of adaptive, nonlinear integrate-and-fire neuronal networks. Neural Computation, 23(5), 1187–1204.
Kirst, C., & Timme, M. (2009). How precise is the timing of action potentials? Frontiers in Neuroscience, 3(1), 2–3.
Lee, G., & Farhat, N. (2001). The double queue method: A numerical method for integrate-and-fire neuron networks. Neural Networks, 14, 921–932.
Lipton, R., & Mizell, D. (1990). Time warp vs. Chandy-Misra: A worst-case comparison. In D. Nicol (Ed.), Distributed simulation (pp. 137–143). Marina del Rey, CA: Information Sciences Institute.
Lobb, C., Chao, Z., Fujimoto, R., & Potter, S. (2005). Parallel event-driven neural network simulations using the Hodgkin-Huxley neuron model. In Proceedings of the 19th Workshop on Principles of Advanced and Distributed Simulation (pp. 16–25). Piscataway, NJ: IEEE Press.
Lytton, W., & Hines, M. (2005). Independent variable time-step integration of individual neurons for network simulations. Neural Computation, 17(4), 903–921.
Maass, W. (1996). Lower bounds for the computational power of networks of spiking neurons. Neural Computation, 8(1), 1–40.
Maass, W., & Markram, H. (2002). Synapses as dynamic memory buffers. Neural Networks, 15, 155–161.
Mainen, Z., & Sejnowski, T. (1995). Reliability of spike timing in neocortical neurons. Science, 268(5216), 1503–1506.
Marian, I., Reilly, R., & Mackey, D. (2002). Efficient event-driven simulation of spiking neural networks. In Proceedings of the 3rd WSEAS International Conference on Neural Networks and Applications. Cambridge, MA: MIT Press.
Markram, H. (2006). The Blue Brain Project. Nature Reviews Neuroscience, 7, 153–160.
Mattia, M., & Del Giudice, P. (2000). Efficient event-driven simulation of large networks of spiking neurons and dynamical synapses. Neural Computation, 12, 2305–2329.
Mihalaş, Ş., & Niebur, E. (2009). A generalized linear integrate-and-fire neural model produces diverse spiking behaviors. Neural Computation, 21, 704–718.
Morrison, A., Mehring, C., Geisel, T., Aertsen, A., & Diesmann, M. (2005). Advancing the boundaries of high-connectivity network simulation with distributed computing. Neural Computation, 17(8), 1776–1801.

Morrison, A., Straube, S., Plesser, H., & Diesmann, M. (2007). Exact subthreshold integration with continuous spike times in discrete-time neural network simulations. Neural Computation, 19(1), 47–79.
Mouraud, A., Puzenat, D., & Paugam-Moisy, H. (2006). DAMNED: A distributed and multithreaded neural event-driven simulation framework. In IASTED-PDCN, International Conference on Parallel and Distributed Computing and Networks (pp. 212–217). Calgary: ACTA Press.
Mutch, J., Knoblich, U., & Poggio, T. (2010). CNS: A GPU-based framework for simulating cortically-organized networks (Tech. Rep.). Cambridge, MA: MIT.
Nageswaran, J. M., Felch, A., Chandrasekhar, A., Dutt, N., Granger, R., Nicolau, A., & Veidenbaum, A. V. (2009). Brain derived vision algorithm on high performance architectures. International Journal of Parallel Programming, 37(4), 345–369.
Nasse, F., Thurau, C., & Fink, G. (2009). Face detection using GPU-based convolutional neural networks. In Lecture Notes in Computer Science (pp. 83–90). Berlin: Springer.
Pecevski, D., Natschläger, T., & Schuch, K. (2009). PCSIM: A parallel simulation environment for neural circuits fully integrated with Python. Frontiers in Neuroinformatics, 3.
Press, W., Teukolsky, S., Vetterling, W., & Flannery, B. (2009). Integration of ordinary differential equations. In L. Cowles & A. Harvey (Eds.), Numerical recipes in C (pp. 899–955). Cambridge: Cambridge University Press.
Raina, R., Madhavan, A., & Ng, A. (2009). Large-scale deep unsupervised learning using graphics processors. In Proceedings of the 26th Annual International Conference on Machine Learning (pp. 873–880). New York: ACM.
Rangan, A., & Cai, D. (2007). Fast numerical methods for simulating large-scale integrate-and-fire neuronal networks. Journal of Computational Neuroscience, 22, 81–100.
Reich, D., Mechler, F., & Victor, J. (2001). Formal and attribute-specific information in primary visual cortex. Journal of Neurophysiology, 85, 305–318.
Reynolds, P. (1988). A spectrum of options for parallel simulation. In Proceedings of the 1988 Winter Simulation Conference.
Richert, M., Nageswaran, J. M., Dutt, N., & Krichmar, J. L. (2011). An efficient simulation environment for modeling large-scale cortical processing. Frontiers in Neuroinformatics, 5.
Rochel, O. (2004). Une approche événementielle pour la modélisation et la simulation de réseaux de neurones impulsionnels [An event-driven approach to the modeling and simulation of spiking neural networks]. Doctoral dissertation, Université Henri Poincaré, Nancy 1.
Ros, E., Carrillo, R., Ortigosa, E., Barbour, B., & Agís, R. (2006). Event-driven simulation scheme for spiking neural networks using lookup tables to characterize neuronal dynamics. Neural Computation, 18(12), 2959–2993.
Rudolph, M., & Destexhe, A. (2006). Analytical integrate-and-fire neuron models with conductance-based dynamics for event-driven simulation strategies. Neural Computation, 18(9), 2146–2210.
Rudolph-Lilith, M., Dubois, M., & Destexhe, A. (2012). Analytical integrate-and-fire neuron models with conductance-based dynamics and realistic postsynaptic potential time course for event-driven simulation strategies. Neural Computation, 24(6), 1426–1461.
Sedgewick, R. (1992). Algorithms in C++. Reading, MA: Addison-Wesley.

Shelley, M., & Tao, L. (2001). Efficient and accurate time-stepping schemes for integrate-and-fire neuronal networks. Journal of Computational Neuroscience, 11, 111–119.
Shmoys, D. B. (1995). Cut problems and their application to divide-and-conquer. In D. S. Hochbaum (Ed.), Approximation algorithms for NP-hard problems. Boston: PWS Publishing.
Softky, W. (1994). Sub-millisecond coincidence detection in active dendritic trees. Neuroscience, 58, 13–41.
Softky, W. R. (1995). Simple codes versus efficient codes. Current Opinion in Neurobiology, 5, 239–247.
Tang, W., Goh, R., & Thng, I. (2005). Ladder queue: An O(1) priority queue structure for large-scale discrete event simulation. ACM Transactions on Modeling and Computer Simulation, 15(3), 175–204.
Tonnelier, A., Belmabrouk, H., & Martinez, D. (2007). Event-driven simulations of nonlinear integrate-and-fire neurons. Neural Computation, 19(12), 3226–3238.
van Elburg, R. A. J., & van Ooyen, A. (2009). Generalization of the event-based Carnevale-Hines integration scheme for integrate-and-fire models. Neural Computation, 21, 1913–1930.
Wang, B., Zhu, Y., & Deng, Y. (2010). Distributed time, conservative parallel logic simulation on GPUs. In Proceedings of the 47th Design Automation Conference (pp. 761–766). New York: ACM.
Watts, L. (1993). Event-driven simulation of networks of spiking neurons. In Sixth Neural Information Processing Systems Conference (pp. 927–934). San Francisco: Morgan Kaufmann.
Zheng, G., Tonnelier, A., & Martinez, D. (2009). Voltage-stepping schemes for the simulation of spiking neural networks. Journal of Computational Neuroscience, 26(3), 409–423.

Received February 13, 2012; accepted September 16, 2013.
