
Properties and Performance of Imperfect Dual Neural Network-Based kWTA Networks

Ruibin Feng, Chi-Sing Leung, John Sum, and Yi Xiao

Abstract—The dual neural network (DNN)-based k-winner-take-all (kWTA) model is an effective approach for finding the k largest inputs from n inputs. Its major assumption is that the threshold logic units (TLUs) can be implemented in a perfect way. However, when differential bipolar pairs are used for implementing TLUs, the transfer function of TLUs is a logistic function. This brief studies the properties of the DNN-kWTA model under this imperfect situation. We prove that, given any initial state, the network settles down at the unique equilibrium point. Besides, the energy function of the model is revealed. Based on the energy function, we propose an efficient method to study the model performance when the inputs are with continuous distribution functions. Furthermore, for uniformly distributed inputs, we derive a formula to estimate the probability that the model produces the correct outputs. Finally, for the case that the minimum separation Δmin of the inputs is given, we prove that if the gain of the activation function is greater than (1/(4Δmin)) max(ln 2n, 2 ln((1 − ε)/ε)), then the network can produce the correct outputs with winner outputs greater than 1 − ε and loser outputs less than ε, where ε is the threshold less than 0.5.

Index Terms—Convergence, dual neural network (DNN), logistic function, winner-take-all (WTA).

Manuscript received October 24, 2013; revised July 24, 2014 and September 1, 2014; accepted September 8, 2014. Date of publication November 3, 2014; date of current version August 17, 2015. This work was supported by the Research Grants Council, Government of Hong Kong, Hong Kong, under Grant CityU 115612. R. Feng and C.-S. Leung are with the Department of Electronic Engineering, City University of Hong Kong, Hong Kong (e-mail: [email protected]; [email protected]). J. Sum is with the Institute of Technology Management, National Chung Hsing University, Taichung 40227, Taiwan (e-mail: [email protected]). Y. Xiao was with the Department of Electronic Engineering, City University of Hong Kong, Hong Kong. He is now with the College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China (e-mail: [email protected]). Digital Object Identifier 10.1109/TNNLS.2014.2358851

I. INTRODUCTION

The winner-take-all (WTA) process [1] finds the largest input from n inputs {u_1, ..., u_n}. It plays an important role in unsupervised learning neural networks [1]–[8]. The exact convergence time of the conventional WTA was reported in [9]. Other WTA models were recently proposed in [10]–[13]. For instance, a star-structured model based on the l_p-norm was proposed in [10]. The generalized version of the WTA process is kWTA, which finds the k largest inputs from n inputs. The kWTA model is widely used in many applications, such as order-statistics filtering and sorting [14], [15].

In [16], a Hopfield-like model with n^2 connections was proposed. For this model, given n distinct inputs and a sufficiently large gain G_hop of the activation function [16, Th. 4], the model can produce the correct outputs. However, the result in [16] does not tell us how large the gain should be. In [17], the properties of another model with n^2 connections were reported. Let Δmin be the minimum separation of the inputs. If G_hop > ((n − 1)/Δmin) ln(4/Δmin) and some other conditions are satisfied, then the network can produce the correct outputs. The classical kWTA model consists of n nodes and n^2 connections [18]. Hu and Wang [19], Wang and Guo [20], and Wang [21] proposed a simple kWTA structure based on the concept of a dual neural network (DNN).

In this DNN-kWTA model, there are only n + 1 nodes and 2n connections. In [22], the convergence time of the DNN-kWTA model was analyzed. Another kWTA model with similar circuit complexity was proposed in [23] and [24].

The major assumption of the existing results on DNN-kWTA is that we can implement threshold logic units (TLUs) in a perfect manner. However, in hardware implementation, comparators or amplifiers [25] have a hyperbolic-tangent transfer function when differential bipolar pairs are used. That means that the actual transfer function of a TLU is a logistic function. In this brief, we call the DNN-kWTA model with the logistic activation function the logistic DNN-kWTA model. To the best of our knowledge, there are not many results about this imperfect DNN-kWTA model.

This brief first shows that the state of the logistic DNN-kWTA model converges to the unique equilibrium point. The energy function of this model is also reported. Based on the energy function, we study the performance of the model under two cases. In the first case, we assume that the inputs are with continuous distribution functions. We propose a simple method to check whether the model produces the correct outputs or not. With the method, we can efficiently study the probability that the model produces the correct outputs. Hence, we can know how the gain (or, saying, the logistic parameter) affects the performance. Furthermore, for uniformly distributed inputs, we derive a formula to estimate the probability that the logistic DNN-kWTA model produces the correct outputs. In the second case, we assume that the inputs are with a minimum separation Δmin. We show that if the gain G_dnn of the activation function is greater than (1/(4Δmin)) ln 2n, then the DNN-kWTA model can produce the correct outputs. Furthermore, if G_dnn is greater than (1/(2Δmin)) ln((1 − ε)/ε), then winner outputs are greater than 1 − ε and loser outputs are less than ε, where ε is the threshold less than 0.5.

The remainder of this brief is organized as follows. Section II presents the background. Section III analyzes the stability and convergence behavior of the model. Section IV studies the theoretical performance of the model under the two cases. Section V presents experimental results to verify our theoretical results. This brief is then concluded in Section VI.

II. BACKGROUND

A DNN-kWTA network, shown in Fig. 1(a), has n input nodes, one hidden node, and n output nodes. The inputs and outputs are denoted as {u_1, ..., u_n} and {x_1, ..., x_n}, respectively. Without loss of generality, we assume that the values of the u_i's are all distinct and are bounded by 0 and 1. The network dynamics is given by

    state equation:  τ dy/dt = Σ_{i=1}^{n} x_i − k    (1)
    output equation: x_i = g(u_i − y)    (2)

for i = 1, 2, ..., n, where τ is the characteristic time. Without loss of generality, we set τ = 1. In (2), g(s) is a TLU, given by

    g(s) = { 1, if s ≥ 0;  0, otherwise. }    (3)
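As a concrete illustration of (1)–(3), the following short Python sketch (ours, not part of the original brief) integrates the ideal DNN-kWTA dynamics with a simple forward-Euler scheme; the step size, number of steps, and example inputs are illustrative assumptions.

import numpy as np

def ideal_dnn_kwta(u, k, y0=0.0, dt=1e-3, steps=20000):
    """Forward-Euler integration of dy/dt = sum_i g(u_i - y) - k with the TLU g of (3)."""
    u = np.asarray(u, dtype=float)
    y = y0
    for _ in range(steps):
        x = (u - y >= 0.0).astype(float)   # output equation (2) with the TLU (3)
        y += dt * (x.sum() - k)            # state equation (1), with tau = 1
    return y, (u - y >= 0.0).astype(float)

if __name__ == "__main__":
    u = [0.5, 0.7, 0.8, 0.4, 0.1, 0.3]     # illustrative inputs
    y_star, x = ideal_dnn_kwta(u, k=2)
    print(y_star)                          # settles between the 3rd and 2nd largest inputs
    print(x)                               # exactly k = 2 outputs equal 1 (the two largest inputs)

With these inputs and k = 2, the returned output vector marks u_2 = 0.7 and u_3 = 0.8 as the winners, in line with the equilibrium behavior described below.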



Fig. 1. (a) Structure of a DNN-kWTA network. (b) Dynamics of the logistic DNN-kWTA network. The inputs are {u_1 = 0.5, u_2 = 0.7, u_3 = 0.8, u_4 = 0.4, u_5 = 0.1, u_6 = 0.3}, k = 2, τ = 1, and α = 50. At t = 5, y is around 0.5982 and the outputs are equal to {x_1 = 0.01, x_2 = 0.99, x_3 = 1.00, x_4 = 4.96 × 10^−5, x_5 = 1.52 × 10^−11, x_6 = 3.34 × 10^−7}.

Let {u_{π_1}, ..., u_{π_n}} be the inputs sorted in ascending order, where {π_1, ..., π_n} is the sorted index list. Furthermore, let {x_{π_1}, ..., x_{π_n}} be the corresponding outputs. Wang [21] showed that a DNN-kWTA network converges to an equilibrium point at which only the k outputs {x_{π_{n−k+1}}, ..., x_{π_n}} are equal to 1. The other n − k outputs are equal to 0.

In (3), the activation function is a TLU. However, in implementation we cannot achieve an ideal comparator (a step function), because with differential bipolar pair-based comparators or amplifiers [25] the activation function is a logistic function. We call the DNN-kWTA model with the logistic activation function the logistic DNN-kWTA model. The dynamics of this model is given by

    state equation:  dy/dt = Σ_{i=1}^{n} x_i − k    (4)
    output equation: x_i = g̃(u_i − y)    (5)

where

    g̃(s) = 1 / (1 + e^{−αs})    (6)

is the activation function and s = u_i − y is the input to the ith neuron. In (6), α is called the logistic parameter. We define the gain of the activation function as G_dnn = dg̃(s)/ds|_{s=0} = α/4 [17]. Since the activation function of the model is a logistic function, the outputs cannot be exactly equal to 0 or 1. Hence, we need to use a threshold (0.5) to determine the winners and losers. Fig. 1(b) shows an illustrative example, in which n = 6, k = 2, and α = 50. The inputs are equal to {0.5, 0.7, 0.8, 0.4, 0.1, 0.3}. With y(0) = 0, y(t) converges to a point around 0.5982. At this point, the outputs are equal to {x_1 = 0.01, x_2 = 0.99, x_3 = 1.00, x_4 = 4.96 × 10^−5, x_5 = 1.52 × 10^−11, x_6 = 3.34 × 10^−7}. That means the winners are the second and third nodes. Instead of using 0.5 as the threshold, we can consider 1 − ε as the winner threshold and ε as the loser threshold. In Section IV-B, we will investigate how the value of ε affects the required values of α and G_dnn.
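The behavior in Fig. 1(b) can be reproduced numerically. The sketch below is ours (the integration step and duration are illustrative choices); it integrates the logistic dynamics (4)–(6) with forward Euler for the inputs of Fig. 1(b), k = 2 and α = 50.

import numpy as np

def logistic_act(s, alpha):
    return 1.0 / (1.0 + np.exp(-alpha * s))            # activation function (6)

def logistic_dnn_kwta(u, k, alpha, y0=0.0, dt=1e-3, t_end=5.0):
    u = np.asarray(u, dtype=float)
    y = y0
    for _ in range(int(t_end / dt)):
        x = logistic_act(u - y, alpha)                  # output equation (5)
        y += dt * (x.sum() - k)                         # state equation (4), tau = 1
    return y, logistic_act(u - y, alpha)

if __name__ == "__main__":
    u = [0.5, 0.7, 0.8, 0.4, 0.1, 0.3]                  # inputs of Fig. 1(b)
    y_star, x = logistic_dnn_kwta(u, k=2, alpha=50)
    print(round(y_star, 4))                             # approximately 0.5982
    print(np.round(x, 4))                               # only the outputs of u_2 and u_3 exceed the 0.5 threshold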

III. EQUILIBRIUM POINT, STABILITY, AND ENERGY FUNCTION

This section presents the stability of the logistic DNN-kWTA model. The first result, stated in Theorem 1, is that there exists a unique equilibrium point y*.

Theorem 1: For a logistic DNN-kWTA network, there exists a unique equilibrium point y*.

Proof: Let

    F(y) = dy/dt = Σ_{i=1}^{n} 1/(1 + e^{−α(u_i − y)}) − k.    (7)

To prove the theorem, we need to show that there exists a unique y* such that F(y*) = 0. First, we show that there exists at least one finite y* (y* ≠ ±∞) such that F(y*) = 0. According to the definition of F(y), lim_{y→−∞} F(y) = n − k is a positive value and lim_{y→+∞} F(y) = −k is a negative value. In addition, F(y) is a continuous and smooth function. That means there exists at least one y* such that F(y*) = 0. Note that lim_{y→±∞} F(y) ≠ 0. The next step is to show that y* is unique. Clearly,

    dF/dy = − Σ_{i=1}^{n} α e^{−α(u_i − y)} / (1 + e^{−α(u_i − y)})^2 < 0.    (8)

That means F(y) is a strictly monotonically decreasing function (except as y → ±∞, where dF/dy tends to 0). This implies that y* must be unique (according to the property of strictly monotonic functions). It should be noted that y → +∞ and y → −∞ cannot be equilibrium points because lim_{y→±∞} F(y) ≠ 0. The proof is completed. ∎

Theorem 1 only tells us that there exists a unique equilibrium point. It does not tell us whether the network converges to this unique equilibrium point or not. Theorem 2 tells us that (4) leads to the unique equilibrium point.

Theorem 2: Given any initial state y(0), (4) leads to the unique equilibrium point y*.

Proof: Define a scalar function V(y), given by

    V(y) = −∫ (dy/dt) dy = −∫ ( Σ_{i=1}^{n} 1/(1 + e^{−α(u_i − y)}) − k ) dy
         = (k − n)y + (1/α) Σ_{i=1}^{n} ln(1 + e^{−α(u_i − y)})    (9)

where α ≠ 0. We are going to show that V(y) is a Lyapunov function of (4). First, we show that V(y) is radially unbounded. From the definition of V(y),

    lim_{y→−∞} V(y) = lim_{y→−∞} (k − n)y = +∞.    (10)

Since ln(·) is an increasing function,

    V(y) > (k − n)y + α^{−1} Σ_{i=1}^{n} ln e^{−α(u_i − y)} = ky − Σ_{i=1}^{n} u_i.    (11)

That means

    lim_{y→+∞} V(y) > lim_{y→+∞} ( ky − Σ_{i=1}^{n} u_i ) = +∞.    (12)

From (10) and (12), V(y) is radially unbounded. Now, we show that V(y) is lower bounded. Since V(y) > (k − n)y and k < n, we have V(y) > 0 for y < 0. However, if y ≥ 0, then it follows from (11) that V(y) > ky − Σ_{i=1}^{n} u_i ≥ −Σ_{i=1}^{n} u_i. Therefore, we obtain

    V(y) > { −Σ_{i=1}^{n} u_i, if y ≥ 0;  0, otherwise. }    (13)

That means V(y) is lower bounded. Consider that

    dV/dt = (dV/dy)(dy/dt) = −(dy/dt)(dy/dt) = −(dy/dt)^2 ≤ 0.    (14)

That means dV/dt < 0 for all y ≠ y* and dV/dt = 0 for y = y* (Theorem 1). Now, we can conclude that V(y) is a Lyapunov function for (4). According to the Lyapunov stability theory, the network asymptotically converges to one of the equilibrium points. From Theorem 1, there exists only one equilibrium point y*. That means the network asymptotically converges to y*. The proof is completed. ∎

Remark: Theorems 1 and 2 also hold when the activation function has the following properties: lim_{s→+∞} g̃(s) = 1, lim_{s→−∞} g̃(s) = 0, and dg̃(s)/ds > 0.
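Because F(y) in (7) is continuous and strictly decreasing (Theorem 1), the unique equilibrium y* can also be located by simple bisection instead of integrating the dynamics. The sketch below is ours; the bracketing interval and tolerance are illustrative assumptions that rely on the inputs being bounded by 0 and 1.

import numpy as np

def F(y, u, k, alpha):
    return np.sum(1.0 / (1.0 + np.exp(-alpha * (u - y)))) - k   # F(y) in (7)

def equilibrium(u, k, alpha, lo=-1.0, hi=2.0, tol=1e-12):
    """Bisection on the strictly decreasing F: F(lo) > 0 and F(hi) < 0 for inputs in [0, 1]."""
    u = np.asarray(u, dtype=float)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F(mid, u, k, alpha) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    u = [0.5, 0.7, 0.8, 0.4, 0.1, 0.3]               # inputs of Fig. 1(b)
    print(round(equilibrium(u, k=2, alpha=50), 4))   # approximately 0.5982, matching the simulation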


Fig. 2. Dynamics of the network. (a) When α = 10, y(t) converges to 0.6295 and the network does not produce the correct outputs. (b) When α = 100, y(t) converges to 0.5750 and the network produces the correct outputs.
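The two cases in Fig. 2 can be reproduced with the same kind of forward-Euler integration sketched in Section II; the fragment below is ours, and the step size and duration are illustrative.

import numpy as np

def run(u, k, alpha, dt=1e-3, t_end=10.0):
    """Integrate the logistic DNN-kWTA dynamics (4)-(6) and count outputs above the 0.5 threshold."""
    u = np.asarray(u, dtype=float)
    y = 0.0
    for _ in range(int(t_end / dt)):
        x = 1.0 / (1.0 + np.exp(-alpha * (u - y)))
        y += dt * (x.sum() - k)
    winners = int(np.sum(1.0 / (1.0 + np.exp(-alpha * (u - y))) >= 0.5))
    return y, winners

if __name__ == "__main__":
    u = [0.40, 0.42, 0.50, 0.55, 0.60, 0.80]        # inputs of Fig. 2
    for alpha in (10, 100):
        y_star, winners = run(u, k=2, alpha=alpha)
        print(alpha, round(y_star, 4), winners)     # about 0.6295 with 1 winner; about 0.5750 with 2 winners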

In the logistic DNN-kWTA model, the logistic activation function is used. Therefore, at the equilibrium point y*, the k largest outputs {x_{π_n}, ..., x_{π_{n−k+1}}} are not equal to 1 and the other outputs {x_{π_{n−k}}, ..., x_{π_1}} are not equal to 0. Instead, all outputs are between 0 and 1. That means we need to use a threshold (0.5) to determine the winners. We use the following definition to state whether a logistic DNN-kWTA network produces the correct outputs or not.

Definition 1: Given a set of inputs {u_1, ..., u_n}, the network produces the correct outputs if {x_{π_n}, ..., x_{π_{n−k+1}}} are greater than or equal to 0.5 and {x_{π_{n−k}}, ..., x_{π_1}} are less than 0.5. In other words, the network produces the correct outputs if

    u_{π_{n−k}} < y* ≤ u_{π_{n−k+1}}.    (15)

In some situations, a logistic DNN-kWTA network may not produce the correct outputs. Fig. 2 shows an example, where the inputs are {0.40, 0.42, 0.50, 0.55, 0.60, 0.8} and k = 2. If the logistic parameter is equal to 10, y* is between (0.60, 0.65) and there is only one winner. When the logistic parameter is equal to 100, y* is between (0.55, 0.60) and there are two winners. That means the network works properly when α = 100. Hence, it is important for us to know how the value of α affects the performance of the model (the chance of producing the correct outputs). In the rest of this brief, we propose a simple method to check whether a network works properly or not.¹

The condition for producing correct outputs is given in Definition 1 and (15). However, it is difficult to verify (15) in practice due to the lack of an analytical expression for y*. Actually, this difficulty can be overcome by the properties of the energy function. From Theorems 1 and 2, V(y) is a continuous and smooth convex function that contains only one minimum point. Hence, we can use the values of dV/dy|_{y=u_{π_{n−k}}} and dV/dy|_{y=u_{π_{n−k+1}}} to check the condition for producing correct outputs.

Theorem 3: For a logistic DNN-kWTA network, the equilibrium point y* satisfies u_{π_{n−k}} < y* ≤ u_{π_{n−k+1}} if and only if

    dV/dy|_{y=u_{π_{n−k}}} < 0    (16)

and

    dV/dy|_{y=u_{π_{n−k+1}}} ≥ 0    (17)

where

    dV/dy = k − Σ_{i=1}^{n} 1/(1 + exp(−α(u_i − y))).    (18)

Proof: From the proof of Theorem 1, we know that dV/dy = −dy/dt is a strictly monotonically increasing function of y. In addition, dV/dy|_{y=y*} = 0. Hence, u_{π_{n−k}} < y* if and only if dV/dy|_{y=u_{π_{n−k}}} < 0. Similarly, y* ≤ u_{π_{n−k+1}} if and only if dV/dy|_{y=u_{π_{n−k+1}}} ≥ 0. The proof is completed. ∎

¹It should be noted that, without a simple method, studying the statistical performance requires simulating the network dynamics. Hence, intensive simulations of the network dynamics would be required.

Fig. 3. Energy function when α = 100 and α = 10, where k = 2. The inputs are {0.40, 0.42, 0.50, 0.55, 0.60, 0.8}.

Theorem 3 gives us a quick way to study the performance of the network instead of simulating the neural dynamics. Fig. 3 shows an illustrative example, where the inputs are {0.40, 0.42, 0.50, 0.55, 0.60, 0.8}. When α = 100, dV/dy|_{y=0.55} < 0 and dV/dy|_{y=0.60} ≥ 0. That means that the network is able to find the k winners successfully. When α = 10, however, the network does not work properly because dV/dy|_{y=0.6} < 0.
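Theorem 3 turns the correctness test into two evaluations of (18). A minimal sketch of this check (ours, not the authors' code) is given below; it reuses the inputs of Figs. 2 and 3.

import numpy as np

def dV_dy(y, u, k, alpha):
    return k - np.sum(1.0 / (1.0 + np.exp(-alpha * (u - y))))    # dV/dy in (18)

def produces_correct_outputs(u, k, alpha):
    """Theorem 3: correct outputs iff dV/dy < 0 at u_{pi_{n-k}} and dV/dy >= 0 at u_{pi_{n-k+1}}."""
    u = np.sort(np.asarray(u, dtype=float))     # ascending order u_{pi_1} <= ... <= u_{pi_n}
    biggest_loser = u[-k - 1]                   # u_{pi_{n-k}}
    smallest_winner = u[-k]                     # u_{pi_{n-k+1}}
    return dV_dy(biggest_loser, u, k, alpha) < 0.0 and dV_dy(smallest_winner, u, k, alpha) >= 0.0

if __name__ == "__main__":
    u = [0.40, 0.42, 0.50, 0.55, 0.60, 0.80]    # inputs of Figs. 2 and 3
    for alpha in (10, 100):
        print(alpha, produces_correct_outputs(u, k=2, alpha=alpha))
    # alpha = 10 fails and alpha = 100 succeeds, in agreement with Fig. 3

No simulation of the dynamics is needed, which is what makes the statistical experiments in Section V practical.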

IV. PERFORMANCE ANALYSIS

In Section IV-A, we consider inputs that are independently and uniformly distributed in [0, 1]. We derive a formula to estimate the probability that the model produces the correct outputs. In Section IV-B, we assume that the minimum separation Δmin of the inputs is given. We study the relationship between Δmin and α.

A. Lower Bound on the Probability That the Network Works Properly

We assume that the inputs {u_1, ..., u_n} are independently and uniformly distributed in [0, 1]. We estimate a bound on the probability that the network gives out the correct outputs. The result is summarized in Theorem 4.

Theorem 4: The probability that the network gives out the correct outputs is bounded by

    Prob(correct output) ≥ 1 − 2 [ 1 − ( (n − 1)(2/α) + 1 ) (1 − 2/α)^{n−1} ].    (19)

Proof: We first define some notation. Let EU be the event that the network produces the correct outputs. Let EU1 be the event that dV/dy|_{y=u_{π_{n−k}}} < 0, and let EU2 be the event that dV/dy|_{y=u_{π_{n−k+1}}} ≥ 0. Furthermore, let EV1 be the event that u_{π_{n−k+2}} − u_{π_{n−k}} > 2/α, and let EV2 be the event that u_{π_{n−k+1}} − u_{π_{n−k−1}} > 2/α. From Definition 1 and Theorem 3, we have

    Prob(EU) = Prob(EU1 ∩ EU2) ≥ 1 − Prob(ĒU1) − Prob(ĒU2).    (20)

To estimate Prob(ĒU1) and Prob(ĒU2), we make an approximation, shown in Fig. 4, on g̃(s), given by

    g̃(s) ≈ g*(s) = { 0, if s ≤ −2/α;  (α/4)s + 1/2, if −2/α < s ≤ 2/α;  1, if s > 2/α. }    (21)

With this approximation, we have

    dV/dy ≈ k − Σ_{i=1}^{n} g*(u_{π_i} − y).    (22)


Fig. 4. Approximation of the logistic function, where α = 10.

Fig. 5. Required gain for two different kWTA models. G_dnn is the gain of the activation function in the logistic DNN-kWTA model. G_hop is the gain of the activation function in the Hopfield model in [17].

Evaluating (22) at y = u_{π_{n−k}} and splitting the sum, we have

    dV/dy|_{y=u_{π_{n−k}}} ≈ k − Σ_{i=1}^{n−k−1} g*(u_{π_i} − u_{π_{n−k}}) − g*(u_{π_{n−k}} − u_{π_{n−k}}) − g*(u_{π_{n−k+1}} − u_{π_{n−k}}) − Σ_{i=n−k+2}^{n} g*(u_{π_i} − u_{π_{n−k}}).    (23)

If u_{π_{n−k+2}} − u_{π_{n−k}} > 2/α, then

    Σ_{i=n−k+2}^{n} g*(u_{π_i} − u_{π_{n−k}}) = Σ_{i=n−k+2}^{n} 1 = k − 1.    (24)

In addition, from the properties of g*(·), we obtain

    Σ_{i=1}^{n−k−1} g*(u_{π_i} − u_{π_{n−k}}) ≥ 0    (25)
    g*(u_{π_{n−k}} − u_{π_{n−k}}) = 1/2    (26)
    g*(u_{π_{n−k+1}} − u_{π_{n−k}}) > 1/2.    (27)

From (23)–(27), if u_{π_{n−k+2}} − u_{π_{n−k}} > 2/α, then dV/dy|_{y=u_{π_{n−k}}} < 0. That means EV1 implies EU1. Similarly, we can prove that EV2 implies EU2. Since EV1 implies EU1 and EV2 implies EU2,

    Prob(ĒV1) ≥ Prob(ĒU1) and Prob(ĒV2) ≥ Prob(ĒU2).    (28)

Besides, let δ = u_{π_{i+2}} − u_{π_i}. According to order-statistics theory [26], the probability density function of δ is given by

    f(δ) = n(n − 1) δ (1 − δ)^{n−2}.    (29)

Hence, Prob(ĒV1) and Prob(ĒV2) are given by

    Prob(ĒV1) = Prob(ĒV2) = ∫_0^{2/α} n(n − 1) δ (1 − δ)^{n−2} dδ = 1 − ( (n − 1)(2/α) + 1 ) (1 − 2/α)^{n−1}.    (30)

From (20) and (28)–(30), the probability that the model produces the correct outputs is bounded by

    Prob(EU) ≥ 1 − 2 [ 1 − ( (n − 1)(2/α) + 1 ) (1 − 2/α)^{n−1} ].    (31)

The proof is completed. ∎

B. Gain of Transfer Function for a Given Minimum Separation

Let Δ be min{|u_i − u_j| : i ≠ j} for a set of inputs. Furthermore, let Δmin be the minimum of the Δ's over all possible sets of inputs. Theorem 5 gives the relationship between α and Δmin.

Theorem 5: If α > (1/Δmin) ln 2n, then the network gives out the correct outputs.

Proof: According to the properties of V(y), if dV/dy|_{y=u_{π_{n−k}}} < 0 and dV/dy|_{y=u_{π_{n−k+1}}} > 0, then the network works properly. Define Φ = dV/dy|_{y=u_{π_{n−k}}}. From (18),

    Φ = k − Σ_{i=1}^{n−k−1} 1/(1 + e^{−α(u_{π_i} − u_{π_{n−k}})}) − 1/2 − Σ_{i=n−k+1}^{n} 1/(1 + e^{−α(u_{π_i} − u_{π_{n−k}})}).    (32)

Since every term 1/(1 + e^{−α(u_{π_i} − u_{π_{n−k}})}) is positive,

    Φ < k − 1/2 − Σ_{i=n−k+1}^{n} 1/(1 + e^{−α(u_{π_i} − u_{π_{n−k}})}).    (33)

Since 1/(1 + e^{−αΔmin}) ≤ 1/(1 + e^{−α(u_{π_i} − u_{π_{n−k}})}) for i ≥ n − k + 1,

    Φ < k − 1/2 − k/(1 + e^{−αΔmin}).    (34)

From (34), if α > (1/Δmin) ln(2k − 1), then k − 1/2 − k/(1 + e^{−αΔmin}) < 0 and therefore dV/dy|_{y=u_{π_{n−k}}} < 0. Using a similar method, we can show that if α > (1/Δmin) ln(2(n − k) − 1), then dV/dy|_{y=u_{π_{n−k+1}}} > 0. Combining the two inequalities, we conclude that if

    α > (1/Δmin) ln 2n, or equivalently G_dnn > (1/(4Δmin)) ln 2n    (35)

then the network produces the correct outputs. The proof is completed. ∎

Theorem 5 also implies that for a sufficiently large α, we have u_{π_{n−k}} < y* < u_{π_{n−k+1}}. Hence, we have the following theorem, which gives a hint on y* for large α.

Theorem 6: For a sufficiently large α, y* ≈ (u_{π_{n−k+1}} + u_{π_{n−k}})/2.

Proof: From Theorem 5, for a sufficiently large α, we have u_{π_{n−k}} < y* < u_{π_{n−k+1}}. At the equilibrium point y*,

    dy/dt = Σ_{i=1}^{n} 1/(1 + e^{−α(u_{π_i} − y*)}) − k = 0.    (36)

The above equation can be rewritten as

    ( Σ_{i=1}^{n−k−1} 1/(1 + e^{−α(u_{π_i} − y*)}) ) + 1/(1 + e^{−α(u_{π_{n−k}} − y*)}) + 1/(1 + e^{−α(u_{π_{n−k+1}} − y*)}) + ( Σ_{i=n−k+2}^{n} 1/(1 + e^{−α(u_{π_i} − y*)}) ) = k.    (37)


Fig. 6. Successful rates (measured based on Theorem 3) for producing the correct outputs, where the inputs follow the beta distribution. Note that for n = 5, the performance of k = 2 is very similar to that of k = 3.

For i ≤ n − k − 1, we have u_{π_1} < u_{π_2} < ··· < u_{π_{n−k}} < y*, and

    −(u_{π_i} − y*) = −(u_{π_{n−k}} − y*) + (u_{π_{n−k}} − u_{π_i}) ≥ −(u_{π_{n−k}} − y*) + (u_{π_{n−k}} − u_{π_{n−k−1}}) > −(u_{π_{n−k}} − y*).    (38)

Therefore,

    1/(1 + e^{−α(u_{π_i} − y*)}) ≤ 1/(1 + e^{−α(u_{π_{n−k}} − y*)} e^{α(u_{π_{n−k}} − u_{π_{n−k−1}})}) < 1/(1 + e^{−α(u_{π_{n−k}} − y*)}).    (39)

For a large α (α ≫ 1/(u_{π_{n−k}} − u_{π_{n−k−1}})), we obtain 1/(1 + e^{−α(u_{π_i} − y*)}) ≈ 0 for i ≤ n − k − 1. Similarly, 1/(1 + e^{−α(u_{π_i} − y*)}) ≈ 1 for i ≥ n − k + 2. Hence, (37) reduces to

    1/(1 + e^{−α(u_{π_{n−k}} − y*)}) + 1/(1 + e^{−α(u_{π_{n−k+1}} − y*)}) ≈ 1

which holds when u_{π_{n−k}} − y* ≈ −(u_{π_{n−k+1}} − y*), that is, y* ≈ (u_{π_{n−k+1}} + u_{π_{n−k}})/2. The proof is completed. ∎

Theorem 7: If α > (1/Δmin) max(ln 2n, 2 ln((1 − ε)/ε)), where ε < 0.5, then the network produces the correct outputs with all winner outputs greater than 1 − ε and all loser outputs less than ε.

Proof: From Theorem 6, for a large α, y* ≈ (u_{π_{n−k+1}} + u_{π_{n−k}})/2, so the output of the biggest loser is approximately 1/(1 + e^{α(u_{π_{n−k+1}} − u_{π_{n−k}})/2}) ≤ 1/(1 + e^{αΔmin/2}). Hence, if

    α > (2/Δmin) ln((1 − ε)/ε)    (47)

then the biggest loser has an output less than ε. Similarly, we can show that if α > (2/Δmin) ln((1 − ε)/ε), then the smallest winner has an output greater than 1 − ε. Combining Theorem 5 and (47), a general rule is obtained, given by

    α > (1/Δmin) max( ln 2n, 2 ln((1 − ε)/ε) )    (48)
    G_dnn > (1/(4Δmin)) max( ln 2n, 2 ln((1 − ε)/ε) ).    (49)

The proof is completed. ∎

We show the relationship between Δmin and G_dnn for n = {5, 21} in Fig. 5. In the logistic model, the output is given by g̃(s) = 1/(1 + e^{−αs}) and the gain is G_dnn = dg̃(s)/ds|_{s=0} = α/4. For completeness, we also show the gain G_hop for the Hopfield model in [17]. In [17], the output activation function is given by f(s) = m tanh(αs) and the gain is G_hop = mα.

V. SIMULATIONS

A. Application of Theorem 3

This section studies how the value of α affects the performance of the network when the inputs follow the beta distribution: Beta_{a,b}(u) = (Γ(a + b)/(Γ(a)Γ(b))) u^{a−1} (1 − u)^{b−1}, where Γ(·) is the gamma function. The beta distribution is a family of distribution functions defined on the interval between 0 and 1. In this experiment, a and b are set to 1. We consider various settings: n = {5, 11, 21} and α ∈ [1, 900]. For n = 5, we set k = {1, 2, 3}. For n = 11, we set k = {2, 6, 10}. For n = 21, we set k = {2, 11, 20}. For each setting, we generate 100 000 sets of inputs. We use Theorem 3 to check the performance of the logistic DNN-kWTA model. The results are summarized in Fig. 6, which shows how the value of the logistic parameter affects the performance of the model. To achieve a high successful rate of producing correct outputs, say 0.99, the value of the logistic parameter should be large. For example, for n = 5, the logistic parameter should be around 70–100. For n = 21, the logistic parameter should be around 210–540.

B. Theoretical Lower Bound

In Section IV, for the uniform distribution, we developed a formula to estimate the probability that the logistic DNN-kWTA model produces the correct outputs. This section studies the quality of the lower bound formula. We consider various settings: n = {5, 11, 21}. For each setting, we use two methods to estimate the performance of the network. One is based on the direct measurement (Theorem 3), and the other is based on the lower bound formula (Theorem 4).

Fig. 7. Comparison of successful rates for producing correct outputs. The solid line is based on the direct measurement (Theorem 3). The dotted line is based on the lower bound formula (Theorem 4). In the direct measurement method, for each n and each α, we generate 100 000 sets of inputs. We then use Theorem 3 to check the performance of the logistic DNN-kWTA model. For the lower bound method, we directly use Theorem 4 to estimate the probability that the DNN-kWTA model produces the correct outputs.

In the direct measurement method, for each n and each α, we generate 100 000 sets of inputs. We then use Theorem 3 to check the performance of the logistic DNN-kWTA model. For the lower bound method, we use Theorem 4 to estimate the probability that the logistic DNN-kWTA model produces the correct outputs. The results are summarized in Fig. 7. It can be seen that the lower bound formula produces a good approximation of the successful rate, especially for large values of α.

VI. CONCLUSION

This brief analyzed the stability and convergence of the DNN-kWTA model when imperfect comparators are considered. We showed that there exists a unique equilibrium point (Theorem 1). Besides, we showed that the state of the model converges to this unique equilibrium (Theorem 2). Based on our proposed energy function, we proposed a method (Theorem 3) to check whether the logistic DNN-kWTA model produces the correct outputs or not. The advantage of the proposed method is that we do not need to simulate the neural dynamics. With the method, we can study the performance of the logistic DNN-kWTA model. Besides, for uniformly distributed inputs, we derived a formula (Theorem 4) to estimate the probability that the logistic DNN-kWTA model produces the correct outputs. Finally, we investigated another situation, where the minimum separation is given. We showed that if the gain G_dnn is greater than (1/(4Δmin)) ln 2n, then the network produces the correct outputs (Theorem 5). In addition, we showed that for large α, the equilibrium point y* is approximately equal to (u_{π_{n−k}} + u_{π_{n−k+1}})/2 (Theorem 6). Based on Theorems 5 and 6, we showed that if α > (1/Δmin) max(ln 2n, 2 ln((1 − ε)/ε)), then the network gives out the correct outputs with winner outputs greater than 1 − ε and loser outputs less than ε (Theorem 7).

REFERENCES

[1] J. Lazzaro, S. Ryckebusch, M. A. Mahowald, and C. A. Mead, "Winner-take-all networks of O(N) complexity," in Advances in Neural Information Processing Systems, D. S. Touretzky, Ed. San Francisco, CA, USA: Morgan Kaufmann, 1989, pp. 703–711.
[2] R. P. Lippmann, "An introduction to computing with neural nets," IEEE ASSP Mag., vol. 4, no. 2, pp. 4–22, Apr. 1987.
[3] J. J. Hopfield, "Neurons with graded response have collective computational properties like those of two-state neurons," in Neurocomputing: Foundations of Research, J. A. Anderson and E. Rosenfeld, Eds. Cambridge, MA, USA: MIT Press, 1988, pp. 577–583.
[4] L. O. Chua and L. Yang, "Cellular neural networks: Applications," IEEE Trans. Circuits Syst., vol. 35, no. 10, pp. 1273–1290, Oct. 1988.
[5] E. Majani, R. Erlanson, and Y. Abu-Mostafa, "On the k-winners-take-all network," in Advances in Neural Information Processing Systems, D. S. Touretzky, Ed. San Francisco, CA, USA: Morgan Kaufmann, 1989, pp. 634–642.
[6] G. L. Dempsey and E. S. McVey, "Circuit implementation of a peak detector neural network," IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 40, no. 9, pp. 585–591, Sep. 1993.
[7] G. Seiler and J. A. Nossek, "Winner-take-all cellular neural networks," IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 40, no. 3, pp. 184–190, Mar. 1993.

[8] L. L. H. Andrew, "Improving the robustness of winner-take-all cellular neural networks," IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 43, no. 4, pp. 329–334, Apr. 1996.
[9] J. P. F. Sum, C.-S. Leung, P. K. S. Tam, G. H. Young, W. K. Kan, and L.-W. Chan, "Analysis for a class of winner-take-all model," IEEE Trans. Neural Netw., vol. 10, no. 1, pp. 64–71, Jan. 1999.
[10] S. Li, B. Liu, and Y. Li, "Selective positive–negative feedback produces the winner-take-all competition in recurrent neural networks," IEEE Trans. Neural Netw. Learn. Syst., vol. 24, no. 2, pp. 301–309, Feb. 2013.
[11] S. Li, Y. Li, and Z. Wang, "A class of finite-time dual neural networks for solving quadratic programming problems and its k-winners-take-all application," Neural Netw., vol. 39, pp. 27–39, Mar. 2013.
[12] S. Li, J. Yu, M. Pan, and S. Chen, "Winner-take-all based on discrete-time dynamic feedback," Appl. Math. Comput., vol. 219, no. 4, pp. 1569–1575, Nov. 2012.
[13] S. Li, Y. Wang, J. Yu, and B. Liu, "A nonlinear model to generate the winner-take-all competition," Commun. Nonlinear Sci. Numer. Simul., vol. 18, no. 3, pp. 435–442, Mar. 2013.
[14] T. M. Kown and M. Zervakis, "KWTA networks and their applications," Multidimensional Syst. Signal Process., vol. 6, no. 4, pp. 333–346, Oct. 1995.
[15] J. D. Narkiewicz and W. P. Burleson, "Rank-order filtering algorithms: A comparison of VLSI implementations," in Proc. IEEE Int. Symp. Circuits Syst. (ISCAS), vol. 3, May 1993, pp. 1941–1944.
[16] D. Danciu and V. Răsvan, "Gradient like behavior and high gain design of KWTA neural networks," in Bio-Inspired Systems: Computational and Ambient Intelligence (Lecture Notes in Computer Science), vol. 5517, J. Cabestany, F. Sandoval, A. Prieto, and J. M. Corchado, Eds. Berlin, Germany: Springer-Verlag, 2009, pp. 24–32.
[17] R. L. Costea and C. A. Marinov, "New accurate and flexible design procedure for a stable KWTA continuous time network," IEEE Trans. Neural Netw., vol. 22, no. 9, pp. 1357–1367, Sep. 2011.
[18] C. A. Marinov, B. A. Calvert, R. Costea, and V. Bucata, "Time evaluation for analog KWTA processors," in Proc. Eur. Congr. Comput. Methods Appl. Sci. Eng. (ECCOMAS), Jul. 2004.
[19] X. Hu and J. Wang, "An improved dual neural network for solving a class of quadratic programming problems and its k-winners-take-all application," IEEE Trans. Neural Netw., vol. 19, no. 12, pp. 2022–2031, Dec. 2008.
[20] J. Wang and Z. Guo, "Parametric sensitivity and scalability of k-winners-take-all networks with a single state variable and infinity-gain activation functions," in Proc. ISNN, vol. 6063, 2010, pp. 77–85.
[21] J. Wang, "Analysis and design of a k-winners-take-all model with a single state variable and the Heaviside step activation function," IEEE Trans. Neural Netw., vol. 21, no. 9, pp. 1496–1506, Sep. 2010.
[22] Y. Xiao, Y. Liu, C.-S. Leung, J. Sum, and K. Ho, "Analysis on the convergence time of dual neural network based kWTA," IEEE Trans. Neural Netw. Learn. Syst., vol. 23, no. 4, pp. 676–682, Apr. 2012.
[23] P. V. Tymoshchuk, "A fast analogue K-winners-take-all neural circuit," in Proc. Int. Joint Conf. Neural Netw. (IJCNN), Aug. 2013, pp. 1–8.
[24] P. V. Tymoshchuk, "A model of analogue K-winners-take-all neural circuit," Neural Netw., vol. 42, pp. 44–61, Jun. 2013.
[25] A. Moscovici, High Speed A/D Converters: Understanding Data Converters Through SPICE. Norwell, MA, USA: Kluwer, 2001.
[26] B. C. Arnold, N. Balakrishnan, and H. N. Nagaraja, A First Course in Order Statistics (Classics in Applied Mathematics). Philadelphia, PA, USA: SIAM, 2008.
