
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, VOL. 25, NO. 1, JANUARY 2014

An Interval Type-2 Neural Fuzzy Chip With On-Chip Incremental Learning Ability for Time-Varying Data Sequence Prediction and System Control

Chia-Feng Juang, Senior Member, IEEE, and Chi-You Chen

Abstract— This paper proposes a new circuit to implement a Mamdani-type interval type-2 neural fuzzy chip with on-chip incremental learning ability (IT2NFC-OL) for applications in changing environments. Traditional interval type-2 fuzzy systems use an iterative procedure to find the system outputs, which is computationally expensive, especially for hardware implementation. To address this problem, the IT2NFC-OL uses a simplified type-reduction operation to reduce the hardware implementation cost without degrading the learning performance. The software-implemented IT2NFC-OL is characterized by online structure learning and parameter learning using a gradient descent algorithm. The learned fuzzy model is then implemented in a field-programmable gate array (FPGA) chip. The FPGA-implemented IT2NFC-OL performs not only fuzzy inference but also online consequent parameter learning for applications in changing environments. Novel circuits for the computation of system outputs and the update of interval consequent values are proposed. The learning performance of the software-implemented IT2NFC-OL and the on-chip learning ability are verified with applications to time-varying data sequence prediction and system control problems and by comparisons with different software-implemented type-1 and type-2 neural fuzzy systems and interval type-2 fuzzy chips.

Index Terms— Fuzzy chip, incremental learning, neural fuzzy systems, on-chip learning ability, type-2 fuzzy systems.

I. INTRODUCTION

THE DEVELOPMENT of online incremental learning methods dedicated to dynamically changing environments has drawn much research interest in recent years [1]–[3]. Fuzzy logic systems (FLSs) have demonstrated an approximate reasoning capability typical of humans and have been successfully applied to modeling and control problems, among others. Neural fuzzy systems (NFSs) that bring the low-level learning ability to an FLS help to evolve its system structure (rule base) and adapt its parameters. To learn from data streams whose underlying distributions change over time, the development of NFSs with online learning abilities and computationally efficient algorithms for real-time operation is especially important.

Manuscript received November 30, 2012; revised March 10, 2013; accepted March 13, 2013. Date of publication April 15, 2013; date of current version December 13, 2013. This work was supported by the National Science Council, Taiwan, under Grant NSC 100-2628-E-005-005-MY2. The authors are with the Department of Electrical Engineering, National Chung-Hsing University, Taichung 402, Taiwan (e-mail: [email protected]; [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TNNLS.2013.2253799

Several type-1 NFSs with online learning ability have been proposed to address the nonstationary characteristics of their operating environments [1], [4]–[7]. Recently, type-2 FLSs have emerged as a generalization of type-1 FLSs [8]. Type-2 FLSs use type-2 fuzzy sets in the antecedent or consequent parts of rules, which allows researchers to model and minimize the effects of uncertainties in rule-based systems. Experimental evidence from previous studies has shown the improvement of interval type-2 FLSs over their type-1 counterparts in terms of regression accuracy or control performance [8]–[14], such as in the control of autonomous mobile robots [10], [14]. Several studies on the design and optimization of interval type-2 fuzzy controllers using bio-inspired population-based algorithms have been proposed [14]–[17]. Bio-inspired population-based algorithms are inherently an offline learning approach and are unsuitable for online learning in the changing environments considered in this paper. Several interval type-2 NFSs have been proposed to incrementally optimize the parameters of interval type-2 FLSs [11], [18]. In addition to parameter learning, interval type-2 NFSs with offline [12], [19] or online [9], [20] structure learning have been proposed. In [20], a type-2 self-organizing neural fuzzy system (T2SONFS) with an online structure and parameter learning ability was proposed. The T2SONFS uses a typical type-reduction operation, namely the iterative Karnik–Mendel (K–M) procedure [8], to find the system extended output, which is computationally expensive. Several simplified type-reduction operations have been proposed to reduce the computation cost. One popular category is the use of a closed-form operation to approximate the type-reducer outputs [10], [21]–[23]. These approaches use the standard normalization operation in the computation process and are still computationally expensive when parameter learning is considered in chip implementation. The idea of an unnormalized type-reduction operation, which avoids division operations, was proposed in [24]. To reduce the type-2 fuzzy chip implementation cost and improve the execution speed, this paper modifies the type-reduction operation in the T2SONFS with a new unnormalized type-reduction operation and derives a new parameter learning algorithm accordingly.

Fuzzy chips are suitable for real-time applications requiring high inference speed. In addition, they have the advantages of small size and low power consumption. For type-1 fuzzy chips with fixed parameters, well-established architectures

2162-237X © 2013 IEEE


making use of the rule parallel execution property and the pipeline technique have been proposed [25]–[34]. For applications to dynamically changing environments, a fuzzy chip with parameter learning ability is required. The parameter learning algorithm in general type-1 NFSs is too expensive computationally to be completely implemented in a fuzzy chip. To address this problem, one approach is to simplify the fuzzy operations [35], [36], such as the defuzzification operation, so that the chip has an on-chip learning ability. Another approach is hardware/software codesign [37], where the fuzzy inference function and the learning algorithm are hardware- and software-implemented, respectively.

In contrast to type-1 fuzzy chips, which process crisp values, interval type-2 fuzzy chips process intervals that require a relatively heavy computational load. In particular, finding the two interval boundary points in the extended output of an interval type-2 FLS requires an iterative procedure, such as the K–M procedure [8], which is computationally expensive. Some hardware implementations of interval type-2 FLSs have been proposed [20], [38]–[42]. The implementation of an interval type-2 fuzzy chip with the K–M procedure was proposed in [41]. To reduce the implementation cost, the circuits in [38] and [40] used the Wu–Mendel closed form [23] for boundary point estimation. However, it is still costly to implement the Wu–Mendel closed form, especially for on-chip parameter learning. Simulations of the hardware implementation of the defuzzification stage in interval type-2 fuzzy systems using the average of two type-1 fuzzy systems were presented in [42]. In the hardware-implemented T2SONFS (H-T2SONFS) [20], a look-up table (LUT) circuit was proposed to store the left and right crossover points in advance for system output calculation, which avoids an iterative procedure for the type-reduction operation. However, the size of the LUT increases exponentially with the number of input variables, and the approach is therefore unsuitable for problems with high-dimensional inputs. Most importantly, this technique is feasible only for FLSs with fixed parameters.

The contributions of this paper are twofold. First, an interval type-2 NFS using a simplified type-reduction operation is proposed. The simplified interval type-2 NFS incrementally evolves its structure and parameters according to online training data streams. This online adaptive learning ability makes it feasible for learning data streams generated from nonstationary environments [43], [44]. Second, a circuit is proposed to implement an interval type-2 neural fuzzy chip with on-chip incremental learning ability (IT2NFC-OL). In contrast to current interval type-2 fuzzy chips, which are designed for fixed system parameters without on-chip learning ability, one major contribution of the proposed IT2NFC-OL is its on-chip learning ability. To the best of our knowledge, the chip proposed in this paper is the first type-2 fuzzy chip with on-chip incremental parameter learning ability. The IT2NFC-OL chip is applied to time-varying data stream prediction and system control problems to verify its effectiveness.

The rest of this paper is organized as follows. Section II introduces the mathematical functions in the IT2NFC-OL. Section III introduces the software-implemented structure


and parameter learning of the IT2NFC-OL. Section IV introduces the circuits for the implementation of the IT2NFC-OL. Section V presents examples of prediction and control problems using the IT2NFC-OL. Section VI presents a discussion of the implementation techniques and performances of different interval type-2 fuzzy chips. Finally, Section VII presents the conclusions.

II. IT2NFC-OL FUNCTIONS

This section introduces the input-output mathematical functions in the IT2NFC-OL, in particular, the simplified type-reduction operation used for hardware cost reduction. The IT2NFC-OL uses Mamdani-type fuzzy rules, each of which has the following form:

$$\text{Rule } i: \text{ IF } x_1 \text{ is } \tilde{A}^i_1 \text{ AND} \ldots \text{AND } x_n \text{ is } \tilde{A}^i_n, \text{ THEN } y \text{ is } \tilde{G}^i, \quad i = 1, \ldots, M \tag{1}$$

where $\tilde{A}^i_j$, $j = 1, \ldots, n$, and $\tilde{G}^i$ are interval type-2 fuzzy sets, and $M$ is the number of rules. Each input $x_j$ is normalized to be within $[-1, 1]$. For fuzzy set $\tilde{A}^i_j$, a Gaussian primary membership function (MF) having a fixed standard deviation $\sigma^i_j$ and an uncertain mean that takes on values in $[m^i_{j1}, m^i_{j2}]$ is used. The footprint of uncertainty of this MF can be represented as a bounded interval $[\underline{\mu}_{\tilde{A}^i_j}, \bar{\mu}_{\tilde{A}^i_j}]$, where

$$\bar{\mu}_{\tilde{A}^i_j}(x_j) = \begin{cases} \exp\{-(x_j - m^i_{j1})^2/(\sigma^i_j)^2\}, & x_j < m^i_{j1} \\ 1, & m^i_{j1} \le x_j \le m^i_{j2} \\ \exp\{-(x_j - m^i_{j2})^2/(\sigma^i_j)^2\}, & x_j > m^i_{j2} \end{cases} \tag{2}$$

and

$$\underline{\mu}_{\tilde{A}^i_j}(x_j) = \begin{cases} \exp\{-(x_j - m^i_{j2})^2/(\sigma^i_j)^2\}, & x_j \le \frac{m^i_{j1}+m^i_{j2}}{2} \\ \exp\{-(x_j - m^i_{j1})^2/(\sigma^i_j)^2\}, & x_j > \frac{m^i_{j1}+m^i_{j2}}{2}. \end{cases} \tag{3}$$
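As a concrete illustration, the bounds in (2) and (3) can be evaluated in a few lines. The following is our own minimal Python sketch, not the authors' implementation; the function names are ours.

```python
# Illustrative sketch (not the paper's code) of the upper MF (2) and
# lower MF (3) of an interval type-2 Gaussian set with uncertain mean
# [m1, m2] and fixed width sigma.
import math

def upper_mf(x, m1, m2, sigma):
    """Upper membership function, Eq. (2): flat top between the two means."""
    if x < m1:
        return math.exp(-((x - m1) ** 2) / sigma ** 2)
    if x <= m2:
        return 1.0
    return math.exp(-((x - m2) ** 2) / sigma ** 2)

def lower_mf(x, m1, m2, sigma):
    """Lower membership function, Eq. (3): switch at the midpoint of the means."""
    mid = (m1 + m2) / 2.0
    if x <= mid:
        return math.exp(-((x - m2) ** 2) / sigma ** 2)
    return math.exp(-((x - m1) ** 2) / sigma ** 2)
```

At any $x_j$, the lower grade never exceeds the upper grade, so the pair bounds the footprint of uncertainty.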

In the fuzzy meet operation [8], the t-norm operation is implemented using an algebraic product function, and the obtained firing strength $F^i$ is an interval given as follows:

$$F^i = [\underline{f}^i, \bar{f}^i] \tag{4}$$

where

$$\bar{f}^i = \prod_{j=1}^{n} \bar{\mu}_{\tilde{A}^i_j}, \qquad \underline{f}^i = \prod_{j=1}^{n} \underline{\mu}_{\tilde{A}^i_j}. \tag{5}$$
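The algebraic products in (5) can be sketched as follows. This is our own helper, not the paper's code; it assumes the $n$ per-input membership grades of a rule have already been evaluated from (2) and (3).

```python
# Sketch of Eq. (5): the firing interval of a rule is the pair of products
# of the lower and upper membership grades over its n inputs.
from math import prod

def firing_interval(upper_grades, lower_grades):
    """Return (f_lower, f_upper) for one rule."""
    return prod(lower_grades), prod(upper_grades)
```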

The fuzzy set $\tilde{G}^i$ also uses a Gaussian primary MF. On the basis of the center-of-sets type-reduction method [8], $\tilde{G}^i$ is reduced to an interval type-1 fuzzy set $[w^i_l, w^i_r]$, and the extended output $[y_l, y_r]$ is computed as follows:

$$[y_l, y_r] = \int_{w^1 \in [w^1_l, w^1_r]} \cdots \int_{w^M \in [w^M_l, w^M_r]} \int_{f^1 \in [\underline{f}^1, \bar{f}^1]} \cdots \int_{f^M \in [\underline{f}^M, \bar{f}^M]} 1 \Bigg/ \frac{\sum_{i=1}^{M} f^i w^i}{\sum_{i=1}^{M} f^i}. \tag{6}$$

There is no closed-form solution to (6). The computation of the reduced set requires an iterative procedure, such as the K–M iterative procedure [8]. The iterative computation is complex and time consuming, especially for hardware implementation. For this reason, the IT2NFC-OL uses the following simplified type-reduction operation [24], i.e., without the normalization term, and the extended output is computed as follows:

$$[y_l, y_r] = \int_{w^1 \in [w^1_l, w^1_r]} \cdots \int_{w^M \in [w^M_l, w^M_r]} \int_{f^1 \in [\underline{f}^1, \bar{f}^1]} \cdots \int_{f^M \in [\underline{f}^M, \bar{f}^M]} 1 \Bigg/ \sum_{i=1}^{M} f^i w^i. \tag{7}$$
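Because (7) drops the normalization term of (6), the end points of the extended output can be found with plain interval arithmetic instead of K–M iterations. A minimal sketch under our own naming (not the chip datapath):

```python
# Sketch: the unnormalized type reduction of (7) is a sum of interval
# products [f_i, fbar_i] * [wl_i, wr_i], so its end points follow directly
# from interval arithmetic.
def interval_mul(a, b):
    """Product of two intervals a = (a0, a1), b = (b0, b1)."""
    products = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return min(products), max(products)

def extended_output(firings, consequents):
    """firings: list of (f_i, fbar_i); consequents: list of (wl_i, wr_i)."""
    yl = yr = 0.0
    for f, w in zip(firings, consequents):
        lo, hi = interval_mul(f, w)
        yl += lo
        yr += hi
    return yl, yr
```

Since the firing intervals are positive, the minimum and maximum of the four candidate products reduce to a sign-based selection of the lower or upper firing strength.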

It is equivalent to computing

$$[y_l, y_r] = \sum_{i=1}^{M} f^i w^i = \sum_{i=1}^{M} [\underline{f}^i, \bar{f}^i] \times [w^i_l, w^i_r] \tag{8}$$

where $y_l$ and $y_r$ denote the minimum and maximum values in the output interval, respectively. Because the upper MF, $\bar{\mu}_{\tilde{A}^i_j}$, and the lower MF, $\underline{\mu}_{\tilde{A}^i_j}$, are positive values, their algebraic products $\bar{f}^i$ and $\underline{f}^i$ in (5) are positive values. The consequent $[w^i_l, w^i_r]$ is an interval, and $w^i_l$ and $w^i_r$ may be positive or negative. On the basis of the interval product operation, the outputs $y_l$ and $y_r$ in (8) are expressed as follows:

$$y_l = \sum_{i=1}^{M} f^i_l w^i_l, \qquad f^i_l = \begin{cases} \bar{f}^i, & \text{if } w^i_l \le 0 \\ \underline{f}^i, & \text{if } w^i_l > 0 \end{cases} \tag{9}$$

and

$$y_r = \sum_{i=1}^{M} f^i_r w^i_r, \qquad f^i_r = \begin{cases} \underline{f}^i, & \text{if } w^i_r \le 0 \\ \bar{f}^i, & \text{if } w^i_r > 0. \end{cases} \tag{10}$$

Finally, the defuzzification operation computes the average of $y_l$ and $y_r$, and the final system output is

$$y = \frac{y_l + y_r}{2}. \tag{11}$$

III. LEARNING IN IT2NFC-OL

This section introduces the software-implemented IT2NFC-OL learning functions, including their online structure and parameter learning.

A. Online Structure Learning

Initially, there are no rules in the IT2NFC-OL. Rules are learned incrementally as training data samples arrive and novel data emerge. The rule firing strength has been used as a criterion for fuzzy rule generation in type-1 [4] and type-2 NFSs [9], [20]. As in [20], the center $F^i_c$ of the interval firing strength (4) is computed as

$$F^i_c = \frac{1}{2}\left(\bar{f}^i + \underline{f}^i\right). \tag{12}$$

The value of $F^i_c$ is then used as the rule generation criterion. That is, for each piece of incoming data $\vec{x} = (x_1, \ldots, x_n)$, find

$$I = \arg\max_{1 \le i \le M(t)} F^i_c(\vec{x}) \tag{13}$$

where $M(t)$ is the number of existing rules at time $t$. If $F^I_c(\vec{x}) \le \phi_{th}$ or $M(t) = 0$, then a new rule is generated to cover the input data, i.e., $M(t+1) = M(t) + 1$, where $\phi_{th} \in (0, 1)$ is a prespecified threshold. Once a new rule is generated, the initial means and widths of the corresponding new interval type-2 fuzzy sets are assigned as follows:

$$\left[m^{M(t)+1}_{j1},\, m^{M(t)+1}_{j2}\right] = \left[x_j(t) - 0.1,\; x_j(t) + 0.1\right] \tag{14}$$

and

$$\sigma^{M(t)+1}_j = \begin{cases} \sigma_{init}, & M(t) = 0 \\ \beta \left( \displaystyle\sum_{j=1}^{n} \left( x_j - \frac{m^I_{j1} + m^I_{j2}}{2} \right)^2 \right)^{0.5}, & M(t) \ge 1 \end{cases} \tag{15}$$

where $\beta$ is an overlapping parameter and $\sigma_{init} = 0.1$ is a preassigned fuzzy-set width of the first fuzzy rule. In this paper, we set $\beta$ to 0.5 so that the initial width of all the new fuzzy sets in rule $M(t) + 1$ is half of the Euclidean distance between the current input data $\vec{x}$ and its nearest rule mean center $(m^I_{j1} + m^I_{j2})/2$. This assignment generates a suitable overlap between rules in the input space. If the desired output $y^d$ for input $\vec{x}(t)$ is available, the initial consequent parameters $[w^i_l, w^i_r]$ are set to $[y^d, y^d]$; otherwise, the initial consequent parameters are set to small random values.

B. Parameter Learning

Parameter learning in the software-implemented IT2NFC-OL is introduced as follows. Consider the single-output case for clarity, where the objective is to minimize the error function

$$E = \frac{1}{2}\left[y(t+1) - y^d(t+1)\right]^2. \tag{16}$$

The gradient descent algorithm has been widely used in parameter learning of NFSs [1], [4]–[7], [45]–[48]. The antecedent and consequent parameters in the IT2NFC-OL are tuned using the gradient descent algorithm. The consequent parameters are updated as follows:

$$\tilde{w}^i_l(t+1) = w^i_l(t) - \eta \frac{\partial E}{\partial w^i_l} \tag{17}$$

$$\tilde{w}^i_r(t+1) = w^i_r(t) - \eta \frac{\partial E}{\partial w^i_r} \tag{18}$$

where

$$\frac{\partial E}{\partial w^i_l} = \begin{cases} \frac{1}{2}\left(y - y^d\right) \times \underline{f}^i, & w^i_l > 0 \\ \frac{1}{2}\left(y - y^d\right) \times \bar{f}^i, & w^i_l \le 0 \end{cases} \tag{19}$$

$$\frac{\partial E}{\partial w^i_r} = \begin{cases} \frac{1}{2}\left(y - y^d\right) \times \bar{f}^i, & w^i_r > 0 \\ \frac{1}{2}\left(y - y^d\right) \times \underline{f}^i, & w^i_r \le 0. \end{cases} \tag{20}$$

To ensure that $w^i_l(t+1) \le w^i_r(t+1)$ after the update, the final values of the consequent parameters are given as follows:

$$\left[w^i_l(t+1),\, w^i_r(t+1)\right] = \begin{cases} \left[\tilde{w}^i_l(t+1),\, \tilde{w}^i_r(t+1)\right], & \text{if } \tilde{w}^i_l(t+1) \le \tilde{w}^i_r(t+1) \\ \left[\tilde{w}^i_r(t+1),\, \tilde{w}^i_l(t+1)\right], & \text{otherwise.} \end{cases} \tag{21}$$
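The consequent update of (17)–(21) is compact enough to state in a few lines. The sketch below is our own illustration (the names and list-based interface are ours), with $\eta = 2^{-3}$ as used by the chip later in the paper.

```python
# Sketch of the consequent update (17)-(21): the gradients (19)-(20) pick the
# lower or upper firing strength by the sign of the consequent, and (21)
# re-orders the pair so that wl <= wr after the step.
def update_consequents(y, y_d, f_lo, f_up, wl, wr, eta=2 ** -3):
    new_wl, new_wr = [], []
    e = 0.5 * (y - y_d)
    for fl, fu, a, b in zip(f_lo, f_up, wl, wr):
        dE_dwl = e * (fl if a > 0 else fu)   # Eq. (19)
        dE_dwr = e * (fu if b > 0 else fl)   # Eq. (20)
        a2 = a - eta * dE_dwl                # Eq. (17)
        b2 = b - eta * dE_dwr                # Eq. (18)
        if a2 > b2:                          # Eq. (21): keep the interval valid
            a2, b2 = b2, a2
        new_wl.append(a2)
        new_wr.append(b2)
    return new_wl, new_wr
```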


Let $\theta^i_j$ denote one of the three antecedent parameters $m^i_{j1}$, $m^i_{j2}$, and $\sigma^i_j$. The antecedent parameters are updated as follows:

$$\theta^i_j(t+1) = \theta^i_j(t) - \eta \frac{\partial E}{\partial \theta^i_j}. \tag{22}$$

The chain rule is used to compute $\partial E/\partial \theta^i_j$, given as follows:

$$\frac{\partial E}{\partial \theta^i_j} = \frac{1}{2}\left(y - y^d\right) \times \Bigg[\underbrace{\left(\frac{\partial y_l}{\partial \bar{f}^i} + \frac{\partial y_r}{\partial \bar{f}^i}\right)}_{(A)} \frac{\partial \bar{f}^i}{\partial \theta^i_j} + \underbrace{\left(\frac{\partial y_l}{\partial \underline{f}^i} + \frac{\partial y_r}{\partial \underline{f}^i}\right)}_{(B)} \frac{\partial \underline{f}^i}{\partial \theta^i_j}\Bigg]. \tag{23}$$

In (9) and (10), the values of $f^i_l$ and $f^i_r$ are determined by the signs of the consequent parameters $w^i_l$ and $w^i_r$. As a result, the two terms (A) and (B) in (23) are determined by the signs of $w^i_l$ and $w^i_r$, which generate the four conditions in Table I.

TABLE I
RESULTS OF THE TERMS (A) AND (B) IN (23)

| Condition | $\partial y_l/\partial \bar{f}^i$ | $\partial y_r/\partial \bar{f}^i$ | (A) | $\partial y_l/\partial \underline{f}^i$ | $\partial y_r/\partial \underline{f}^i$ | (B) |
| $w^i_l \le 0,\ w^i_r \le 0$ | $w^i_l$ | 0 | $w^i_l$ | 0 | $w^i_r$ | $w^i_r$ |
| $w^i_l \le 0,\ w^i_r > 0$ | $w^i_l$ | $w^i_r$ | $w^i_l + w^i_r$ | 0 | 0 | 0 |
| $w^i_l > 0,\ w^i_r \le 0$ | 0 | 0 | 0 | $w^i_l$ | $w^i_r$ | $w^i_l + w^i_r$ |
| $w^i_l > 0,\ w^i_r > 0$ | 0 | $w^i_r$ | $w^i_r$ | $w^i_l$ | 0 | $w^i_l$ |

Substitution of the results in Table I into (23) gives (24)–(26):

$$\frac{\partial E}{\partial m^i_{j1}} = \begin{cases} \frac{1}{2}\left(y - y^d\right) \times (A) \times \bar{f}^i \times \dfrac{2(x_j - m^i_{j1})}{(\sigma^i_j)^2}, & x_j < m^i_{j1} \\ 0, & m^i_{j1} \le x_j \le \frac{m^i_{j1}+m^i_{j2}}{2} \\ \frac{1}{2}\left(y - y^d\right) \times (B) \times \underline{f}^i \times \dfrac{2(x_j - m^i_{j1})}{(\sigma^i_j)^2}, & x_j > \frac{m^i_{j1}+m^i_{j2}}{2} \end{cases} \tag{24}$$

$$\frac{\partial E}{\partial m^i_{j2}} = \begin{cases} \frac{1}{2}\left(y - y^d\right) \times (B) \times \underline{f}^i \times \dfrac{2(x_j - m^i_{j2})}{(\sigma^i_j)^2}, & x_j \le \frac{m^i_{j1}+m^i_{j2}}{2} \\ 0, & \frac{m^i_{j1}+m^i_{j2}}{2} < x_j \le m^i_{j2} \\ \frac{1}{2}\left(y - y^d\right) \times (A) \times \bar{f}^i \times \dfrac{2(x_j - m^i_{j2})}{(\sigma^i_j)^2}, & x_j > m^i_{j2} \end{cases} \tag{25}$$

$$\frac{\partial E}{\partial \sigma^i_j} = \begin{cases} \frac{1}{2}\left(y - y^d\right) \times \left[(A)\,\bar{f}^i\,\dfrac{2(x_j - m^i_{j1})^2}{(\sigma^i_j)^3} + (B)\,\underline{f}^i\,\dfrac{2(x_j - m^i_{j2})^2}{(\sigma^i_j)^3}\right], & x_j < m^i_{j1} \\ \frac{1}{2}\left(y - y^d\right) \times (B)\,\underline{f}^i\,\dfrac{2(x_j - m^i_{j2})^2}{(\sigma^i_j)^3}, & m^i_{j1} \le x_j \le \frac{m^i_{j1}+m^i_{j2}}{2} \\ \frac{1}{2}\left(y - y^d\right) \times (B)\,\underline{f}^i\,\dfrac{2(x_j - m^i_{j1})^2}{(\sigma^i_j)^3}, & \frac{m^i_{j1}+m^i_{j2}}{2} < x_j \le m^i_{j2} \\ \frac{1}{2}\left(y - y^d\right) \times \left[(A)\,\bar{f}^i\,\dfrac{2(x_j - m^i_{j2})^2}{(\sigma^i_j)^3} + (B)\,\underline{f}^i\,\dfrac{2(x_j - m^i_{j1})^2}{(\sigma^i_j)^3}\right], & x_j > m^i_{j2}. \end{cases} \tag{26}$$

For each training sample, the parameter learning is performed following the structure learning as in [20]. Therefore, the software-implemented IT2NFC-OL can be applied to online learning problems.

IV. HARDWARE IMPLEMENTATION

For a given set of collected training data, learning using the software-implemented IT2NFC-OL is performed first. The structure and parameters in the software-implemented IT2NFC-OL can then be implemented using hardware for practical applications requiring high computation speed. Rules with fixed parameters learned in a specific scenario may not work well when the scenario changes, such as a change of system parameters or operating conditions. For these problems, on-chip learning ability is crucial. Parameter learning consists of antecedent and consequent learning, with the former being much more complex than the latter in terms of computation. To address the tradeoff between on-chip learning performance and hardware implementation cost, this paper resorts to the implementation of only the consequent parameter learning algorithm in the IT2NFC-OL.

Fig. 1 shows the hardware implementation architecture of the IT2NFC-OL. The architecture is specified to have n inputs, one output, and M rules. Because the IT2NFC-OL performs interval operations to find the final output, two parallel circuits are used to compute the left and right end points $y_l$ and $y_r$ and update the left and right consequent values $w^i_l$ and $w^i_r$. The IT2NFC-OL consists of six modules: 1) antecedent module; 2) extended output module;

Fig. 1. Architecture of IT2NFC-OL.

3) output module; 4) learning module; 5) update module; and 6) exchange module. To accelerate the chip execution speed, the pipeline technique is used. Fig. 1 shows that the whole circuit is divided into two pipeline stages $T_1$ and $T_2$, with one clock duration for each $T_i$. The execution of the rule antecedent part is independent of the consequent learning; therefore, it is arranged in one pipeline stage. The extended output module and its subsequent modules cannot be executed until the completion of the exchange module; therefore, they are arranged in the second pipeline stage. The following are detailed introductions to each module.

Antecedent Part Module: This module is designed to implement the membership and firing strength functions in (2)–(5). For type-1 FLSs using Gaussian MFs in the antecedent, the membership value is simply a crisp value. Previous type-1 fuzzy chips have focused on the implementation of Gaussian MFs with the aim of improving function accuracy and/or reducing hardware cost. One approach is to use an LUT to store nonlinear function mapping pairs [26], [36]. Another approach is the approximation of the Gaussian function using piecewise linear functions [31]. For interval type-2 Gaussian MFs, the computation of the interval MFs in (2) and (3) is much more complex than that of a type-1 Gaussian MF. Instead of focusing on the implementation of a Gaussian MF using a new circuit, the antecedent module is dedicated to the implementation of (2) and (3) and their products to find the firing strengths. Fig. 1 shows that M antecedent modules are designed to compute the upper and lower firing strengths $\bar{f}^i$ and $\underline{f}^i$ of all of the M rules in one clock in parallel. The output of the product operation in $\bar{f}^i$ and $\underline{f}^i$ is an exponential function $\exp(-k)$, where $k$ is the summation of the exponential terms in the $n$ participating Gaussian membership values.

This paper uses adders instead of multipliers to implement (5) for hardware cost reduction, which is another advantage of using Gaussian MFs rather than triangular or trapezoidal MFs, in addition to the higher approximation capability of Gaussian MFs in NFSs [49]. This module first determines the value of $k$ for each of the two firing strengths and then uses an LUT to find $\exp(-k)$. For a given input $x_j$, the exponential terms $k^i_{j1}$

Fig. 2. Submodule proposed to compute the exponential terms in upper and lower Gaussian MFs.

Fig. 3. Submodule proposed to determine the value of the right exponential term.

and $k^i_{j2}$ for the two Gaussian MFs in (2) and (3) have to be determined first. These are

$$k^i_{j1} = -\left[(x_j - m^i_{j1}) \cdot \hat{\sigma}^i_j\right]^2, \qquad k^i_{j2} = -\left[(x_j - m^i_{j2}) \cdot \hat{\sigma}^i_j\right]^2 \tag{27}$$

where $\hat{\sigma}^i_j = 1/\sigma^i_j$ is a precalculated constant to avoid the use of a divider. Fig. 2 shows the submodule proposed to compute $k^i_{j1}$ and $k^i_{j2}$. Each of the terms $k^i_{j1}$ and $k^i_{j2}$ is represented by 17 b, where 10 and 7 b are used to represent the integer and decimal parts, respectively. The use of either $k^i_{j1}$ or $k^i_{j2}$ for computing $\bar{\mu}_{\tilde{A}^i_j}(x_j)$ and $\underline{\mu}_{\tilde{A}^i_j}(x_j)$ is determined using the submodule in Fig. 3. The symbols $k^i_{ju}$ and $k^i_{jl}$ in Fig. 3 represent the exponential terms selected for computing $\bar{\mu}_{\tilde{A}^i_j}(x_j)$ and $\underline{\mu}_{\tilde{A}^i_j}(x_j)$, respectively. On the basis of (3), the sign bit $S$ of $(m^i_{j1} + m^i_{j2})/2 - x_j$ is sent to a multiplexer

JUANG AND CHEN: NEURAL FUZZY CHIP WITH ON-CHIP INCREMENTAL LEARNING ABILITY

kui [16 :10]

221

wl1[ MSB ]

f l1

f1

f l1

k1iu knui

k

i u

f (111 1)2

i

i u

k [9 : 0]

f

wl1 wlM [ MSB ] f

Fig. 4.

f

fl M

M

fl M

Submodule proposed to implement the upper firing strength.

(MUX) to determine the selection of $k^i_{j1}$ or $k^i_{j2}$ as $k^i_{jl}$. Suppose that X and Y are the inputs and S is the selection signal of the MUX. The two outputs $Z_1$ and $Z_2$ of the MUX are given as follows:

$$Z_1 = \begin{cases} X, & S = 1 \\ Y, & S = 0 \end{cases} \qquad Z_2 = \begin{cases} X, & S = 0 \\ Y, & S = 1. \end{cases} \tag{28}$$
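The selection behavior of Figs. 2 and 3 can be mimicked in software. The following is our own behavioral sketch of (27) together with the $k^i_{ju}$/$k^i_{jl}$ selection implied by (2) and (3); the names are ours, not the circuit's.

```python
# Behavioral sketch (not the circuit): compute the two exponent terms of
# Eq. (27) with a precomputed reciprocal width (no divider), then pick which
# one feeds the upper and lower membership values; 0 stands in for the flat
# top of the upper MF, since exp(0) = 1.
def select_exponents(x, m1, m2, sigma_hat):
    """sigma_hat = 1/sigma is a stored constant; returns (k_u, k_l)."""
    k1 = -((x - m1) * sigma_hat) ** 2      # Eq. (27)
    k2 = -((x - m2) * sigma_hat) ** 2
    k_l = k1 if x > (m1 + m2) / 2 else k2  # lower MF, from Eq. (3)
    if x < m1:                             # upper MF, from Eq. (2)
        k_u = k1
    elif x > m2:
        k_u = k2
    else:
        k_u = 0.0
    return k_u, k_l
```

Exponentiating the selected terms reproduces the interval membership grades exactly (before fixed-point quantization).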

Fig. 5. Left extended output module.

Fig. 6. Output module.

Fig. 7. Learning module proposed to implement $\partial E/\partial w^i_l$ and $\partial E/\partial w^i_r$.

Equations (3) and (4) imply that, if the lower bound term $k^i_{jl}$ is equal to $k^i_{j1}$, then the upper bound term $k^i_{ju}$ should be equal to $k^i_{j2}$ or 0 (because $\exp(0) = 1$). Fig. 3 shows that the exclusive OR operation of the sign bits of $x_j - m^i_{j1} = a^i_{j1}$ and $x_j - m^i_{j2} = a^i_{j2}$ is used to determine the selection of $k^i_{j2}$ or 0. The $n$ exponential terms $k^i_{1u}, \ldots, k^i_{nu}$ are added, and the result is sent to an LUT to give the final value of $\bar{f}^i$.

Fig. 4 shows the submodule to implement $\bar{f}^i$. The input and output values in the LUT are represented by 10 and 8 b, respectively. That is, the input $k^i_u$ sent to the LUT is represented by 10 b, where 3 and 7 b are used to represent the integer and decimal parts, respectively. The use of 10 b instead of the 17 b in $k^i_{ju}$ is meant to reduce the LUT size. When three integer bits are used, the maximum value of $k^i_u$ is very close to $2^3$. Because $\exp(-2^3) = 0.000335$ and the resolution of the LUT output is $2^{-8} = 0.0039$, the value $\exp(-2^3)$ is represented by zero in the LUT output. For the inputs whose values are greater than $2^3$, the corresponding LUT outputs are also equal to zero. For this reason, only three integer bits are used. To transform the 17-b $k^i_{ju}$ to the 10-b input of the LUT, a MUX circuit based on the bit values in $k^i_{ju}$[16:10] is proposed, as shown in Fig. 4. If the OR operation of all bits in $k^i_{ju}$[16:10] is "0," indicating that $k^i_{ju} < 2^3$, then $k^i_{ju}$[9:0] is sent directly to the LUT. On the contrary, if the OR result is "1," indicating that $k^i_{ju} > 2^3$, then $k^i_u$[9:0] = "1…1," and the corresponding LUT output is zero. Similar operations give $\underline{f}^i$.
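The LUT behavior just described can be modeled as follows. This is our own behavioral sketch, not the synthesized circuit; we assume the 8-b output codes 0–255 scale $\exp(-k)$ by $2^8$, with the top code standing in for 1.0.

```python
# Behavioral model (our assumption, not the chip's exact table) of the
# exp(-k) LUT: 10-b address (3 integer, 7 fractional bits), 8-b unsigned
# output, and saturation to zero once k reaches 2^3.
import math

def exp_lut_out(k):
    """Return the 8-b LUT approximation of exp(-k) for k >= 0."""
    if k >= 2 ** 3:                   # OR of the high bits k[16:10] is 1
        return 0
    addr = int(k * 2 ** 7)            # quantize k to 3.7 fixed point
    # exp(0) = 1.0 clips to the maximum 8-b code 255.
    return min(int(round(math.exp(-(addr / 2 ** 7)) * 2 ** 8)), 255)
```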

A. Extended Output Module

Fig. 5 shows the left extended output module designed for parallel implementation of the M terms $f^i_l w^i_l$, $i = 1, \ldots, M$, in (9). The sign bit of the consequent value $w^i_l$ is sent to a MUX to determine whether $\bar{f}^i$ or $\underline{f}^i$ is selected as $f^i_l$ to multiply $w^i_l$. Similarly, the right extended output module computes $f^i_r w^i_r$. Each output in the extended output module is represented by 16 b, where the least significant digit is $2^{-12}$ and the most significant digit is a sign bit.
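The per-rule selection in Fig. 5 amounts to a sign test; a minimal sketch under our own naming:

```python
# Sketch of the sign-based selection in (9) and (10): the sign bit of each
# consequent picks which firing strength multiplies it.
def rule_products(f_lo, f_up, wl, wr):
    """Return (f_l * wl, f_r * wr) for one rule."""
    left = (f_up if wl <= 0 else f_lo) * wl   # Eq. (9)
    right = (f_lo if wr <= 0 else f_up) * wr  # Eq. (10)
    return left, right
```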

B. Output Module

This module is designed to implement the output functions in (9) and (10). Fig. 6 shows that this module first adds the product terms from the extended output module to obtain $y_l + y_r$. A right-shift circuit then implements the operation of division by 2.

C. Learning Module

This module is designed to implement $\partial E/\partial w^i_l$ and $\partial E/\partial w^i_r$ in (19) and (20). Fig. 7 shows the circuit for the parallel implementation of $\partial E/\partial w^i_l$ and $\partial E/\partial w^i_r$. The $f^i_l$ and $f^i_r$ values selected by the MUX circuits in Fig. 5 are multiplied with $y - y^d$ to find $\partial E/\partial w^i_l$ and $\partial E/\partial w^i_r$, respectively.

D. Update Module

This module computes the updated consequent values $\tilde{w}^i_l(t+1)$ and $\tilde{w}^i_r(t+1)$ in (17) and (18). Fig. 8 shows the left update module designed to compute $\tilde{w}^i_l(t+1)$. This module first computes the product of the learning rate $\eta\,(= 2^{-3})$ with

Fig. 8. Left update module proposed to compute $\tilde{w}^i_l(t+1)$.

Fig. 9. Circuits in the exchange module.

TABLE II
TRAINING PERFORMANCES OF DIFFERENT SOFTWARE-IMPLEMENTED TYPE-1 AND TYPE-2 NFSs IN EXAMPLE 1

| NFSs        | T2FLS | T2SONFS | T1-FNN | T1-FNN | IT2NFC-OL |
| Rule number | 4     | 4       | 4      | 8      | 4         |
| RMSE        | 0.022 | 0.005   | 0.0152 | 0.0115 | 0.0132    |

$\partial E/\partial w^i_l$ by right-shifting and then adds the result to $w^i_l(t)$. Similarly, the right update module computes the updated consequent value $\tilde{w}^i_r(t+1)$. The update of all rule consequent values is executed in parallel.

E. Exchange Module

Fig. 9 shows the module designed to execute (21). The sign bit of $\tilde{w}^i_l - \tilde{w}^i_r$ is sent to a MUX to determine whether the exchange is necessary. The exchange operations in all rules are executed in parallel. Fig. 1 shows that the new consequent values are sent back to the extended output module to compute the next IT2NFC-OL output for the next input data.

V. SIMULATIONS

This section describes four examples of software- and field-programmable gate array (FPGA)-implemented IT2NFC-OL, where the latter is implemented on a Xilinx Virtex-4 XC4vlx60-10ff1148 FPGA device. The electronic design automation software tools used are ModelSim SE 6.5 and ISE 8.1i. In the following examples, the FPGA-implemented IT2NFC-OL performs on-chip incremental learning for each newly available input data sample using only one learning iteration, i.e., without iterative learning. To observe the effect of the on-chip learning ability of the FPGA-implemented IT2NFC-OL, an FPGA implementation of the IT2NFC with fixed parameters (called IT2NFC-C), i.e., without on-chip learning ability, is applied to the same problems for comparison.

A. Online Data Sequence Prediction

Example 1 (Time-Varying Sequence Prediction): This example uses the IT2NFC-OL to predict the output of a data stream generated by a nonlinear process given by

$$y_p(t+1) = \frac{y_p(t)}{1 + y_p^2(t)} + \sin^3(2\pi t/100) \tag{29}$$
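For reference, the training stream of (29) can be generated as follows; this is our own sketch, and the variable names are ours.

```python
# Sketch generating the stream of Eq. (29), starting from y_p(1) = 0.
import math

def generate_sequence(n=2000):
    y = [0.0]                                     # y_p(1) = 0
    for t in range(1, n + 1):
        u = math.sin(2 * math.pi * t / 100) ** 3
        y.append(y[-1] / (1 + y[-1] ** 2) + u)
    return y
```

Since |y/(1 + y^2)| <= 0.5 and |sin^3| <= 1, every sample stays within [-1.5, 1.5].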

where $y_p(1) = 0$. The inputs of the IT2NFC-OL are $y_p(t)$ and $\sin^3(2\pi t/100)$, and the desired output $y^d(t+1)$ is equal to $y_p(t+1)$. Performance is evaluated using the root-mean-squared error (RMSE) between the IT2NFC-OL output and $y_p(t+1)$. The training data contain 2000 samples ($t = 1, \ldots, 2000$). Training was performed for 100 iterations. Four rules were generated when the threshold $\phi_{th}$ was set to 0.07. Table II shows the training error of the software-implemented IT2NFC-OL. For the purpose of comparison, Table II also shows the performances of two Mamdani-type interval type-2 FLSs, the type-2 FLS (T2FLS) [18] and the T2SONFS [20], using the same number of four rules. The T2FLS uses a fixed and preassigned structure. The number of fuzzy sets in the T2FLS was equal to the number of rules, and the initial fuzzy sets were uniformly distributed in the input domain. The number of rules in the T2FLS was assigned to be the same as that in the IT2NFC-OL so that both have the same model size. The IT2NFC-OL shows a smaller error than the T2FLS. The T2SONFS shows a smaller error than the IT2NFC-OL; however, the former achieves this advantage at the cost of using the much more complex K–M iterative procedure for the type-reduction operation.

To test the online learning ability of the IT2NFC-OL, the process is assumed to be time variant after the time step $t = 2000$:

$$y_d(t+1) = \begin{cases} \dfrac{y_d(t)}{1 + y_d^2(t)} + \sin^3(2\pi t/100) + 0.1, & 2000 < t \le 4000 \\[2mm] \dfrac{y_d(t)}{1 + y_d^2(t)} + \sin^3(2\pi t/100) + 0.2, & 4000 < t \le 6000. \end{cases} \tag{30}$$

Table III shows the test prediction errors of the software-implemented IT2NFC-OL and IT2NFC-C, where the former tunes its consequent parameters online for each incoming data sample, as the FPGA-implemented IT2NFC-OL does. The errors of the software-implemented IT2NFC-OL are smaller than those of its corresponding IT2NFC-C implementation, verifying the online learning ability.

The software-designed IT2NFC-OL was implemented on an FPGA chip. Table IV shows that the maximum execution speed of the FPGA-implemented IT2NFC-OL is a 29.58-MHz clock frequency, and the total size is 29 311 gate counts. To test the effectiveness of the FPGA-implemented IT2NFC-OL, we fed the inputs to and read the outputs from the FPGA chip using VeriComm software on a personal computer (PC) with a USB interface. Table III shows the test errors of the FPGA-implemented IT2NFC-OL and IT2NFC-C. The errors from both chips are close to the errors of their corresponding software implementations, with only a slight degradation caused mainly by the lower resolution of the hardware compared with the software. The FPGA-implemented

TABLE III
ONLINE LEARNING PERFORMANCES OF DIFFERENT SOFTWARE- AND FPGA-IMPLEMENTED NFSs IN EXAMPLE 1

| Models           | T1-FNN (Software) | T1-FNN (Software) | IT2NFC-C (Software) | IT2NFC-OL (Software) | IT2NFC-C (FPGA) | IT2NFC-OL (FPGA) |
| Rule number      | 4      | 8      | 4      | 4      | 4      | 4      |
| 2001~4000 (RMSE) | 0.0509 | 0.0543 | 0.0997 | 0.0335 | 0.1005 | 0.0351 |
| 4001~6000 (RMSE) | 0.0720 | 0.1155 | 0.2070 | 0.0584 | 0.2074 | 0.0584 |

TABLE IV
EXECUTION SPEED AND SIZE OF THE IT2NFC-OL FOR DIFFERENT EXAMPLES

| Item               | Example 1 | Example 2 | Example 3 | Example 4 |
| Rule number        | 4         | 4         | 4         | 4         |
| Input dim.         | 2         | 4         | 2         | 3         |
| Speed (Hz)         | 29.58 M   | 29.58 M   | 28.26 M   | 28.26 M   |
| No. of gate counts | 29 311    | 40 345    | 32 298    | 41 454    |

Fig. 10. Test results (t = 3801, . . . , 4000) of the FPGA-implemented IT2NFC-C and IT2NFC-OL in Example 1.

Fig. 11. Test results (t = 5801, . . . , 6000) of the FPGA-implemented IT2NFC-C and IT2NFC-OL in Example 1.

IT2NFC-OL shows a smaller error than the IT2NFC-C, which verifies the on-chip learning ability. Figs. 10 and 11 show the online prediction results using the FPGA-implemented IT2NFC-C and IT2NFC-OL. The two outputs are difficult to distinguish if all 2000 prediction results of each time period in (30) are plotted in the same figure; for clarity, only the last 200 prediction results of each time period are shown in Figs. 10 and 11. The results show an obvious error between the IT2NFC-C output and the actual process outputs due to the time-varying property of the process. Because of its on-chip parameter learning ability, the outputs of the IT2NFC-OL chip are very close to the process outputs. For comparison, Tables II and III also show the training and online learning performances, respectively, of a type-1 fuzzy neural network (T1-FNN) [50], which is also characterized by online incremental structure and parameter learning. The results show that the errors of the T1-FNN are approximately twice those of the IT2NFC-OL when both use the same number of four rules. Because the number of antecedent and consequent parameters in each rule of the IT2NFC-OL is about twice that of the T1-FNN, Tables II and III also show the performances of the T1-FNN with twice the number of rules of the IT2NFC-OL. While Table II shows that the T1-FNN using eight rules achieves a smaller training error than the IT2NFC-OL, Table III shows that the online test error of the former is larger than that of the latter. The explanation is that the larger number of rules (i.e., a larger model size) limits the online parameter learning ability in changing environments.
Example 2 (Chaotic Sequence Prediction): This example studies the prediction of a data sequence generated by the

following Mackey–Glass chaotic process [9], [51]–[55]:
\[
\frac{dx(t)}{dt} = \frac{a \cdot x(t-\tau)}{b + x^{10}(t-\tau)} - 0.1\,x(t) \tag{31}
\]
where τ > 17, a = 0.2, and b = 1. The parameter τ was set to 30, and x(0) = 1.2. Four past values were used to predict x(t), and the input-output data format was [x(t − 24), x(t − 18), x(t − 12), x(t − 6); x(t)]. One thousand patterns were generated from t = 124 to t = 1123, with the first 500 patterns used for training and the last 500 for testing. Training was performed for 5000 iterations. Four rules were generated when φ_th was set to 0.26. Table V shows the test RMSEs of the software-implemented IT2NFC-C and IT2NFC-OL, where the latter shows a smaller test error than the former due to online consequent parameter learning. For comparison, Table V shows the test RMSEs of the T2FLS, T2SONFS, and T1-FNN using different numbers of rules, where the model parameters are fixed during each test, as in the IT2NFC-C. Table V also shows the test RMSEs of three Mamdani-type type-1 FLSs that were applied to the same problem: a neuro-fuzzy function approximator (NEFPROX) [51], a self-organizing fuzzy modified least-square (SOFMLS) network [52], and a weighted fuzzy rule interpolation method (WFRI) [54]. The results show that the IT2NFC-C has a smaller test error than the type-1 and type-2 fuzzy models used for comparison. The software-designed fuzzy rules were implemented on an FPGA chip. Table IV shows the maximum execution speed and size of the FPGA-implemented IT2NFC-OL. Although a larger number of inputs are fed to the chip in this example than in Example 1, the speed is the same as in Example 1 due to the parallel processing of different inputs. Table V shows the test RMSEs of the FPGA-implemented IT2NFC-C and IT2NFC-OL, where the latter shows a smaller prediction error than the former, which verifies the effectiveness of the on-chip learning ability. Fig. 12 shows the prediction outputs of these two chips, where it is observed that there are larger prediction errors at


TABLE V
TEST PERFORMANCES OF DIFFERENT SOFTWARE- AND FPGA-IMPLEMENTED FLSs IN EXAMPLE 2

FLSs                   Rule number   RMSE
NEFPROX [51]           26            0.0533
SOFMLS [52]            7             0.0471
WFRI [54]              68            0.0611
(unlabeled)            4             0.0120
T1-FNN (Software)      8             0.0070
T2FLS (Software)       4             0.0127
T2SONFS (Software)     4             0.0071
IT2NFC-C (Software)    4             0.0065
IT2NFC-OL (Software)   4             0.0062
IT2NFC-C (FPGA)        4             0.0152
IT2NFC-OL (FPGA)       4             0.0095

Fig. 12. Test results of the FPGA-implemented IT2NFC-C and IT2NFC-OL from t = 624 to t = 1123 in Example 2.

Fig. 13. Test results of the software-implemented IT2NFC-C and IT2NFC-OL from t = 624 to t = 1623 in Example 2, where the environment changes at t = 1124.
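The Mackey–Glass series in (31) can be regenerated numerically to reproduce the training and test patterns of Example 2. A minimal sketch, assuming a first-order Euler scheme with unit step and a constant initial history (the paper does not state its integration method):

```python
def mackey_glass(n_steps, tau=30, a=0.2, b=1.0, x0=1.2):
    """Euler integration (dt = 1, an assumption) of the Mackey-Glass
    delay equation (31): dx/dt = a*x(t-tau)/(b + x(t-tau)**10) - 0.1*x(t)."""
    xs = [x0]
    for _ in range(n_steps):
        t = len(xs) - 1                          # current time index
        x_tau = xs[t - tau] if t >= tau else x0  # constant history for t < tau
        x = xs[-1]
        xs.append(x + a * x_tau / (b + x_tau ** 10) - 0.1 * x)
    return xs

# Input-output patterns [x(t-24), x(t-18), x(t-12), x(t-6); x(t)]
# for t = 124, ..., 1123, as in Example 2.
xs = mackey_glass(1200)
patterns = [([xs[t - 24], xs[t - 18], xs[t - 12], xs[t - 6]], xs[t])
            for t in range(124, 1124)]
print(len(patterns))  # 1000
```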

the outputs with abrupt changes when using the IT2NFC-C, and the on-chip parameter learning in the IT2NFC-OL helps to reduce the errors. This example also studies the online learning performance of the software-implemented IT2NFC-OL when the process function (31) changes after t = 1124. It is assumed that the values of the parameters a and b change to 0.1 and 0.5, respectively. Fig. 13 shows the test prediction results of the IT2NFC-C and IT2NFC-OL from t = 624 to t = 1623, where the prediction outputs of the IT2NFC-OL are much closer to the actual outputs than those of the IT2NFC-C after the environment changes at t = 1124. The results verify the online learning ability of the IT2NFC-OL.

B. Online System Control

The FPGA-implemented IT2NFC-OL is applied to online system control in this section. The plant is assumed to be governed by the following difference equation:
\[
y_p(t+1) = f\bigl(y_p(t), \ldots, y_p(t-n)\bigr) + u(t) \tag{32}
\]
where y_p(t + 1) is the plant output, u(t) is the control input, and f(·) is an unknown function. This is a simple nonlinear system used to demonstrate the on-chip learning ability of the IT2NFC-OL; most real systems are more complex than the plant considered. Fig. 14 shows the control configuration and the input-output variables of the IT2NFC-OL, where the direct adaptive control configuration [56] is used and y_d is the desired output. The inputs of the IT2NFC-OL controller are y_d(t + 1), y_p(t), . . . , y_p(t − n), and the controller sends a control signal u(t) to the plant. Unlike the prediction problems in Examples 1 and 2, the desired outputs (control signals) of the IT2NFC-OL in this controller design problem are unavailable during training. The IT2NFC-OL is trained to minimize the error between y_d(t + 1) and the controlled plant output y_p(t + 1), i.e., the error function is defined as
\[
E(t+1) = \frac{1}{2}\bigl(y_p(t+1) - y_d(t+1)\bigr)^2. \tag{33}
\]
Let θ denote one of the free parameters in the IT2NFC-OL trained to minimize the error function in (33) using the gradient descent learning algorithm. That is,
\[
\frac{\partial E}{\partial \theta} = \bigl(y_p(t+1) - y_d(t+1)\bigr)\frac{\partial y_p(t+1)}{\partial \theta} \tag{34}
\]
where
\[
\frac{\partial y_p(t+1)}{\partial \theta} = \frac{\partial y_p(t+1)}{\partial u(t)} \cdot \frac{\partial u(t)}{\partial \theta} = \frac{\partial u(t)}{\partial \theta} \tag{35}
\]
and the derivation of the term ∂u(t)/∂θ is the same as that in Section III-B.

Fig. 14. Direct adaptive control configuration using the IT2NFC-OL controller.

Example 3 (Online Control of a Time-Varying Plant): The controlled plant is governed by the following difference equation:
\[
y_p(t+1) = \frac{y_p(t)\bigl(y_p(t) + a(t)\bigr)}{1 + y_p^2(t)} + u(t) \tag{36}
\]
where y_p(0) = −6 and a(t) is a time-varying parameter given by
\[
a(t) = \begin{cases} 2.5, & 1 \le t \le 500 \\ 2.8, & 500 < t. \end{cases} \tag{37}
\]
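Because u(t) enters the plant (32) additively, ∂y_p(t + 1)/∂u(t) = 1, which is what makes the simplification in (35) valid. This can be checked numerically with a toy one-parameter controller u = θ·y_d (a hypothetical stand-in, not the IT2NFC-OL consequent update):

```python
def f_plant(y, u):
    # Any plant of the additive-control form (32); the nonlinear part is arbitrary.
    return y / (1.0 + y * y) + u

def control(theta, yd):
    # Toy controller with a single free parameter theta (illustrative only).
    return theta * yd

y, yd, theta = 0.3, 0.8, 0.4

# Analytic gradient from (34) and (35): dE/dtheta = (y_p - y_d) * du/dtheta,
# with du/dtheta = yd for this toy controller.
y_next = f_plant(y, control(theta, yd))
analytic = (y_next - yd) * yd

# Central finite difference of E(theta) = 0.5 * (y_p - y_d)**2 from (33).
eps = 1e-6
def E(th):
    return 0.5 * (f_plant(y, control(th, yd)) - yd) ** 2
numeric = (E(theta + eps) - E(theta - eps)) / (2 * eps)

assert abs(analytic - numeric) < 1e-6  # (34)-(35) agree with finite differences
```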


TABLE VI
TEST CONTROL PERFORMANCES OF DIFFERENT SOFTWARE- AND FPGA-IMPLEMENTED NFSs IN EXAMPLE 3

NFSs                   Rule number        RMSE
T1-FNN (Software)      3–17 (Ave = 7.5)   0.0390
IT2NFC-OL (Software)   2–6 (Ave = 3.9)    0.0353
T1-FNN (Software)      4                  0.0347
T1-FNN (Software)      8                  0.0340
IT2NFC-C (Software)    4                  0.0924
IT2NFC-OL (Software)   4                  0.0270
IT2NFC-C (FPGA)        4                  0.0883
IT2NFC-OL (FPGA)       4                  0.0283

Fig. 15. Test control results (250 < t ≤ 1750) using the FPGA-implemented IT2NFC-C, the IT2NFC-OL, and the desired outputs in Example 3.

The desired trajectory is given by the following 250 pieces of data:
\[
y_d(t+1) = \begin{cases} 0.6\sin(2\pi t/45), & 1 \le t \le 110 \\ 0.2\sin(2\pi t/25) + 0.4\sin(\pi t/32), & 110 < t \le 250. \end{cases} \tag{38}
\]
The online training of the software-implemented IT2NFC-OL was performed on these 250 pieces of data for 1000 iterations. Because the desired outputs of the IT2NFC-OL are unavailable during training, the initial consequent parameters are randomly generated, which results in different online training data during the control period and, therefore, different numbers of online generated rules in different runs. When φ_th was set to 0.01, the number of generated rules ranged from 2 to 6, with an average of 3.9, over 10 different runs. To test the performance of the designed controller, another desired trajectory was given by
\[
y_d(t+1) = 0.2\sin(2\pi t/25) + 0.4\sin(\pi t/32), \quad t > 250. \tag{39}
\]
Table VI shows the average control RMSEs of the software-implemented IT2NFC-OL in the online test control period 250 < t ≤ 1750 over 10 runs. For comparison, Table VI also shows the test performance of the T1-FNN with online parameter learning over 10 different runs, for which the number of rules ranged from 3 to 17, with an average of 7.5. The result shows that the number of rules generated in the T1-FNN is much more sensitive to the initial consequent parameter values than in the IT2NFC-OL. Table VI also shows the test performances of the software-designed IT2NFC-C and IT2NFC-OL with four rules and the T1-FNN with four and eight rules. The average test error of the IT2NFC-OL is smaller than that of the T1-FNN, although the latter uses approximately twice the number of rules. The software-designed IT2NFC-C and IT2NFC-OL with four rules were implemented on an FPGA chip. To test the effectiveness of the FPGA-implemented IT2NFC-OL controller, the controlled plant was simulated on a PC.
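The Example 3 protocol can be mocked up end to end. The sketch below substitutes a hypothetical linear-in-parameter controller u = θᵀφ for the IT2NFC-OL, so that, by (35), ∂E/∂θ = (y_p − y_d)·φ; the features φ and the learning rate η = 0.05 are arbitrary illustrative choices. It reproduces the experimental setup (plant (36)-(37), trajectories (38)-(39), RMSE over 250 < t ≤ 1750), not the paper's numbers.

```python
import math

def plant(y, t, u):
    """Plant (36) with the time-varying parameter a(t) of (37)."""
    a = 2.5 if t <= 500 else 2.8
    return y * (y + a) / (1.0 + y * y) + u

def desired(t):
    """Desired trajectory (38) for t <= 250 and (39) for t > 250."""
    if t <= 110:
        return 0.6 * math.sin(2 * math.pi * t / 45)
    return 0.2 * math.sin(2 * math.pi * t / 25) + 0.4 * math.sin(math.pi * t / 32)

# Linear-in-parameter stand-in controller (NOT the IT2NFC-OL):
# u = theta . phi, so by (35) dE/dtheta = (y_p - y_d) * phi.
theta = [0.0, 0.0, 0.0]
eta = 0.05                 # arbitrary learning rate
y = -6.0                   # y_p(0) as in Example 3
errs = []
for t in range(1, 1751):
    yd_next = desired(t + 1)
    phi = [yd_next, y, 1.0]
    u = sum(w * x for w, x in zip(theta, phi))
    y_next = plant(y, t, u)
    e = y_next - yd_next
    theta = [w - eta * e * x for w, x in zip(theta, phi)]  # gradient step (34)-(35)
    errs.append(e * e)
    y = y_next

# RMSE over the test control period 250 < t <= 1750.
test_rmse = math.sqrt(sum(errs[250:]) / len(errs[250:]))
print(test_rmse)
```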

The FPGA chip communicates with the PC through software development kits and USB. Table IV shows the maximum execution speed and size of the FPGA-implemented IT2NFC-OL. The speed is slightly slower and the size slightly larger than in Example 1 because of the additional peripheral circuits for communication between the chip and the PC. Table VI shows the control RMSEs of the FPGA-implemented IT2NFC-C and IT2NFC-OL in the test period 250 < t ≤ 1750, where the RMSEs of both chips are close to those of their corresponding software implementations. The RMSE of the FPGA-implemented IT2NFC-OL is smaller than that of the IT2NFC-C, which verifies the on-chip learning ability of the FPGA controller. Fig. 15 shows the control results of the FPGA-implemented IT2NFC-OL and IT2NFC-C, where a significant increase in the control error is observed at t = 500 because of the change of the parameter a(t). In contrast to the FPGA-implemented IT2NFC-C, the control error of the FPGA-implemented IT2NFC-OL tends to decrease because of its on-chip learning ability.
Example 4 (Online Dynamic Plant Control): The dynamic plant to be controlled is described by the following equation [12]:
\[
y_p(t+1) = \frac{y_p(t)\,y_p(t-1)\bigl(y_p(t) + 2.5\bigr)}{1 + y_p^2(t) + y_p^2(t-1)} + u(t). \tag{40}
\]
As in [12], the desired output trajectory y_d(t + 1) is given by the following 200 pieces of data:
\[
y_d(t+1) = \begin{cases} 10, & 0 < t \le 50 \\ 15, & 50 < t \le 100 \\ 10, & 100 < t \le 150 \\ 15, & 150 < t \le 200. \end{cases} \tag{41}
\]
Online training was performed on these 200 pieces of data for 1000 iterations. As in Example 3, different training data and rule numbers were generated because of the random initial consequent parameters in the IT2NFC-OL in different runs. When φ_th was set to 0.18, the number of generated rules ranged from 4 to 9, with an average of 6.1, over 10 different runs. Table VII shows the average control performance of the software-implemented IT2NFC-OL.
For comparison, Table VII also shows the control RMSEs of the T1-FNN with online learning during control, for which the number of generated rules ranged from 4 to 19, with an average of 8.9. As in Example 3, the T1-FNN shows a higher variation in rule number and a larger average control RMSE and rule number than the IT2NFC-OL. Table VII also shows the test performances of the software-designed IT2NFC-C and IT2NFC-OL with four rules for the desired output trajectory. The two models were implemented


TABLE VII
TEST CONTROL PERFORMANCES OF DIFFERENT SOFTWARE- AND FPGA-IMPLEMENTED NFSs IN EXAMPLE 4

NFSs                   Rule number        RMSE
T1-FNN (Software)      4–19 (Ave = 8.9)   0.09050
IT2NFC-OL (Software)   4–9 (Ave = 6.1)    0.0307
IT2NFC-C (Software)    4                  0.00243
IT2NFC-OL (Software)   4                  0.00241
IT2NFC-C (FPGA)        4                  0.2154
IT2NFC-OL (FPGA)       4                  0.0467

TABLE VIII
CONTROL RESULTS USING DIFFERENT TYPE-1 AND TYPE-2 FUZZY CONTROLLERS IN EXAMPLE 4

Controllers            Rule number   RMSE
Type-1 FNS [12]        9             0.1476
Type-2 TSK FNS [12]    9             0.0725
IT2NFC-C (Software)    4             0.00243

Fig. 16. Control results using the (a) software- and (b) FPGA-implemented IT2NFC-C and IT2NFC-OL in Example 4.

on an FPGA chip. Table IV shows the maximum execution speed and size of the FPGA-implemented IT2NFC-OL. The chip is larger than that in Example 2, although a smaller number of inputs are fed to it, because of the additional inclusion of the peripheral circuits for communication between the chip and the PC, as in Example 3. Table VII shows the control RMSEs of the software- and FPGA-implemented IT2NFC-C and IT2NFC-OL with four rules. Fig. 16 shows the control results of the software- and FPGA-implemented IT2NFC-C and IT2NFC-OL. It is observed that, in contrast to the software implementation, a larger control error occurs at the desired regulation point y_d(t + 1) = 10 when using the FPGA-implemented IT2NFC-C controller. The error is reduced as a result of the on-chip learning ability when using the FPGA-implemented IT2NFC-OL. For comparison, Table VIII shows the best control results for the same dynamic plant and desired trajectory using different software-implemented type-1 and type-2 fuzzy controllers with the direct inverse control configuration reported in [12]. The fuzzy controllers are a type-1 Takagi–Sugeno–Kang (TSK) fuzzy neural system (FNS) and a type-2 TSK FNS. The results show that the proposed IT2NFC-C-based control approach achieves not only a smaller control error but also a smaller number of rules than the two FNS controllers.

VI. DISCUSSION

This section compares the implementation circuits of the IT2NFC-OL with those of various interval type-2 fuzzy chips,

including the pro-two [38], [40], the H-T2SONFS [20], and the type-2 fuzzy inference processor (T2FIC) [39]. Table IX shows the reported performances of the different chips, where the computation speed is represented in terms of millions of type-2 fuzzy inferences per second (MT2FIPS). For the IT2NFC-OL, the MT2FIPS includes the additional learning of the consequent parameters. The major characteristic of the IT2NFC-OL in contrast to these chips is its on-chip learning ability. In addition to this difference, other features of the IT2NFC-OL (and IT2NFC-C) are discussed as follows. In the antecedent part of the IT2NFC-OL, the entries in only one LUT have to be built for the computation of all upper and lower rule firing strengths, and the same LUT can be used in different FLSs without redesign. A different LUT-based fuzzification technique was proposed in the pro-two and H-T2SONFS. In these two studies, 2M LUTs are built to store the upper and lower membership values of M interval type-2 MFs. The values of the entries in these 2M LUTs are different and have to be reassigned for a new FLS, which burdens the design effort. The proposed approach simplifies the LUT design effort, especially when M is large. For the given inputs $w_l^i$, $w_r^i$, $\bar{f}^i$, and $\underline{f}^i$, Table X shows the numbers of additions, multiplications, and divisions used to compute the outputs $y_l$ and $y_r$ in the K–M iterative procedure, the pro-two, the H-T2SONFS, and the IT2NFC-OL (IT2NFC-C). For the K–M iterative procedure, the cost shown is that of performing the maximum number of M iterations. The pro-two uses the Wu–Mendel closed-form approach [23] to compute the output. The result shows that the numbers of additions, multiplications, and divisions in the IT2NFC-OL are much smaller than those of the K–M iterative procedure and the Wu–Mendel closed form, and the same as those of the H-T2SONFS.
In the H-T2SONFS, two additional LUTs are used to store the left and right crossover points for all possible input combinations. The analysis above does not consider learning of the consequent parameters. The hardware implementation costs are further reduced in the IT2NFC-OL when parameter learning is considered. The LUT technique used in H-T2SONFS is not feasible when parameter learning is considered. In the T2FIC, a different type reduction operation with a center-of-gravity operation is used and the result is approximated using eight embedded type-1 fuzzy sets.
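For contrast with the simplified operation, the K–M iterative procedure that Table X prices at 4M(M − 1) additions, 2M² multiplications, and 2M divisions can be sketched as follows. This is a generic textbook version, not the circuit in any of the compared chips, and the rule values in the example are made up:

```python
def km_endpoint(w, f_lower, f_upper, left=True, tol=1e-12):
    """Karnik-Mendel iterative procedure for one endpoint of the
    type-reduced interval (the costly step that the simplified
    type-reduction operation avoids)."""
    idx = sorted(range(len(w)), key=lambda i: w[i])  # sort by consequent value
    w = [w[i] for i in idx]
    fl = [f_lower[i] for i in idx]
    fu = [f_upper[i] for i in idx]
    # Initialize with the mid firing strengths.
    f = [(a + b) / 2.0 for a, b in zip(fl, fu)]
    y = sum(fi * wi for fi, wi in zip(f, w)) / sum(f)
    while True:
        # Left endpoint: upper firing for w[i] <= y, lower otherwise;
        # right endpoint: the reverse.
        if left:
            f = [fu[i] if w[i] <= y else fl[i] for i in range(len(w))]
        else:
            f = [fl[i] if w[i] <= y else fu[i] for i in range(len(w))]
        y_new = sum(fi * wi for fi, wi in zip(f, w)) / sum(f)
        if abs(y_new - y) <= tol:
            return y_new
        y = y_new

# Example with M = 4 rules (values made up for illustration).
w_left = [-1.0, 0.0, 0.5, 2.0]   # left consequent endpoints w_l^i
f_low = [0.2, 0.1, 0.3, 0.2]     # lower firing strengths
f_up = [0.6, 0.5, 0.7, 0.4]      # upper firing strengths
yl = km_endpoint(w_left, f_low, f_up, left=True)
yr = km_endpoint(w_left, f_low, f_up, left=False)
assert yl <= yr
```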


TABLE IX
COMPARISONS OF DIFFERENT INTERVAL TYPE-2 FUZZY CHIPS

Chips       On-chip learning   Interval type-2 MF   Input/output dim.   Rule number   t-norm    MT2FIPS         Gate counts
T2FIC       No                 Trapezoid            2/1                 64            Minimum   3.125           470 000
Pro-Two     No                 Triangle             2/1                 9             Minimum   33.3            NA
H-T2SONFS   No                 Gaussian             2/1                 4             Minimum   54              12 839
IT2NFC-C    No                 Gaussian             2/1                 4 / 10        Product   40.92 / 40.73   13 070 / 21 982
IT2NFC-OL   Yes                Gaussian             2/1                 4             Product   29.58           29 311

TABLE X
HARDWARE IMPLEMENTATION COSTS OF DIFFERENT APPROACHES FOR COMPUTING THE TWO BOUNDARY POINTS OF THE SYSTEM EXTENDED OUTPUT WITH GIVEN FIRING STRENGTH VALUES

Approaches        K–M iterative procedure   Pro-Two         H-T2SONFS   IT2NFC-C
Additions         4M(M − 1)                 10(M − 1) + 6   2(M − 1)    2(M − 1)
Multiplications   2M^2                      8M + 9          2M          2M
Divisions         2M                        4               0           0
LUT number        0                         0               2           0

Table IX shows that the computation speed of the T2FIC is much lower than those of the other chips. Table IX also shows that the computation speeds of the IT2NFC-C with 4 and 10 rules are very similar because of the parallelism in computation. The computation speed of the IT2NFC-C is higher than those of the pro-two and the T2FIC. The H-T2SONFS shows a higher computation speed than the IT2NFC-C because of its use of 2M LUTs for the fuzzification operation and the minimum t-norm for firing strength computation.

VII. CONCLUSION

This paper proposed the FPGA-implemented IT2NFC-OL for applications in changing environments that demand high computation speed. Parameter learning in general interval type-2 NFSs is computationally expensive. The introduction of a simplified type-reduction operation into an interval type-2 NFS reduces the hardware cost and makes the on-chip implementation of incremental parameter learning practical. On the basis of the structure and parameter learning, the simulation results showed that the software-implemented IT2NFC-OL achieves a performance competitive with that of its corresponding type-2 fuzzy model using a general type-reduction operation. The on-chip learning ability reduced the accuracy degradation between the software and hardware implementations and made it feasible to apply the FPGA-implemented IT2NFC-OL to online prediction and control problems. These characteristics were verified through the results of the different examples. In the future, the implementation circuits of the IT2NFC-OL may be incorporated into a system-on-chip (SoC), so that the interval type-2 NFS and the other parts of a system are implemented on the same chip and the SoC can be applied to areas requiring high speed or low power consumption.

R EFERENCES [1] W. L. Tung and C. Quek, “eFSM-A novel online neural-fuzzy semantic memory model,” IEEE Trans. Neural Netw., vol. 21, no. 1, pp. 136–157, Jan. 2010. [2] W. Y. Cheng and C. F. Juang, “An incremental support vector machinetrained TS-type fuzzy system for on-line classification problems,” Fuzzy Sets Syst., vol. 163, no. 1, pp. 24–44, Jan. 2011. [3] A. Penalver and F. Escolano, “Entropy-based incremental variational bayes learning of Gaussian mixtures,” IEEE Trans. Neural Netw. Learn. Syst., vol. 23, no. 3, pp. 534–540, Mar. 2012. [4] C. F. Juang and C. T. Lin, “An on-line self-constructing neural fuzzy inference network and its applications,” IEEE Trans. Fuzzy Syst., vol. 6, no. 1, pp. 12–32, Feb. 1998. [5] P. P. Angelov and D. P. Filev, “An approach to online identification of Takagi-Sugeno fuzzy models,” IEEE Trans. Syst., Man Cybern., B, Cybern., vol. 34, no. 1, pp. 484–498, Feb. 2004. [6] J. D. Rubio, “SOFMLS: Online self-organizing fuzzy modified least-squares network,” IEEE Trans. Fuzzy Syst., vol. 17, no. 6, pp. 1296–1309, Dec. 2009. [7] Y. Y. Lin, J. Y. Chang, and C. T. Lin, “Identification and prediction of dynamic systems using an interactively recurrent self-evolving fuzzy neural network,” IEEE Trans. Neural Netw. Learn. Syst., vol. 24, no. 2, pp. 310–321, Feb. 2013. [8] J. M. Mendel, Uncertain Rule-Based Fuzzy Logic System: Introduction and New Directions. Upper Saddle River, NJ, USA: Prentice-Hall, 2001. [9] C. F. Juang and Y. W. Tsao, “A self-evolving interval type-2 fuzzy neural network with on-line structure and parameter learning,” IEEE Trans. Fuzzy Syst., vol. 16, no. 6, pp. 1411–1424, Dec. 2008. [10] C. F. Juang and C. H. Hsu, “Reinforcement ant optimized fuzzy controller for mobile-robot wall-following control,” IEEE Trans. Ind. Electron., vol. 56, no. 10, pp. 3931–3940, Oct. 2009. [11] J. R. Castro, O. Castillo, P. Melin, and A. Rodriguez-Diaz, “A hybrid learning algorithm for a class of interval type-2 fuzzy neural networks,” Inf. 
Sci., vol. 179, no. 13, pp. 2175–2193, 2009. [12] R. H. Abiyev and O. Kaynak, “Type 2 fuzzy neural structure for identification and control of time-varying plants,” IEEE Trans. Ind. Electron., vol. 57, no. 12, pp. 4147–4159, Dec. 2010. [13] T. Dereli, A. Baykasoglu, K. Altun, A. Durmusoglu, and I. B. Turksen, “Industrial applications of type-2 fuzzy sets and systems: A concise review,” Comput. Ind., vol. 62, no. 2, pp. 125–137, 2011. [14] O. Castillo, R. M. Marroquin, P. Melin, F. Valdez, and J. Soria, “Comparative study of bio-inspired algorithms applied to the optimization of type-1 and type-2 fuzzy controllers for an autonomous mobile robot,” Inf. Sci., vol. 192, pp. 19–38, Jun. 2012. [15] O. Castillo and P. Melin, “A review on the design and optimization of interval type-2 fuzzy controllers,” Appl. Soft Comput., vol. 12, no. 4, pp. 1267–1278, 2012. [16] D. Hidalgo, P. Melin, and O. Castillo, “An optimization method for designing type-2 fuzzy inference systems based on the footprint of uncertainty using genetic algorithms,” Expert Syst. Appl., vol. 39, no. 4, pp. 4590–4598, 2012. [17] C. H. Hsu and C. F. Juang, “Evolutionary robot wall-following control using type-2 fuzzy controller with species-DE activated continuous ACO,” IEEE Trans. Fuzzy Syst., vol. 21, no. 1, pp. 100–112, Feb. 2013. [18] J. M. Mendel, “Computing derivatives in interval type-2 fuzzy logic system,” IEEE Trans. Fuzzy Syst., vol. 12, no. 1, pp. 84–98, Feb. 2004. [19] O. Uncu and I. B. Turksen, “Discrete interval type-2 fuzzy system models using uncertainty in learning parameters,” IEEE Trans. Fuzzy Syst., vol. 15, no. 1, pp. 90–106, Feb. 2007. [20] C. F. Juang and Y. W. Tsao, “A type-2 self-organizing neural fuzzy system and its FPGA implementation,” IEEE Trans. Syst., Man, Cybern., B, Cybern., vol. 38, no. 6, pp. 1537–1548, Dec. 2008.

228

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, VOL. 25, NO. 1, JANUARY 2014

[21] D. Wu and W. W. Tan, “Computationally efficient type-reduction strategies for a type-2 fuzzy logic controller,” in Proc. IEEE Int. Conf. Fuzzy Syst., Reno, NV, USA, May 2005, pp. 353–358. [22] S. Coupland and R. I. John, “Geometric type-1 and type-2 fuzzy logic systems,” IEEE Trans. Fuzzy Syst., vol. 15, no. 1, pp. 3–15, Feb. 2007. [23] H. Wu and J. M. Mendel, “Uncertainty bounds and their use in the design of interval type-2 fuzzy logic systems,” IEEE Trans. Fuzzy Syst., vol. 10, no. 5, pp. 622–639, Oct. 2002. [24] Q. Liang and J. M. Mendel, “Equalization of nonlinear time-varying channels using type-2 fuzzy adaptive filters,” IEEE Trans. Fuzzy Syst., vol. 8, no. 5, pp. 551–563, Oct. 2000. [25] M. H. Lim and Y. Takefuji, “Implementing fuzzy rule-based systems on silicon chips,” IEEE Expert, vol. 5, no. 1, pp. 31–45, Feb. 1990. [26] M. J. Patyra, J. L. Grantner, and K. Koster, “Digital fuzzy logic controller: Design and implementation,” IEEE Trans. Fuzzy Syst., vol. 4, no. 4, pp. 439–459, Nov. 1996. [27] G. Ascia, V. Catania, and M. Russo, “VLSI hardware architecture for complex fuzzy system,” IEEE Trans. Fuzzy Syst., vol. 7, no. 5, pp. 553–569, Oct. 1999. [28] M. Mckenna and B. M. Wilamowshi, “Implementing a fuzzy system on a field programmable gate array,” in Proc. Int. Joint Conf. Neural Netw., Jul. 2001, pp. 189–194. [29] V. Salapura, “A fuzzy RISC processor,” IEEE Trans. Fuzzy Syst., vol. 8, no. 6, pp. 781–790, Dec. 2000. [30] C. F. Juang and J. S. Chen, “Water bath temperature control by a recurrent fuzzy controller and its FPGA implementation,” IEEE Trans. Ind. Electron., vol. 53, no. 3, pp. 941–949, Jun. 2006. [31] K. Basterretxea, J. M. Tarela, and I. del Campo, “Digital Gaussian membership function circuit for neuro-fuzzy hardware,” Electron. Lett., vol. 42, no. 1, pp. 44–46, Jan. 2006. [32] Q. Cao, M. H. Lim, J. H. Li, Y. S. Ong, and W. L. Ng, “A context switchable fuzzy inference chip,” IEEE Trans. Fuzzy Syst., vol. 14, no. 4, pp. 552–567, Aug. 
2006. [33] S. Sanchez-Solano, A. J. Cabrera, I. Baturone, F. J. Moreno-Velo, and M. Brox, “FPGA implementation of embedded fuzzy controllers for robotic applications,” IEEE Trans. Ind. Electron., vol. 54, no. 4, pp. 1937–1945, Aug. 2007. [34] O. Montiel, J. Camacho, R. Sepúlveda, and O. Castillo, “Embedding a fuzzy locomotion pose controller for a wheeled mobile robot into an FPGA,” in Soft Computing for Intelligent Control and Mobile Robotics. New York, NY, USA: Springer-Verlag, 2011, pp. 465–481. [35] J. M. Jou, P. Y. Chen, and S. F. Yang, “An adaptive fuzzy logic controller: Its VLSI architecture and applications,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 8, no. 1, pp. 52–60, Feb. 2000. [36] C. F. Juang and C. H. Hsu, “Temperature control by chip-implemented adaptive recurrent fuzzy controller designed by evolutionary algorithm,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 52, no. 11, pp. 2376–2384, Nov. 2005. [37] I. del Campo, J. Echanobe, G. Bosque, and J. M. Tarela, “Efficient hardware/software implementation of an adaptive neuro-fuzzy system,” IEEE Trans. Fuzzy Syst., vol. 16, no. 3, pp. 761–778, Jun. 2008. [38] M. A. Melgarejo, R. A. Garcia, and C. A. Pena-Reyes, “Pro-two: A hardware based platform for real time type-2 fuzzy inference,” in Proc. IEEE Int. Conf. Fuzzy Syst., vol. 2. Jul. 2004, pp. 977–982. [39] S. H. Huang and Y. R. Chen, “VLSI implementation of type-2 fuzzy inference processor,” in Proc. IEEE Int. Symp. Circuits Syst., vol. 4. May 2005, pp. 3307–3310. [40] M. A. Melgarejo and C. A. Pena-Reyes, “Implementing interval type-2 fuzzy processors [Developmental Tools],” IEEE Comput. Intell. Mag., vol. 2, no. 1, pp. 63–71, Feb. 2007. [41] R. Sepúlveda, O. Montiel, O. Castillo, and P. Melin, “Embedding a high speed interval type-2 fuzzy controller for a real plant into an FPGA,” Appl. Soft Comput., vol. 12, no. 3, pp. 988–998, 2012. [42] R. Sepúlveda, O. Montiel, G. Lizárraga, and O. 
Castillo, “Modeling and simulation of the defuzzification stage of a type-2 fuzzy controller using the Xilinx system generator and simulink,” in Evolutionary Design of Intelligent Systems in Modeling, Simulation and Control. New York, NY, USA: Springer-Verlag, 2009, pp. 309–325. [43] I. Zliobaite, “Learning under concept drift: An overview,” Faculty of Mathematics and Informatics, Vilnius Univ., Vilnius, Lithuania, Tech. Rep. 2009, 2010, pp. 1–36. [44] R. Elwell and R. Polikar, “Incremental learning of concept drift in non stationary environments,” IEEE Trans. Neural Netw., vol. 22, no. 10, pp. 1517–1531, Oct. 2011. [45] W. Yu, F. Ortiz, and M. A. Moreno, “Hierarchical fuzzy CMAC for nonlinear systems modeling,” IEEE Trans. Fuzzy Syst., vol. 16, no. 5, pp. 1302–1314, Oct. 2008.

[46] X. Ren and X. Lv, “Identification of extended Hammerstein systems using dynamic self-optimizing neural networks,” IEEE Trans. Neural Netw., vol. 22, no. 8, pp. 1169–1179, Aug. 2011. [47] J. J. Rubio, P. Angelov, and J. Pacheco, “Uniformly stable backpropagation algorithm to train a feedforward neural network,” IEEE Trans. Neural Netw., vol. 22, no. 3, pp. 356–366, Mar. 2011. [48] W. Zhang, W. Wu, and M. Yao, “Boundedness and convergence of batch backpropagation algorithm with penalty for feedforward neural networks,” Neurocomputing, vol. 89, pp. 141–146, May 2012. [49] K. Basterretxea, J. M. Tarela, and I. del Campo, “Consequences of Gaussian function approximation in the performance of neuro-fuzzy systems,” in Proc. 4th Int. Conf. Recent Adv. Soft Comput., Dec. 2002, pp. 313–318. [50] C. F. Juang, T. C. Chen, and W. Y. Cheng, “Speedup of implementing fuzzy neural networks with high-dimensional inputs through parallel processing on graphic processing units,” IEEE Trans. Fuzzy Syst., vol. 19, no. 4, pp. 717–728, Aug. 2011. [51] D. Nauck and R. Kruse, “Neuro-fuzzy systems for function approximation,” Fuzzy Sets Syst., vol. 101, no. 2, pp. 261–271, 1999. [52] J. D. J. Rubio, “SOFMLS: Online self-organizing fuzzy modified least squares network,” IEEE Trans. Fuzzy Syst., vol. 17, no. 6, pp. 1296–1309, Dec. 2009. [53] C. F. Juang, C. M. Hsiao, and C. H. Hsu, “Hierarchical cluster-based multi-species particle swarm optimization for fuzzy system optimization,” IEEE Trans. Fuzzy Syst., vol. 18, no. 1, pp. 14–26, Feb. 2010. [54] S. M. Chen and Y. C. Chang, “Weighted fuzzy rule interpolation based on GA-based weight-learning techniques,” IEEE Trans. Fuzzy Syst., vol. 19, no. 4, pp. 729–744, Aug. 2011. [55] A. Miranian and M. Abdollahzade, “Developing a local least-squares support vector machines-based neuro-fuzzy model for nonlinear and chaotic time series prediction,” IEEE Trans. Neural Netw. Learn. Syst., vol. 24, no. 2, pp. 207–218, Feb. 2013. [56] K. S. Narendra and K.
Parthasarathy, “Identification and control of dynamical systems using neural networks,” IEEE Trans. Neural Netw., vol. 1, no. 1, pp. 4–27, Mar. 1990.

Chia-Feng Juang (M’99–SM’08) received the B.S. and Ph.D. degrees in control engineering from National Chiao-Tung University, Hsinchu, Taiwan, in 1993 and 1997, respectively. He has been with the Department of Electrical Engineering, National Chung-Hsing University, Taichung, Taiwan, since 2001, where he became a Full Professor in 2007 and has been a Distinguished Professor since 2009. He has authored or co-authored seven book chapters, more than 75 journal papers (including 42 IEEE journal papers), and more than 85 conference papers. His current research interests include computational intelligence (CI), field-programmable-gate-array chip design of CI techniques, intelligent control, computer vision, speech signal processing, and evolutionary robots. Dr. Juang is an Associate Editor of the IEEE T RANSACTIONS ON F UZZY S YSTEMS and the International Journal of Fuzzy Systems and an Editor of the Journal of Information Science and Engineering and the International Journal of Computational Intelligence in Control. He was Program Chair of the International Conference on Fuzzy Theory and Its Applications in 2012.

Chi-You Chen received the B.S. degree in electrical engineering from National Chung-Hsing University, Taichung, Taiwan, in 2009, where he is currently pursuing the M.S. degree. His current research interests include type-2 fuzzy systems, interpretable fuzzy systems, and neural fuzzy chips.
