
Fuzzy Neural Trees

LES M. SZTANDERA

Computer Science Department, Philadelphia College of Textiles and Science, Philadelphia, PA 19144

ABSTRACT

The paper introduces fuzzy neural trees and an approach for converting these trees into feedforward neural network architectures. The proposed approach is unique in that it shows how either technology can be used as a "tool" within the framework of a model based on the other. It is of particular significance in that it results in new neural network algorithms in which no time-consuming iterative training is required.

1. INTRODUCTION

There has been, in the last several years, a large and energetic upswing in neuroengineering research efforts aimed at developing new dynamically generated neural network architectures. Part of those efforts has been based on attempts to synthesize fuzzy logic with computational neural networks. The enormous success of commercial applications that are at least partially dependent on fuzzy technologies, mainly by Japanese companies, has led to a surge of curiosity about the utility of fuzzy logic for scientific, engineering, and neuroengineering applications. The marriage of fuzzy logic with computational neural networks has a sound technical basis, because the two approaches generally attack the design of "intelligent" systems from quite different angles. Neural networks are essentially low-level computational algorithms that offer good performance in dealing with sensor data used in pattern recognition and control. Fuzzy logic was introduced in 1965 as a means for representing, manipulating, and utilizing data and information that possess nonstatistical uncertainty. Thus, fuzzy methods often deal with issues such as reasoning on a higher, that is, semantic or linguistic, level than neural networks. Consequently, the two technologies often complement each other: neural networks supply the brute force necessary to accommodate and interpret large amounts of sensor data, while fuzzy logic provides a structural framework that utilizes and exploits these low-level results. Each technology can therefore be used as a "tool" within the framework of a model based on the other. Since the neural network is well known for its ability to represent functions, and the basis of every fuzzy model is the membership function, a natural application of neural networks in fuzzy models has emerged: providing good approximations to the membership functions that are essential to the success of the fuzzy approach. This paper introduces fuzzy neural trees and thus concerns itself with the integration of fuzzy logic and computational neural networks as a means of searching for novel, dynamically generated neural network architectures. To this end, an algorithm is designed and implemented for the creation and manipulation of fuzzy membership functions that have previously been learned by a neural network from the data. In the opposite direction, we are able to use the fuzzy tree architecture to construct neural networks and take advantage of the learning capability of neural networks to manipulate those membership functions for classification and image recognition tasks. The algorithm shows a way in which neural network technology can be used as a "tool" within the framework of fuzzy set theory. Generating membership functions with the aid of a neural network has proven to be an extremely powerful and promising technique. Fuzzy neural trees show a way towards combining the two approaches. In this paper, they are used for self-generation of feedforward network architectures suited to a particular problem.

2. FUZZY NEURAL TREES

In this section, we introduce the concept of a fuzzy neural tree and compare it with existing fuzzy tree approaches. A fuzzy neural tree is introduced here for the purpose of ranking fuzzy subsets at the hidden layer nodes of a neural network. Ranking among fuzzy alternatives was investigated by, among others, Sztandera [1], Yager [2], Murakami et al. [3], and Sztandera and Keller [4]. The proposed fuzzy neural tree has fuzzy subsets defined at each of its nodes. Since the decision tree is generated by a neural network algorithm, and since it uses fuzzy sets instead of a single grade of membership, we call it a fuzzy neural tree. Connections between the nodes carry a "cost" equal to the weights of the neural network. The directional vector of a hyperplane that divides decision regions is taken as the weight vector of a node. A sample fuzzy neural tree is depicted in Figure 1.


Fig. 1. A sample fuzzy neural tree with fuzzy sets at its nodes.
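As an illustration only (the class name and fields below are our own, not from the paper), such a tree can be represented by nodes that each store the weight vector of their separating hyperplane, the fuzzy subsets defined at the node, and links to child nodes:

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class FuzzyNeuralTreeNode:
    """One node of a fuzzy neural tree (illustrative sketch).

    weights    -- directional vector of the separating hyperplane; it doubles
                  as the weight vector of the corresponding network node.
    fuzzy_sets -- fuzzy subsets defined at this node, stored as
                  {set_name: {support_point: membership_grade}}.
    children   -- child nodes; the connection "costs" are the child weights.
    """
    weights: List[float]
    fuzzy_sets: Dict[str, Dict[float, float]] = field(default_factory=dict)
    children: List["FuzzyNeuralTreeNode"] = field(default_factory=list)

    def add_child(self, child: "FuzzyNeuralTreeNode") -> None:
        self.children.append(child)

# A two-node tree: the root separates the data with one hyperplane and holds
# two fuzzy subsets A and B defined at two support points.
root = FuzzyNeuralTreeNode(weights=[0.8, -0.3, 0.5],
                           fuzzy_sets={"A": {0.4: 1.0, 1.2: 0.3},
                                       "B": {0.4: 0.3, 1.2: 1.0}})
root.add_child(FuzzyNeuralTreeNode(weights=[0.1, 0.9, -0.2]))
```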

In the literature on fuzzy systems, two general types of fuzzy trees have been introduced. We describe them here after Koczy [5]. Group one defines properties of the classical concept of the fuzzy tree framework, while group two considers the fuzzy tree as a representation of an imprecise relation between items and uses the property of connectedness. A fuzzy edge tree is based on the following concept. Given a complete tree $G$, a fuzzy tree $T$ of $G$ is a tree in which a fuzzy membership degree is attached to each edge. The degree $\mu_e \in [0,1]$ expresses the degree to which an edge belongs to tree $T$. So, if $\mu_e = 0$, the edge does not belong to $T$; and if $\mu_e = 1$, the edge belongs fully to $T$. Figure 2 shows an example. This model is representative of group two. A fuzzy vertex tree, proposed by Rosenfeld [6], also represents group two, but is based on another concept. Given an arbitrary (crisp) tree $G$, a fuzzy tree $T$ of $G$ is a tree in which a fuzzy membership degree is assigned to each node. The degree $\mu_v \in [0,1]$, in this case, represents the degree obtained by an aggregation operation performed on the membership degrees of the connected nodes. An example is shown in Figure 3.


Fig. 2. A fuzzy edge tree.

A representative of group one, the so-called fuzzy vertex and edge tree, allows the attachment of fuzzy membership degrees to both the nodes and the edges. In this model, the resulting degree representing the membership of an edge is always a function of its own degree and the two degrees attached to the nodes connected by that edge; it is obtained by performing an aggregation operation on all three degrees. Figure 4 illustrates the concept.
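To make the three models concrete, the following sketch (our own illustration; the choice of min as the aggregation operation is an assumption, since the aggregation operation itself is not fixed here) computes the resulting edge degree under each model:

```python
def edge_degree_edge_tree(mu_edge: float) -> float:
    # Fuzzy edge tree: the degree attached to the edge is used directly.
    return mu_edge

def edge_degree_vertex_tree(mu_u: float, mu_v: float) -> float:
    # Fuzzy vertex tree: aggregate the degrees of the two connected nodes.
    return min(mu_u, mu_v)

def edge_degree_vertex_and_edge_tree(mu_edge: float, mu_u: float, mu_v: float) -> float:
    # Fuzzy vertex and edge tree: aggregate the edge's own degree with the
    # degrees attached to the two nodes it connects.
    return min(mu_edge, mu_u, mu_v)

# Example with the three degrees {0.8, 0.7, 0.6} shown in Figure 4:
print(edge_degree_vertex_and_edge_tree(0.8, 0.7, 0.6))  # -> 0.6 under min aggregation
```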

3. FEEDFORWARD NEURAL NETWORK ARCHITECTURES

The most important questions in determining a feedforward neural network architecture are how to calculate the number of nodes in the hidden layers and how to calculate the number of hidden layers. Lippmann [7] argued that a network with two hidden layers could solve an arbitrary classification problem. Irie and Miyake [8] showed that a back propagation network with one hidden layer containing an infinite number of nodes could also solve arbitrary mapping problems.

Fig. 3. A fuzzy vertex tree.


Fig. 4. A fuzzy vertex and edge tree.

However, these results have little practical value. Lippmann [7] also argued that the nodes in a hidden layer correspond to separate decision regions into which training examples are mapped. Kung and Hwang [9] used an algebraic projection approach to specify how each node should be created. In those approaches, however, we have to know the properties of the training data, such as decision regions or pattern properties; thus their applicability is limited. The determination of feedforward neural network architectures has been an important area of neuroengineering research. Practical approaches for dynamic neural network architecture generation have been sought by Sirat and Nadal [10] and Bichsel and Seitz [11]. In those architectures, at the end of the training process, all training examples are recognized and a neural network architecture has been generated. In particular, the "tiling" algorithm of Sirat and Nadal [10] generates a feedforward network architecture by adding nodes and layers in a sequential manner. However, the algorithm does not specify the exact sequence in which nodes should be added to achieve the optimal classification of training examples. A similar algorithm of Bichsel and Seitz [11] uses information entropy to determine the generation of nodes and hidden layers. Another algorithm of special interest, the ID3 algorithm of Quinlan [12], dynamically generates a decision tree using information entropy functions. Studies by Dietterich et al. [13] and Fisher and McKusick [14] revealed strong evidence that information entropy can be used as a criterion for determining the number of hidden layers in feedforward neural network architectures. Other authors, Cios and Liu [15] and Fahlman and Lebiere [16], have also addressed the problem of dynamic generation of neural network architectures. The results presented here build on outcomes obtained so far by Sirat and Nadal [10], Bichsel and Seitz [11], Sztandera and Cios [17-19], and Sztandera [1]. The proposed fuzzy neural trees generate nodes and hidden layers until a learning task is accomplished. The algorithms operate on continuous data and equate a decision tree with a hidden layer of a neural network. The learning strategy used in this approach is based on minimization of an entropy function. This minimization translates into adding new nodes to the network until the entropy is reduced to zero. When the entropy is zero, all training examples are regarded as correctly recognized.
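As a rough illustration of this strategy (a simplified sketch using ordinary Shannon entropy rather than the fuzzy entropy of [17]; the function names are our own), the weighted entropy of a layer's decision regions can be computed and used as the stopping criterion for adding nodes:

```python
import math

def binary_entropy(n_pos: int, n_neg: int) -> float:
    """Shannon entropy (bits) of a group with n_pos '+' and n_neg '-' examples."""
    n = n_pos + n_neg
    if n == 0 or n_pos == 0 or n_neg == 0:
        return 0.0
    p = n_pos / n
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def layer_entropy(groups) -> float:
    """Weighted entropy of a layer: `groups` is a list of (n_pos, n_neg)
    pairs, one per decision region produced by the layer's nodes."""
    total = sum(p + n for p, n in groups)
    return sum((p + n) / total * binary_entropy(p, n) for p, n in groups)

# A layer whose regions are already pure has entropy 0 (stop growing);
# a layer with a mixed region has entropy > 0 (add another node).
print(layer_entropy([(50, 0), (0, 70)]))   # 0.0
print(layer_entropy([(50, 10), (0, 60)]))  # > 0
```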


4. THE ALGORITHM FOR CONVERTING FUZZY NEURAL TREES INTO FEEDFORWARD NEURAL NETWORK ARCHITECTURES

Let us recall the basic notation used in our preliminary studies [17]. There are $N$ training examples: $N^{+}$ examples belonging to class "+" and $N^{-}$ examples belonging to class "-". A hyperplane divides the examples into two groups, those lying on its positive (1) side and those on its negative (0) side. Thus, there are four possible outcomes:

$$
\begin{aligned}
N_{1}^{+} &= \text{number of examples from class "+" on side 1},\\
N_{0}^{+} &= \text{number of examples from class "+" on side 0},\\
N_{1}^{-} &= \text{number of examples from class "-" on side 1},\\
N_{0}^{-} &= \text{number of examples from class "-" on side 0}.
\end{aligned}
\tag{1}
$$

Let us assume that, at a certain level of the decision tree, $N_r$ examples are divided by node $r$ into $N_r^{+}$ examples belonging to class "+" and $N_r^{-}$ examples belonging to class "-". The values $N_{1r}^{+}$ and $N_{1r}^{-}$ can be calculated as follows:

$$N_{1r}^{+} = \sum_{i=1}^{N_r} D_i\,\mathrm{out}_i, \tag{2}$$

$$N_{1r}^{-} = \sum_{i=1}^{N_r} (1 - D_i)\,\mathrm{out}_i, \tag{3}$$

where $D_i$ stands for the desired output and $\mathrm{out}_i$ is a sigmoid function. Thus, we have

$$N_{1r}^{+} + N_{1r}^{-} = \mathrm{out}_1 + \cdots + \mathrm{out}_{N_r} = \sum_i \mathrm{out}_i = \sum_i \left[ 1 + \exp\left( -\sum_j w_{ij} x_j \right) \right]^{-1}. \tag{4}$$
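The counts above are "soft" because they are accumulated from sigmoid outputs rather than from hard decisions. A minimal sketch of (2)-(4), assuming NumPy and our own variable names, is:

```python
import numpy as np

def soft_counts(X: np.ndarray, D: np.ndarray, w: np.ndarray):
    """Soft counts N1r+ and N1r- for one node, following Eqs. (2)-(4).

    X -- (Nr, d) array of training examples reaching node r
    D -- (Nr,)  array of desired outputs (1 for class '+', 0 for class '-')
    w -- (d,)   weight vector (hyperplane direction) of node r
    """
    out = 1.0 / (1.0 + np.exp(-X @ w))         # sigmoid outputs, Eq. (4)
    n1_plus = float(np.sum(D * out))           # Eq. (2)
    n1_minus = float(np.sum((1.0 - D) * out))  # Eq. (3)
    return n1_plus, n1_minus

# Example: four two-dimensional examples, half from each class.
X = np.array([[1.0, 2.0], [2.0, 1.0], [-1.0, -2.0], [-2.0, -1.0]])
D = np.array([1.0, 1.0, 0.0, 0.0])
print(soft_counts(X, D, np.array([0.5, 0.5])))
```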

The change in the number of examples on the positive and negative sides of a hyperplane, with respect to the weights [17], is given by

$$\Delta N_{1r}^{+} = \sum_{i=1}^{N_r} D_i\,\mathrm{out}_i (1 - \mathrm{out}_i) \sum_j x_j\,\Delta w_{ij}, \tag{5}$$

$$\Delta N_{1r}^{-} = \sum_{i=1}^{N_r} (1 - D_i)\,\mathrm{out}_i (1 - \mathrm{out}_i) \sum_j x_j\,\Delta w_{ij}. \tag{6}$$

The learning rule used to minimize the fuzzy entropy $f(F)$ [17] is

$$\Delta w_{ij} = -\rho\,\frac{\partial f(F)}{\partial w_{ij}}, \tag{7}$$

where $\rho$ is a learning rate and $f(F)$ is a fuzzy entropy function. The grades of membership for the fuzzy sets $F$ and $F^{c}$ to be used in the calculation of $f(F)$ are defined as follows:

$$F = \left\{ \frac{N_{0r}^{+}}{N_{0r}},\ \frac{N_{0r}^{-}}{N_{0r}},\ \frac{N_{1r}^{+}}{N_{1r}},\ \frac{N_{1r}^{-}}{N_{1r}} \right\}, \tag{8}$$

$$F^{c} = 1 - F. \tag{9}$$

If we use the mutual dependence of positive and negative examples on both sides of a hyperplane, then, taking into account that $N_{1r} = N_{1r}^{+} + N_{1r}^{-}$ and $N_{0r} = N_{0r}^{+} + N_{0r}^{-}$, the resulting fuzzy set $F$ and its complement $F^{c}$ are defined as

$$F = \left\{ \frac{N_{0r}^{+}}{N_r - N_{1r}^{+} - N_{1r}^{-}},\ \frac{N_{0r}^{-}}{N_r - N_{1r}^{+} - N_{1r}^{-}},\ \frac{N_{1r}^{+}}{N_r - N_{0r}^{+} - N_{0r}^{-}},\ \frac{N_{1r}^{-}}{N_r - N_{0r}^{+} - N_{0r}^{-}} \right\}, \tag{10}$$

$$F^{c} = 1 - F. \tag{11}$$

The four grades of membership [(10) and (11)] will be used in Dombi's operations [17] and in the calculation of the fuzzy entropy [17]. The obtained fuzzy entropy will then be used to update the weights according to learning rule (7).
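A sketch of the membership grades (8)-(9) and a placeholder fuzzy entropy is given below. The actual measure used in the paper combines the grades through Dombi's operations and the entropy of [17], which are not reproduced here, so the De Luca-Termini-style form below is only a stand-in:

```python
import numpy as np

def membership_grades(n0p, n0m, n1p, n1m):
    """Grades of the fuzzy set F as in Eq. (8); F^c = 1 - F as in Eq. (9)."""
    n0, n1 = n0p + n0m, n1p + n1m
    F = np.array([n0p / n0, n0m / n0, n1p / n1, n1m / n1])
    return F, 1.0 - F

def fuzzy_entropy(F):
    """Placeholder fuzzy entropy over the grades of F (stand-in only;
    the paper's measure uses Dombi's operations and the entropy of [17])."""
    eps = 1e-12
    mu = np.clip(F, eps, 1.0 - eps)
    return float(-np.sum(mu * np.log(mu) + (1 - mu) * np.log(1 - mu)))

F, Fc = membership_grades(n0p=40.0, n0m=10.0, n1p=5.0, n1m=55.0)
print(fuzzy_entropy(F))   # value to be driven down by learning rule (7)
```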


In order to increase the chance of finding the global minimum, the learning rule is also combined with Cauchy training [20] in the same manner as suggested in [17]:

$$w_{k+1} = w_k + (1 - \zeta)\,\Delta w + \zeta\,\Delta w_{\mathrm{rand}}, \tag{12}$$

where $\zeta$ is a control parameter. The grades of membership for the fuzzy subsets $A$ and $B$ are initially defined at only two points, $m_1$ and $m_2$, from which the two fuzzy subsets are constructed (Figure 5). The grades of membership for the fuzzy sets $A$ and $B$ at points $m_1$ and $m_2$ are defined as follows:

$$\mu_A(m_1) = \frac{N_{0r}^{+}}{N_{0r}}, \qquad \mu_A(m_2) = \frac{N_{0r}^{-}}{N_{0r}}, \qquad \mu_B(m_1) = \frac{N_{1r}^{-}}{N_{1r}}, \qquad \mu_B(m_2) = \frac{N_{1r}^{+}}{N_{1r}}. \tag{13}$$
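A sketch of the combined update (12) is shown below, with the random component drawn from a Cauchy distribution as in fast simulated annealing [20]; the temperature parameter and the numerical values are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def combined_update(w, delta_w, zeta=0.25, temperature=0.1, rng=None):
    """Weight update of Eq. (12): a mix of the entropy-driven step delta_w
    and a Cauchy-distributed random step (fast simulated annealing, [20])."""
    rng = np.random.default_rng() if rng is None else rng
    delta_w_rand = temperature * rng.standard_cauchy(size=w.shape)
    return w + (1.0 - zeta) * delta_w + zeta * delta_w_rand

w = np.zeros(3)
w = combined_update(w, delta_w=np.array([0.1, -0.05, 0.2]), zeta=0.3)
print(w)
```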

Taking into account that $N_{1r} = N_{1r}^{+} + N_{1r}^{-}$ and $N_{0r} = N_{0r}^{+} + N_{0r}^{-}$ results in the following expressions:

$$\mu_A(m_1) = \frac{N_{0r}^{+}}{N_{0r}^{+} + N_{0r}^{-}} = \frac{N_{0r}^{+}}{N_r - N_{1r}^{+} - N_{1r}^{-}}, \qquad \mu_A(m_2) = \frac{N_{0r}^{-}}{N_{0r}^{+} + N_{0r}^{-}},$$

$$\mu_B(m_1) = \frac{N_{1r}^{-}}{N_r - N_{0r}^{+} - N_{0r}^{-}} = \frac{N_{1r}^{-}}{N_{1r}^{+} + N_{1r}^{-}}, \qquad \mu_B(m_2) = \frac{N_{1r}^{+}}{N_{1r}^{+} + N_{1r}^{-}}. \tag{14}$$

Fig. 5. Fuzzy subsets generated by the algorithm according to formula (15).

Now, we define the membership grades for the fuzzy sets $A$ and $B$ from the following functions:

$$\mu_A(x) = \begin{cases} \dfrac{x\,\mu_A(m_1)}{m_1}, & 0 \le x \le m_1,\\[2mm] \dfrac{\mu_A(m_2)(x - m_1) + \mu_A(m_1)(m_2 - x)}{m_2 - m_1}, & m_1 < x \le m_2, \end{cases} \tag{15}$$

with the grades for $\mu_B(x)$ defined analogously.
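A sketch of the membership construction (13)-(15) for the set $A$ ($B$ is analogous) is shown below; the behavior outside the interval covered by the branches of (15) is clamped here purely as a placeholder assumption:

```python
def mu_A(x, m1, m2, mu_m1, mu_m2):
    """Piecewise-linear membership function for fuzzy set A, after Eq. (15).

    mu_m1, mu_m2 are the anchor grades mu_A(m1), mu_A(m2) from Eqs. (13)-(14).
    Outside the reconstructed range the value is clamped (placeholder only).
    """
    if x <= 0.0:
        return 0.0
    if x <= m1:
        return x * mu_m1 / m1                               # rising branch
    if x <= m2:
        return (mu_m2 * (x - m1) + mu_m1 * (m2 - x)) / (m2 - m1)  # interpolation
    return mu_m2                                             # placeholder

# Anchor grades from Eq. (13): e.g., N0r+ = 8 and N0r- = 2, so N0r = 10.
print(mu_A(0.5, m1=1.0, m2=2.0, mu_m1=0.8, mu_m2=0.2))  # on the rising branch
print(mu_A(1.5, m1=1.0, m2=2.0, mu_m1=0.8, mu_m2=0.2))  # on the interpolating branch
```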

When the number of classes $C$ exceeds two, the concepts introduced before will be used; however, the final nodes of the fuzzy neural tree will now be associated with two of the $C$ classes (instead of one of two classes). We propose a new strategy based on an adequate dichotomization of the classes. We have already launched research aimed at comparing existing algorithms with our proposed scheme.

The first of the existing algorithms is called the "K-d tree" [21]. A competition between the $K$ sets of patterns is introduced there, and each pattern is a vector in a $K$-dimensional space. The quality of a separation is measured by a criterion which is a generalization of the Shannon entropy to the case of more than two classes. For each index $i$ ($i = 1, \ldots, K$), the algorithm looks for the best separator (hyperplane) orthogonal to axis $i$. This gives $K$ dichotomies, and the dichotomy with the highest score is selected. It is desirable, however, to find separators that are not parallel to any axis. This is exactly what can be achieved by invoking our algorithm.

In the second algorithm [22], the architecture is a feedforward network in which the number of neurons is fixed a priori to $\log_2(p)$, where $p$ is the number of patterns. There, a decision tree was used to determine the outputs associated with each pattern. For each dichotomy, the division (left/right) was done via Cauchy training. This scheme could be adapted to any multiclass problem by taking the number of neurons to be $\log_2(C)$. However, if a category contains highly inhomogeneous patterns belonging to the same class, such as lower- and uppercase letters, then the existence of such subclasses must be known a priori, and the number of neurons would be $\log_2(2C)$. Obviously, except for simple cases, the subclasses are not usually known. The advantage of our approach, described in the next subsection as a class separation method, is that the number of neurons is not fixed in advance but is dynamically generated by the algorithm.

In the third algorithm, called the "class competitive method" [10], there is a competition between the $C$ different possible binary classifications. This is done by running $C$ algorithms, each trying to separate the patterns of class $c$ ($c = 1, \ldots, C$) from all the other patterns by comparing this class with each of the remaining $C - 1$ classes. In this algorithm, the entropy measure is associated with the class under consideration.


After convergence, the couplings given by the algorithm with the best score are taken. However, if the number of classes is large ($C > 10$), this approach might lead to prohibitive computation times.

The fourth algorithm, a dichotomy-of-classes method, is based on a principal axis projection [10]. The classes are ordered on a one-dimensional lattice by projecting their gravity centers onto the first principal axis of the training pattern distribution for all classes. This divides the classes into two sets, negative or positive projection, with the global center of mass projected onto zero. This division guarantees that the separator found by the tree-growing algorithm is balanced, that is, the numbers of patterns in the two half-spaces are approximately equal. That strategy is useful only if the projections of the gravity centers of the classes under consideration onto the first principal axis do not coincide. Even then, the classification error appears to be rather high: 20-28.5% [10].
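A rough sketch of that projection step (our own NumPy illustration, not the code of [10]) is:

```python
import numpy as np

def dichotomize_by_principal_axis(X, labels):
    """Split classes into two groups by projecting class centroids onto the
    first principal axis of the training set (after the method of [10])."""
    Xc = X - X.mean(axis=0)                       # center on the global mean
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    axis = vt[0]                                   # first principal direction
    groups = {"negative": [], "positive": []}
    for c in np.unique(labels):
        centroid = X[labels == c].mean(axis=0) - X.mean(axis=0)
        side = "positive" if centroid @ axis >= 0 else "negative"
        groups[side].append(c)
    return groups

X = np.array([[0.0, 0.0], [0.2, 0.1], [3.0, 3.1], [2.9, 3.0]])
labels = np.array([0, 0, 1, 1])
print(dichotomize_by_principal_axis(X, labels))
```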

4.1.1. A Class Separation Method

Our approach, which circumvents the above shortcomings, is to design an algorithm which runs $C - 1$ ranking subroutines, each trying to achieve the ranking indices specified by Definition 1, thus separating the patterns of class $c$ from all other patterns, and then repeating the procedure until $c$ equals $C - 1$. This method is utilized in our hybrid algorithm. It is somewhat similar to the third algorithm, but here the fuzzy sets are not formed for each class. Instead, they are formed and ranked for class $c$ (the first fuzzy set) and for all the remaining $C - 1$ classes combined together (the second fuzzy set). The outline of the algorithm follows.

The algorithm consists of five steps. Step 1 divides the input space into several subspaces; Step 2 counts the number of samples in those subspaces; Step 3 generates membership functions for fuzzy subsets from those numbers; Step 4 executes ranking of the formed fuzzy subsets; Step 5 determines the separation of categories based on faithful ranking.

Step 1--Divide the Input Space into Several Subspaces

Make use of the learning rule

$$\Delta w_{ij} = -\rho\,\frac{\partial f(F)}{\partial w_{ij}}$$

and search for a hyperplane that minimizes the entropy function

$$\min f(F) = \sum_{r=1}^{R} \frac{N_r}{N}\,\mathrm{entropy}(L, r),$$

where $L$ is a level of the decision tree, $R$ is the total number of nodes in a layer, $r$ is the node index, and $f(F)$ is the entropy or fuzzy entropy function.

Step 2--Count the Number of Samples in the Resulting Subspaces

The first class consists of patterns belonging to class $c$, and the other class consists of all other patterns. Start with a random initial weight vector $W_0$.

Step 3--Generate Membership Functions for Fuzzy Subsets

Generate membership functions for fuzzy subsets while creating nodes in hidden layers.

Step 4--Execute Ranking of the Formed Fuzzy Subsets

The ranking is executed according to the Yager $F_1$ index [2] with $g(u) = u$, or the Murakami et al. [3] centroidal method for $x_0$:

$$x_0 = \frac{\int u\,\mu_A(u)\,du}{\int \mu_A(u)\,du}.$$
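A discrete sketch of the centroidal ranking value is shown below; with $g(u) = u$ the Yager $F_1$ index coincides, up to discretization, with the same centroidal expression over sampled support points:

```python
def centroid_index(support, grades):
    """Discrete version of the ranking value x0 used in Step 4:
    x0 = sum(u * mu(u)) / sum(mu(u)) over the sampled support points."""
    num = sum(u * m for u, m in zip(support, grades))
    den = sum(grades)
    return num / den

# Rank two fuzzy subsets: the one with the larger centroid ranks higher.
A = centroid_index([0.0, 0.5, 1.0], [0.2, 1.0, 0.4])
B = centroid_index([0.0, 0.5, 1.0], [0.6, 1.0, 0.1])
print(A > B)
```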

Step 5--Determine Separation of Categories Based on Faithful Ranking

The ranking of fuzzy subsets is faithful, that is, the data samples are fully separated, if the specified values [1] for the ranking indices are attained. If this is the case, then increment $c$ by 1 and return to Step 1; continue until $c = C - 1$. If the ranking is unfaithful, add a new node to the current layer and go to Step 1.
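A schematic outline of Steps 1-5 as one loop is given below (illustrative only; train_hyperplane, build_fuzzy_subsets, and ranking_is_faithful are hypothetical stand-ins for the procedures described above, and labels is assumed to be a NumPy array):

```python
def class_separation(X, labels, classes, train_hyperplane,
                     build_fuzzy_subsets, ranking_is_faithful, max_nodes=100):
    """Schematic outline of Steps 1-5: separate class c from all other
    patterns, for c = 1, ..., C-1, adding hidden nodes until ranking is faithful.

    The three callables are placeholders for the paper's procedures."""
    layers = []
    for c in classes[:-1]:                        # repeat until c = C - 1
        targets = (labels == c).astype(float)     # class c vs. all others
        nodes = []
        while len(nodes) < max_nodes:
            node = train_hyperplane(X, targets)            # Step 1 (entropy minimization)
            nodes.append(node)
            A, B = build_fuzzy_subsets(X, targets, nodes)   # Steps 2-3
            if ranking_is_faithful(A, B):                   # Steps 4-5
                break                                       # move on to the next class
        layers.append(nodes)
    return layers
```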


Interested readers are referred to [1] for the derivation and a more detailed presentation of the algorithm. In order to use the algorithm to approximate any real continuous function on a compact set, all the points used for learning the function are considered as class one, and the background (everything else) is considered as class two. The algorithm then produces outputs which are fuzzy sets themselves. The obtained fuzzy subsets will always be normal. The values corresponding to grades of membership equal to one can be taken as the predicted values, or any defuzzification method can be used.

5. EVALUATION OF THE ALGORITHM

As the proof of any new technology ultimately lies in its utility for solving complicated tasks, the algorithm was applied to highly nonlinearly separable problems. To check its performance, we applied it to the twoclass data set, the Ishibuchi data set, and a simple function approximation problem.

Twoclass Data

The twoclass data set is an artificially generated, normally distributed set of vectors. This data set was included because classification results from numerous fuzzy approaches were available for comparison, and the set has been used extensively in the fuzzy literature [23-26]. The data consist of 242 four-dimensional vectors, with 121 vectors belonging to each class. The proposed fuzzy neural algorithm was able to learn all data samples perfectly. The resulting neural network architecture is shown in Figure 6. It is worth noting that if we force the fuzzy entropy algorithm to add hidden layers until the task becomes linearly separable, the resulting neural network architecture consists of 23 hidden layers. The results obtained using the fuzzy C-means algorithm [23] and the fuzzy K-nearest neighbor algorithm [24] are summarized in Tables 1 and 2 for comparison.

Ishibuchi Data

The data set was proposed by Ishibuchi et al. [27] to investigate two back propagation algorithms based on possibility and necessity pattern classification, and is shown in Figure 7.


Fig. 6. A neural network architecture for the twoclass data problem.

TABLE 1
Confusion Matrix for the Twoclass Data Using Fuzzy C-Means Algorithm

Class number      1      2
1               114      7
2                15    106

TABLE 2
Misclassified Samples for the Twoclass Data Using Fuzzy K-Nearest Neighbor Algorithm

Class number      1      2
1               113      8
2                12    109

Ishibuchi et al. pointed out that, theoretically, there exists a neural network that can classify all the patterns correctly; however, obtaining such a network might be time consuming. Eventually, they did not go in that direction, but instead adopted possibility and necessity measures. Using both of their algorithms, they had about 20 misclassified patterns in each run. We decided to apply our algorithm to the above data set. It learned fuzzy membership functions from the data and executed ranking of the learned fuzzy subsets. In doing so, it added nodes in hidden layers until the ranking turned out to be faithful, that is, until all patterns were classified correctly. The resulting neural network architecture is depicted in Figure 8. The architecture consists of two hidden layers with 2 and 14 nodes, respectively.

Fig. 7. Ishibuchi data set [Ishibuchi et al., 1992].


Fig. 8. A neural network architecture, generated by our algorithm, that solves the Ishibuchi problem [Ishibuchi et al., 1992].

Approximation Problem

The success of neural networks is explained by the universal approximation property via the Stone-Weierstrass theorem [28, 29]. However, until now there has been no standard method for determining the number of neural units in hidden layers. If we want to use the hybrid algorithm for function approximation, then all the points used for learning the function are considered as class one, and the background, noise, etc., are considered as other classes. The algorithm in this case produces outputs which are fuzzy sets themselves. The obtained fuzzy subsets have grades of membership 1.0 at the predicted values and thin out as we move to the left and to the right of their heights. The grades of membership for those fuzzy sets fall to zero as other classes are encountered. To show the approximation properties of our algorithm, it was tested on the function shown in Figure 9.


Fig. 9. A function approximation problem.

The algorithm was trained on 91 out of 95 randomly chosen points. The test consisted of predicting the missing four points, thus verifying the approximation property of the algorithm. Several different tests were run. The architecture generated by the algorithm for one of the tests is depicted in Figure 10. The output fuzzy subsets corresponding to the expected crisp values of 0.37, 1.61, 2.17, and 3.39 are shown in Figure 11. The obtained fuzzy subsets have grades of membership 1.0 at those values. They may be defuzzified using one of the defuzzification methods, or simply by taking the value corresponding to their heights. It is understood that the simple approximation problem presented here does not fully address the generalization property of neural networks. However, this was our first attempt to tackle the generalization issue. A real-life medical problem, applying the algorithm to the prediction of obstructions in coronary artery stenosis using 75% of the data for training, is currently under investigation. The preliminary results are promising.


Fig. 10. A neural network architecture that solves the function approximation problem.

Fig. 11. Output fuzzy sets generated in the function approximation problem.

6. CONCLUSIONS

In this paper, a general method for generating fuzzy neural trees from numerical data and converting them into feedforward neural network architectures was outlined. The algorithm shows a way in which neural network technology can be used as a "tool" within the framework of fuzzy set theory. Generating membership functions with the aid of a neural network has been shown to be an extremely powerful and promising technique. Our hybrid fuzzy neural algorithm is a building block towards combining the two soft computing paradigms. It allows for the self-generation of a feedforward network architecture suited to a particular problem. The approach is different from others proposed in the literature [30-37]. The main features and advantages of the method presented in this paper are that (i) it provides a general way to combine measured numerical information and fuzzy set theory in a common framework, and (ii) it is a simple and straightforward quick-pass build-up procedure in which no time-consuming iterative training is required, resulting in a much shorter design time than for most neural networks.

REFERENCES

1. L. M. Sztandera, Dynamically generated neural network architectures, J. Artif. Neural Syst. 1:41-66 (1994).
2. R. R. Yager, On choosing between fuzzy subsets, Kybernetes 9:151-154 (1980).
3. S. Murakami, H. Maeda, and S. Immamura, Fuzzy decision analysis on the development of centralized regional energy control system, in Preprints of IFAC Conference on Fuzzy Information, Knowledge Representation and Decision Analysis, 1983, pp. 353-358.
4. L. M. Sztandera and J. M. Keller, Spatial relations among fuzzy subsets of an image, in B. M. Ayyub, Ed., Uncertainty Modeling and Analysis, IEEE Computer Society Press, College Park, MD, 1990, pp. 207-211.
5. L. T. Koczy, Fuzzy graphs in the evaluation and optimization of networks, Fuzzy Sets Syst. 46:307-319 (1992).
6. A. Rosenfeld, Fuzzy graphs, in L. A. Zadeh, K. S. Fu, K. Tanaka, and M. Shimura, Eds., Fuzzy Sets and Their Applications to Cognitive and Decision Processes, Academic, New York, 1975, pp. 77-97.
7. P. Lippmann, An introduction to computing with neural nets, IEEE Trans. Acoust. Speech Signal Proc. 44:4-22 (1987).
8. B. Irie and S. Miyake, Capabilities of three-layered perceptrons, in Proc. IEEE Int. Conf. on Neural Networks, 1988, pp. 641-648.
9. S. Y. Kung and J. N. Hwang, An algebraic projection analysis for optimal hidden units size and learning rates in back-propagation learning, in Proc. IEEE Int. Conf. on Neural Networks, 1988, pp. 363-370.
10. J. A. Sirat and J. P. Nadal, Neural trees: A new tool for classification, Network 1:423-438 (1990).
11. M. Bichsel and P. Seitz, Minimum class entropy: A maximum information approach to layered networks, Neural Networks 2:133-141 (1989).
12. J. R. Quinlan, Induction of decision trees, Machine Learning 1:81-106 (1986).
13. T. G. Dietterich, H. Hild, and G. Bakiri, A comparative study of ID3 and backpropagation for English text-to-speech mapping, in Proc. 7th Int. Conf. on Machine Learning, Texas, 1990.
14. D. H. Fisher and K. B. McKusick, An empirical comparison of ID3 and backpropagation, in Proc. 11th Int. Joint Conf. on Artificial Intelligence, 1989, pp. 788-793.
15. K. J. Cios and N. Liu, A machine learning method for generation of a neural network architecture: A continuous ID3 algorithm, IEEE Trans. Neural Networks 2:280-291 (1992).
16. S. E. Fahlman and C. Lebiere, The cascade-correlation learning architecture, in D. S. Touretzky, Ed., Advances in Neural Information Processing Systems 2, Morgan Kaufmann, Los Altos, 1990, pp. 524-532.
17. L. M. Sztandera and K. J. Cios, Continuous ID3 algorithm with fuzzy entropy measures, in Proc. 1st Int. Conf. on Neural Networks and Fuzzy Systems, San Diego, 1992, pp. 469-476.


18. L. M. Sztandera and K. J. Cios, Decision making in a fuzzy environment generated by a neural network architecture, in Proc. 5th IFSA World Congress, Seoul, 1993, pp. 73-76.
19. L. M. Sztandera and K. J. Cios, Ranking fuzzy subsets at nodes of a neural network algorithm, in Proc. 9th Int. Conf. on Systems Engineering, Las Vegas, 1993, pp. 587-591.
20. H. Szu and R. Hartley, Fast simulated annealing, Phys. Lett. A 122(8):157-162 (1987).
21. S. M. Omohundro, Efficient algorithm with neural network behavior, Complex Syst. 5:348-350 (1987).
22. H. J. Schmitz, G. Poppel, F. Wunsch, and U. Krey, Fast recognition of real objects by an optimized hetero-associative neural network, J. Phys. 51:167-183 (1990).
23. J. C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum, New York, 1981.
24. J. M. Keller, M. R. Gray, and J. A. Givens, Jr., A fuzzy K-nearest neighbor algorithm, IEEE Trans. Syst. Man Cybern. SMC-15(4):580-585 (1985).
25. J. M. Keller and D. J. Hunt, Incorporating fuzzy membership functions into the perceptron algorithm, IEEE Trans. Pattern Anal. Machine Intell. PAMI-7(6):693-699 (1985).
26. J. M. Keller and B. Yan, Possibility expectation and its decision making algorithm, in Proc. 1st Int. Conf. on Fuzzy Systems and Neural Networks, San Diego, 1992, pp. 661-668.
27. H. Ishibuchi, R. Fujioka, and H. Tanaka, Possibility and necessity pattern classification using neural networks, Fuzzy Sets Syst. 48:331-340 (1992).
28. K. Hornik, M. Stinchcombe, and H. White, Multilayer feedforward networks are universal approximators, Neural Networks 2:359-366 (1989).
29. K. Hornik, Approximation capabilities of multilayer feedforward networks, Neural Networks 4:251-257 (1991).
30. M. M. Gupta and J. Qi, On fuzzy neuron models, in L. A. Zadeh and J. Kacprzyk, Eds., Fuzzy Logic for the Management of Uncertainty, John Wiley & Sons, New York, 1992, pp. 479-491.
31. K. Hirota and W. Pedrycz, Fuzzy logic neural networks: Design and computations, in Proc. IJCNN'91 (2), Singapore, 1991, pp. 1588-1593.
32. H. Ishibuchi, R. Fujioka, and H. Tanaka, Neural networks that learn from fuzzy if-then rules, IEEE Trans. Fuzzy Syst. 1(2):85-97 (1993).
33. H. Ishibuchi, K. Kwon, and H. Tanaka, Implementation of fuzzy if-then rules by fuzzy neural networks with fuzzy weights, in Proc. 1st Eur. Congr. on Fuzzy and Intelligent Technologies, Aachen, 1993.
34. J. M. Keller, R. R. Yager, and H. Tahani, Neural network implementation of fuzzy logic, Fuzzy Sets Syst. 45:1-12 (1992).
35. J. M. Keller and H. Tahani, Backpropagation neural networks for fuzzy logic, Inform. Sci. 62:205-221 (1992).
36. M. Figueiredo, F. Gomide, and W. Pedrycz, A fuzzy neural network: Structure and learning, in Proc. 5th IFSA World Congress, Seoul, 1993, pp. 1171-1174.
37. F. Yuan, L. A. Feldkamp, L. I. Davis, Jr., and G. V. Puskorius, Training a hybrid neural-fuzzy system, in Proc. IJCNN Int. Joint Conf. on Neural Networks, Baltimore, 1992, pp. 739-744.

Received 11 May 1994; revised 27 July 1995
