Neural network ensemble with temperature control

August 24, 2017 | Author: Prayogi Hartono | Category: Physics, Neural Networks, Multilayer Perceptron

Pitoyo Hartono, Shuji Hashimoto
Department of Pure and Applied Physics, Graduate School of Science and Engineering, Waseda University
{hartono,shuji}@shalab.phys.waseda.ac.jp

Abstract

In this paper we propose a model of a neural network ensemble composed of a number of multilayer perceptrons (MLPs), each with a unique expertise. Using temperature control, the most appropriate ensemble member is automatically activated for a given environment, while the irrelevant members are inhibited. The proposed temperature control enables the neural network ensemble to work efficiently in multiple environments.

I. Introduction

Recently, much work has been dedicated to the realization of neural network ensembles [1][2][3]. A neural network ensemble is usually composed of a number of independent neural networks that solve a given problem simultaneously. This structure can be regarded as equivalent to a single network that is used for multiple trials [4]. The common goal of this research is to achieve a performance reliability that cannot be achieved by a single neural network. This goal can be achieved by statistically evaluating all of the members' outputs [5], or by taking a majority decision over all of the outputs. Although a neural network ensemble usually shows better performance than a conventional single neural network, it can still perform in only a single environment (we define an environment here as a set of input patterns and their corresponding desired outputs).

In this study we propose a model of a neural network ensemble that has a mechanism to automatically activate its most appropriate member for a given environment, so that the ensemble can run efficiently in multiple environments. The most relevant member is selected by applying competition among the members, which eventually produces a dominant member. Because the proposed mechanism is simple, an ensemble that runs effectively in multiple environments can be realized without adding much computational complexity compared with a conventional neural network.

0-7803-5529-6/99/$10.00 ©1999 IEEE

II. Ensemble's Configuration

As shown in Figure 1, the ensemble consists of a number of MLPs that run independently. Each MLP is required to have the same number of input neurons and output neurons, while the number of middle neurons may vary from one MLP to another. The ensemble also has an input layer, which receives input from the environment, and an output layer, which processes the outputs of the members. Inputs arriving at the ensemble's input layer are propagated to the corresponding input units of all MLPs. Each MLP processes the given inputs according to its expertise and produces an output. All of the outputs of the MLPs are connected to the corresponding units in the ensemble's output layer with weights fixed to 1 (the fixed weights are shown with broken lines in Figure 1). There are no restrictions on the MLPs' structure, but in this study we used 3-layered perceptrons.
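The structural constraint described above (identical input and output widths, free middle-layer width) can be sketched as follows. This is only an illustration: the function name `init_ensemble`, the dictionary layout, the uniform initialization range, and the absence of bias terms are assumptions, not details given in the paper. The triplets are the structures used in Experiment 1.

```python
import random


def init_ensemble(shapes, seed=0):
    """Create random weights for 3-layered MLPs given as (n_in, n_mid, n_out).

    Hypothetical helper: the initialization range [-1, 1] is an assumption.
    """
    n_in, n_out = shapes[0][0], shapes[0][2]
    # Every member must match the ensemble's input and output widths;
    # only the number of middle neurons may differ.
    if any(s[0] != n_in or s[2] != n_out for s in shapes):
        raise ValueError("all members need identical input and output sizes")

    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

    return [
        {"w_mid": matrix(s[0], s[1]), "w_out": matrix(s[1], s[2])}
        for s in shapes
    ]


# Structures (input, middle, output) of the three members in Experiment 1.
members = init_ensemble([(2, 6, 1), (2, 3, 1), (2, 7, 1)])
print([len(m["w_out"]) for m in members])  # middle-layer widths of each member
```

Because the members share input and output widths, the ensemble's output layer can combine their outputs unit by unit with fixed weights.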


III. Ensemble's Dynamics

In the proposed model the neuron's activation function is defined as follows. For neurons in the middle layer:

where $o^{mid}_{i,m}$ and $u^{mid}_{i,m}$ are the output and potential of the m-th neuron in the middle layer of the i-th MLP, respectively; $o^{in}_{n}$ is the output of the n-th neuron in the input layer; $w^{mid}_{i,nm}$ is the weight between the n-th neuron in the input layer and the m-th neuron in the middle layer of the i-th MLP; and $N^{in}_{i}$ indicates the number of input neurons in the i-th MLP. For neurons in the output layer:

where $o^{out}_{i,m}$ and $u^{out}_{i,m}$ are the output and potential of the m-th neuron in the output layer of the i-th MLP, respectively; $o^{mid}_{i,n}$ is the output of the n-th neuron in the middle layer; $w^{out}_{i,nm}$ is the weight between the n-th neuron in the middle layer and the m-th neuron in the output layer of the i-th MLP; and $N^{mid}_{i}$ indicates the number of middle neurons in the i-th MLP.

From equation (2) it is clear that, if the temperature $T$ is large enough, the neuron will always produce a value in the vicinity of 0.5, regardless of its potential value. For a problem requiring (0,1) binary output, 0.5 can be regarded as an insignificant value. A neuron that always gives a 0.5 output is a hesitant neuron that has no importance; we call such a neuron an inactive neuron. An ensemble member that has inactive neurons is called an inactive member. In our proposed ensemble model, MLPs that are irrelevant to the given environment can be deactivated by gradually increasing their temperatures, while an MLP that is relevant is permitted to run actively. Considering that eventually only one MLP is permitted to run actively, the ensemble's k-th output $o^{ens}_{k}$ is defined as:

$$o^{ens}_{k} = \sum_{i=1}^{N} o^{out}_{i,k} - 0.5\,(N - 1) \qquad (3)$$

$N$ indicates the number of MLPs in the ensemble. From equation (3) it can be seen that the output of the ensemble will be equal to the output of the dominant MLP inside the ensemble, because each of the $N - 1$ inactive members contributes approximately 0.5, which the second term cancels. Furthermore, each MLP is permitted to continuously correct its connection weights while running in the given environment, according to the back propagation learning rule [7] shown below:

where $w_i$ is the weight vector of the i-th MLP; $E$, $t$, and $o^{out}_{i}$ are the error, the teacher signal, and the output of the i-th MLP, respectively; and $\eta$ indicates the learning rate. For the weights to the output units, the correction value can be calculated as:

where $w^{out}_{i,jk}$ indicates the weight from the j-th neuron in the middle layer to the k-th neuron in the output layer of the i-th MLP. Equation (5) shows that for a large $T$ the correction of the connection weights between middle neurons and output neurons can be ignored. In the back propagation method, the correction of the connection weights between input neurons and middle neurons can be written as:

This correction will also be near zero provided that the temperature $T$ is large enough. Equations (5) and (6) imply that an MLP with a large temperature will not only become inactive by always producing insignificant outputs, but will also become insensitive in the learning process, because its connection weights are always updated with near-zero values. This means that although each MLP in the ensemble is permitted to continuously correct its connection weights, the learning process will actually have little effect when the MLP is running in an environment outside its expertise, so the expertise of an MLP running in an irrelevant environment will be preserved. In contrast, MLPs with low temperature will adapt to the environment by correcting their connection weights.
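The two effects described in this section, inactive outputs near 0.5 and vanishing weight corrections, can be checked numerically. The sketch below assumes the common temperature-scaled logistic activation $o = 1/(1 + e^{-u/T})$ and a squared-error loss; both are assumptions chosen to be consistent with the behaviour described for equations (2), (5), and (6), not forms confirmed by this text. The function names are hypothetical.

```python
import math


def sigmoid_t(u, temperature):
    """Temperature-scaled logistic (assumed form of the activation function)."""
    return 1.0 / (1.0 + math.exp(-u / temperature))


def ensemble_output(member_outputs):
    """Equation (3): sum of the members' k-th outputs minus 0.5 * (N - 1)."""
    n = len(member_outputs)
    return sum(member_outputs) - 0.5 * (n - 1)


def output_delta(target, u, temperature):
    """Output-layer error signal for squared error (assumed loss).

    The derivative of sigmoid_t is o * (1 - o) / T, so the signal is
    bounded in magnitude by 0.25 / T and shrinks as T grows.
    """
    o = sigmoid_t(u, temperature)
    return (target - o) * o * (1.0 - o) / temperature


# One active member (T = 1) and two inhibited members (T = 75);
# the potentials 4.0, -3.0 and 2.0 are arbitrary illustration values.
outs = [sigmoid_t(4.0, 1.0), sigmoid_t(-3.0, 75.0), sigmoid_t(2.0, 75.0)]
print(outs)                   # the two hot members sit close to 0.5
print(ensemble_output(outs))  # close to the active member's output
print(output_delta(1.0, 2.0, 1.0), output_delta(1.0, 2.0, 75.0))
```

With the temperature limits used later in the paper (1 and 75), the error signal of a fully inhibited member is at most 1/300 in magnitude under these assumptions, which is why its learned weights change very little.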

IV. Temperature Control

The idea of the temperature control proposed in this study is to penalize MLPs that perform badly by increasing their temperatures, and to reward MLPs that perform well by decreasing their temperatures. To ensure that eventually only one MLP dominates the ensemble, we apply competition among the ensemble's members. A member that performs relatively well will not only decrease its own temperature but also increase the other members' temperatures, so that it can dominate the others. Conversely, a member that performs badly punishes itself by increasing its own temperature and decreasing the temperatures of the other members, passing the power of dominance to them. The MLP most relevant to the given environment will emerge as the winner of this competition and will dominate the ensemble until the environment changes. The temperature control can be written as follows:

$$T_i(t+1) = T_i(t) + \Delta T_i(t)$$

$$\Delta T_i(t) = -\rho^{self}\,\bigl(1 - N E^{i}(t)\bigr) + \rho^{cross} \sum_{j \neq i}^{N} \bigl(1 - N E^{j}(t)\bigr) + \text{(cool-down term)} \qquad (7)$$

where $N$ is the number of MLPs in the ensemble and $\rho^{self}$, $\rho^{cross}$, $\alpha$, $\beta$ are positive constants. $E^{i}$ is the error rate of the i-th MLP, that is, the ratio of the i-th MLP's error to the members' total error for a given input requiring output $t$. The first term of the temperature renewal function $\Delta T$ is the self-penalty term, which decreases the temperature if the MLP performed relatively well and increases it otherwise. The second term is the cross-penalty term from the other MLPs. The third term is a "cool down" term, which is relevant for inactive MLPs because it always tries to activate an MLP with 0.5 output. This term is most effective in the transient period of an environment change, because it helps to speed up the activation of inactive MLPs so that they can take part in the temperature competition with relatively active MLPs. To speed up the temperature competition further, we may add a momentum term, so that the temperature renewal can be written as follows:

$$T_i(t+1) = T_i(t) + \Delta T_i(t) + \gamma\,\Delta T_i(t-1) \qquad (8)$$

The temperature value is limited to between 1 and 75. From equations (7) and (8), when a particular MLP performs extremely well, it will stay the dominant member, because it constantly rewards itself by decreasing its own temperature while penalizing the others by increasing their temperatures, so that they are kept inactive until the environment changes. When this occurs, the performance of the dominant MLP will become poor, and eventually it will lose the temperature competition and surrender its dominance to another MLP that is more relevant to the given environment. If the time needed to activate a relevant MLP is sufficiently short compared with the time needed to train the MLP from its initial state, then the proposed activation mechanism can be considered effective.

V. Experiments

1. Experiment 1

In the computer experiments, an ensemble consisting of three 3-layered MLPs is used. Their structures, expressed as triplets indicating the numbers of input, middle, and output neurons respectively, are (2,6,1), (2,3,1), and (2,7,1). Two Boolean functions with 2 inputs and 1 output (OR, NOR) are presented to the ensemble alternately 3 times as periodically changing environments. Each environment is given to the ensemble for 10,000 trials per cycle. The parameters are set according to Table 1.

Table 1. Parameter Settings

The initial connection weights are set randomly for each MLP. The performance of the ensemble is shown in Figure 2. A low MSE value indicates good adaptability of the ensemble to the given environments.

Figure 2. Ensemble's Performance (horizontal axis: time)

As can be seen from Figure 2, in the first cycle the MLP that wins the temperature competition learns to adapt to the given environment from its initial conditions, because it has no prior knowledge about the environment. From the second cycle on, the ensemble only has to activate the particular MLP that learned to run in that environment in the previous cycle, without having to learn from the initial state. It is clear that from the second cycle the ensemble is able to adapt to the environment rapidly.
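The winner-take-all behaviour produced by the temperature competition can be illustrated with a toy update loop. This is a rough sketch under several assumptions: only the self-penalty and cross-penalty terms of equation (7) and the momentum term of equation (8) are modelled, the cool-down term is omitted, the cross-penalty is taken as a sum over the other members, and the members' errors are held fixed instead of coming from trained MLPs. The constants `rho_self`, `rho_cross`, and `gamma` are hypothetical values, not the settings of Table 1. The limits 1 and 75 follow the text.

```python
def temperature_step(temps, prev_deltas, errors,
                     rho_self=1.0, rho_cross=0.5, gamma=0.5,
                     t_min=1.0, t_max=75.0):
    """One temperature update for all members (simplified equations (7)-(8))."""
    n = len(temps)
    total = sum(errors)
    # Error rate: each member's share of the total error (uniform if no error).
    rates = [e / total if total > 0 else 1.0 / n for e in errors]
    # Positive score means the member did better than the average share 1/N.
    scores = [1.0 - n * r for r in rates]

    new_temps, deltas = [], []
    for i in range(n):
        self_term = -rho_self * scores[i]
        cross_term = rho_cross * sum(scores[j] for j in range(n) if j != i)
        delta = self_term + cross_term
        t_new = temps[i] + delta + gamma * prev_deltas[i]  # momentum term
        new_temps.append(min(t_max, max(t_min, t_new)))    # keep T in [1, 75]
        deltas.append(delta)
    return new_temps, deltas


# Three members starting at the same temperature; member 0 has the lowest error.
temps, deltas = [10.0, 10.0, 10.0], [0.0, 0.0, 0.0]
for _ in range(200):
    temps, deltas = temperature_step(temps, deltas, errors=[0.1, 0.4, 0.5])
print(temps)
```

In this toy setting the best-performing member is driven to the lower temperature limit and the others to the upper limit. In the actual ensemble the errors themselves change with temperature and learning, so the sketch shows only the direction of the competition, not its timing.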

Figure 3. Temperature Fluctuation

Figure 3 and Figure 4 show the temperature fluctuation and the MLPs' activities, respectively. The activity $A^{i}$ of the i-th MLP is defined as follows:

$$A^{i} = \bigl(0.5 - o^{out}_{i}\bigr)^{2} \qquad (9)$$

$A^{i}$ indicates the contribution of the i-th MLP to the ensemble's output. A greater activity value indicates greater dominance.

From Figure 3 and Figure 4 it is clear that MLP1 becomes dominant when the environment is OR and MLP2 dominates when the environment is NOR, while MLP3 stays still in both environments. In the first cycle, hard competition among the MLPs occurs, but from the second cycle the competition becomes relatively soft, enabling the ensemble to adapt to the environment change rapidly.

2. Experiment 2

In this experiment, unlike Experiment 1, we use 2 environments that are correlated with each other, meaning that they share some identical problem-answer pairs. This shared behavior makes the competition for dominance more difficult than in Experiment 1, where the environments do not share common behavior. We used three 3-layered perceptrons, whose structures are (2,2,1), (2,3,1), and (2,6,1) respectively, to run in 3 cycles of (XOR, AND) environments. The parameter settings are identical to those of Experiment 1.
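The difference between the two experiments can be made concrete by listing the input patterns for which both environments require the same output. The short check below uses the standard truth tables of the four Boolean functions; the helper name `shared_pairs` is hypothetical.

```python
from itertools import product

# Standard truth tables for the Boolean functions used as environments.
GATES = {
    "OR": lambda a, b: a | b,
    "NOR": lambda a, b: 1 - (a | b),
    "XOR": lambda a, b: a ^ b,
    "AND": lambda a, b: a & b,
}


def shared_pairs(name_a, name_b):
    """Input patterns for which both functions require the same output."""
    return [
        (a, b)
        for a, b in product((0, 1), repeat=2)
        if GATES[name_a](a, b) == GATES[name_b](a, b)
    ]


print(shared_pairs("OR", "NOR"))   # [] : Experiment 1 environments never agree
print(shared_pairs("XOR", "AND"))  # [(0, 0)] : Experiment 2 environments overlap
```

OR and NOR are exact complements, so a member trained on one is wrong on every pattern of the other. XOR and AND agree on the input (0, 0), so a member trained on one environment is partly correct in the other, which blurs the error differences that drive the competition.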

Figure 5. Ensemble's Performance (horizontal axis: time, ×10^4)

Figure 6. Members' Activities (panels $A^{1}$, $A^{2}$, $A^{3}$; horizontal axis: time, ×10^4)

From Figure 6, it can be seen that although there is severe competition between the members, especially in the first cycle, eventually the ensemble activates MLP3 when the environment is XOR and switches to MLP2 when the environment is AND. From the second cycle, the ensemble can rapidly adapt to the environment change. This experiment shows that although the environments share common behavior, the ensemble is able to choose the most appropriate member to deal with each environment.


VI. Conclusion

We have proposed a novel neural network ensemble model that is able to run efficiently in multiple environments by using temperature control to activate the most relevant ensemble member. This ability has not been achieved previously, either by the conventional single MLP or by formerly proposed ensemble models. We plan to analyze the efficiency of the proposed method theoretically and to apply it to more complex problems in practical fields.

References

[1] Lars Kai Hansen: Neural Network Ensemble, IEEE Trans. on Pattern Analysis and Machine Intelligence (1990), vol. 12, no. 10, pp. 993-1000

[2] Michael Perrone: When Networks Disagree: Ensemble Methods for Hybrid Neural Network, Neural Network for Speech and Image Processing (1993), Chapman-Hall

[3] Baxt, W.G.: Improving the accuracy of an artificial neural network using multiple differently trained networks, Neural Computation (1992), vol. 4, no. 5, pp. 772-780

[4] Steve Lawrence et al.: On the Distribution of Performance from Multiple Neural Network Trials, IEEE Trans. on Neural Networks (1997), vol. 8, no. 6, pp. 1507-1517

[5] Arthur Flexer: Statistical Evaluation of Neural Network Experiments, Proc. of 13th European Meeting on Cybernetics and Systems Research (1996)

[6] Robert Jacobs et al.: Adaptive Mixture of Local Expert, Neural Computation (1991), no. 3, pp. 79-87

[7] D.E. Rumelhart et al.: Learning Internal Representation by Error Propagation, Parallel Distributed Processing (1986), vol. 1, The MIT Press
