Kak Neural Network
Mehdi Soufifar: Soufifar@ce.sharif.edu
Mehdi Hoseini: Me_hosseini@ce.sharif.edu
Amir Hosein Ahmadi: A_H_Ahmadi@yahoo.com

Corner Classification approach
Corners for the XOR function (outputs at the four corners): 0, 1, 1, 0

Corner Classification approach…
• Map n-dimensional binary vectors X (input) into m-dimensional binary vectors Y (output) through a mapping function f.
• Possible training methods: backpropagation (does not guarantee convergence), …

Introduction
• Feedback (Hopfield with delta learning) and feedforward (backpropagation) networks learn patterns slowly: the network must adjust the weights of the links between input and output until it obtains the correct response to the training patterns.
• But biological learning is not a single process: some forms are very quick and others relatively slow. Short-term biological memory, in particular, works very quickly, so slow neural network models are not plausible candidates in this case.

Training feedforward NN [1]
• Kak proposed CC1 and CC2 in January 1993.
• Example: exclusive-OR mapping.

CC1 as an example
• Initialize all weights to zero.
• If the result is correct, do nothing.
• If the result is 1 and the supervisor says 0, subtract the input vector x from the weight vector.
• If the result is 0 and the supervisor says 1, add the input vector x to the weight vector.
[Figure: XOR network. Input layer: X1, X2. Hidden layer as corners: 01, 10. Output layer (OR gate): y.]

CC1…
• Result on the first output corner:
Samples       W1   W2   W3
Init, 1        0    0    0
2              0    1    1
3             -1    1    0

CC1…
• Result on the second output corner:
Samples       W1   W2   W3
Init, 1, 2     0    0    0
3              1    0    1
4, 1, 2        0   -1    0
3              1   -1    1
4, 1, 2        0   -2    0
3, 4           1   -2    1
1              1   -2    0

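The update rule and the two weight tables above can be reproduced with a short sketch. It assumes details the slides do not state outright: a neuron outputs 1 when its weighted sum is strictly positive, the third input is a constant bias of 1, and the samples are presented cyclically in the order (0,0), (0,1), (1,0), (1,1). The function names are illustrative only.

```python
# Minimal sketch of CC1-style training for the two XOR corner neurons.
# Assumptions (not stated in the slides): output is 1 when w.x > 0, the third
# input is a constant bias of 1, and samples are presented cyclically in the
# order (0,0), (0,1), (1,0), (1,1).

SAMPLES = [(0, 0, 1), (0, 1, 1), (1, 0, 1), (1, 1, 1)]  # (x1, x2, bias)


def fires(w, x):
    """Binary step neuron: 1 if the weighted sum is positive, else 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0


def train_cc1(targets, max_passes=100):
    """Train one hidden neuron; return (weights, number of weight updates)."""
    w = [0, 0, 0]
    updates = 0
    for _ in range(max_passes):
        changed = False
        for x, t in zip(SAMPLES, targets):
            y = fires(w, x)
            if y == 0 and t == 1:      # says no, should say yes: add x
                w = [wi + xi for wi, xi in zip(w, x)]
            elif y == 1 and t == 0:    # says yes, should say no: subtract x
                w = [wi - xi for wi, xi in zip(w, x)]
            else:                      # correct answer: do nothing
                continue
            updates += 1
            changed = True
        if not changed:                # one full pass without any error
            return w, updates
    raise RuntimeError("did not converge within max_passes")


w_corner_01, n1 = train_cc1([0, 1, 0, 0])  # hidden neuron for corner (0, 1)
w_corner_10, n2 = train_cc1([0, 0, 1, 0])  # hidden neuron for corner (1, 0)


def xor_net(x1, x2):
    """Two corner neurons feeding an OR output neuron."""
    x = (x1, x2, 1)
    return 1 if fires(w_corner_01, x) + fires(w_corner_10, x) > 0 else 0
```

Under these assumptions the final weights agree with the last rows of the two tables, and the two neurons need 2 and 6 weight updates respectively, which may be what the figure of 8 iterations reported for CC1 refers to.
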
CC1 Algorithm
Notations:
• The mapping is Y = f(X), where X and Y are n- and m-dimensional binary vectors. Therefore we have Y_i = f(X_i), i = 1, …, k (k = number of vectors).
• Weight of a vector: the number of elements equal to 1 in it.
• If the k output sequences are written out in an array, then the columns may be viewed as a sequence of m k-dimensional vectors. A weight is defined for each of these column vectors in the same way.

CC1 Algorithm…
• Start with a random initial weight vector.
• If the neuron says no when it should say yes, add the input vector to the weight vector.
• If the neuron says yes when it should say no, subtract the input vector from the weight vector.
• Do nothing otherwise.
• Note that a main problem is: what is the number of neurons in the hidden layer?

Number of hidden neurons
• Consider that: [equation missing]
• The number of hidden neurons can be reduced by the number of duplicated neurons, which equals: [equation missing]

Number of hidden neurons…
• Theorem: the number of hidden neurons required to realize the mapping Y_i = f(X_i), i = 1, 2, …, k, is equal to: [equation missing]
• And since [condition missing], we can say: the number of hidden neurons required to realize the mapping is at most k.

Real Applications problem [1]
• Comparison of training results:
Algorithm on XOR problem    Number of iterations
BP                          6,587 [1]
CC (CC1)                    8 [1]

Proof of convergence [1]
• We would establish that the classification algorithm converges if there is a solution weight vector such that [condition missing] for the corner that needs to be classified, and [condition missing] otherwise.
• W_t is the weight vector at the t-th iteration.
• θ is the angle between the solution weight vector and W_t.
• If the neuron says no when it must say yes: [equation missing]

Proof of convergence…
• The numerator of the cosine becomes: [equation missing]
• Since the solution weight vector produces the correct result, we know that: [equation missing]
• We get the same inequality for the other type of misclassification.

Proof of convergence…
• Repeating this process for t iterations produces: [equation missing]
• For the cosine's denominator: if the neuron says no we have [condition missing], then: [equation missing]
• The same result is obtained for the other type of misclassification.

Proof of convergence…
• Repeating the substitution produces: [equation missing]
• Since [condition missing], we have: [equation missing]

Proof of convergence…
• From (1) and (2) we can say: [equation missing]

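The equations on these slides did not survive extraction. The surviving text appears to follow the standard perceptron-convergence argument, which can be sketched as below. All symbols other than W_t and θ are assumptions introduced here: W^* for the solution vector, δ for its margin, M for the largest squared input norm, and the labels (1) and (2) are placed where the text seems to use them.

```latex
% Assumed notation (not in the source): W^* is a solution vector with
% W^* \cdot X \ge \delta > 0 for the corner to be classified and
% W^* \cdot X \le -\delta otherwise; M = \max_X \|X\|^2; W_0 = 0.
\begin{align*}
\cos\theta &= \frac{W^* \cdot W_t}{\|W^*\|\,\|W_t\|} \\
% Neuron says no when it must say yes: W_{t+1} = W_t + X.
W^* \cdot W_{t+1} &= W^* \cdot W_t + W^* \cdot X \;\ge\; W^* \cdot W_t + \delta \\
% Repeating for t corrections:
W^* \cdot W_t &\ge t\,\delta \tag{1} \\
% Denominator: the neuron said no, so W_t \cdot X \le 0.
\|W_{t+1}\|^2 &= \|W_t\|^2 + 2\,W_t \cdot X + \|X\|^2 \;\le\; \|W_t\|^2 + M \\
\|W_t\|^2 &\le t\,M \tag{2} \\
% Combining (1) and (2) with \cos\theta \le 1:
1 \;\ge\; \cos\theta &\ge \frac{t\,\delta}{\|W^*\|\sqrt{t\,M}}
\quad\Longrightarrow\quad t \le \frac{M\,\|W^*\|^2}{\delta^2}
\end{align*}
```

The other type of misclassification (W_{t+1} = W_t - X with W_t · X > 0) gives the same two inequalities, so under these assumptions the number of corrections is bounded.
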
Types of memory
• Long-term: in AI, e.g. BP and RBF, …
• Short-term: learns instantaneously with good generalization.

Current network characteristics
What is the problem with BP and RBF?
• They require iterative training.
• They take a long time to learn.
• They sometimes do not converge.
Result
• They are not applicable in real-time applications.
• They cannot capture short-term, instantaneously learned memory (the most significant aspect of biological working memory).

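CC2 algorithm
• In this algorithm the weights are given as follows: w_j = +1 if x_j = 1 and w_j = -1 if x_j = 0 (j = 1, …, n), and w_(n+1) = -(s_i - 1), where s_i is the weight of the i-th input vector.
• The value of w_(n+1) implies that the threshold of the hidden neuron to separate this sequence is s_i - 1.
• Ex: for the corners 0 1 and 1 0, W3 = -(s_i - 1) = -(1 - 1) = 0.
• Result of CC2 on the last example: weights -1 1 and 1 -1.
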
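The CC2 prescription can be illustrated with a few lines of code. The rule used here is read off the slide's example rather than from a surviving formula, so it is an assumption: weight +1 for an input bit of 1, weight -1 for a bit of 0, and a bias weight of -(s - 1), where s is the number of 1s in the stored corner. The function names are hypothetical.

```python
# Sketch of CC2-style weight prescription for one hidden neuron.
# Assumed rule (inferred from the slide's XOR example): +1 for a 1 bit,
# -1 for a 0 bit, and bias weight -(s - 1) where s counts the 1 bits.


def cc2_weights(corner):
    """Return [w_1, ..., w_n, w_bias] for the given binary corner."""
    s = sum(corner)
    return [1 if bit == 1 else -1 for bit in corner] + [-(s - 1)]


def fires(weights, x):
    """Step neuron with a constant bias input of 1; fires when the sum > 0."""
    total = sum(w * v for w, v in zip(weights, list(x) + [1]))
    return 1 if total > 0 else 0


w01 = cc2_weights((0, 1))  # hidden neuron for corner (0, 1)
w10 = cc2_weights((1, 0))  # hidden neuron for corner (1, 0)
```

No iteration is involved: each weight vector is written down by inspecting the corner once, which is consistent with the single iteration reported for CC2 in the comparison table.
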
Real Applications problem
• Comparison of training results:
Algorithm on XOR problem    Number of iterations
BP                          6,587 [1]
CC (CC1)                    8 [1]
CC (CC2)                    1 [1]

CC2's Generalization… [3]
• Hidden neurons' weights are: [equation missing]
• r is the radius of the generalized region.
• If no generalization is needed, then r = 0.
• For function mapping, where the input vectors are equally distributed into the 0 and the 1 classes: [equation missing]

About choice of h [3]
• Consider a 2-dimensional problem.
• The function of the hidden node can be expressed by the separating line: [equation missing]

About choice of h [3]
• Assume that the input pattern being classified is (0 1); then x2 = 1.
• Also, w1 = h, w2 = 1, and s = 1.
• The equation of the dividing line represented by the hidden node now becomes: [equation missing]

About choice of h… (h = -1 and r = 0) [figure]

About choice of h… (h = -0.8 and r = 0) [figure]

About choice of h… (h = -1 and r = 1) [figure]

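The dividing lines for the three cases above can be sketched as follows. This is a reconstruction under one assumption that the slides do not state at this point: the bias weight of the hidden neuron is r - s + 1, the expression that appears in the later CC4 example.

```latex
% Assumption: bias weight r - s + 1; the hidden node fires when the sum is positive.
\begin{align*}
w_1 x_1 + w_2 x_2 + (r - s + 1) &= 0 \\
% Pattern (0 1): w_1 = h, w_2 = 1, s = 1.
h\,x_1 + x_2 + r &= 0 \quad\Longleftrightarrow\quad x_2 = -h\,x_1 - r \\
h = -1,\ r = 0 &: \quad x_2 = x_1 \\
h = -0.8,\ r = 0 &: \quad x_2 = 0.8\,x_1 \\
h = -1,\ r = 1 &: \quad x_2 = x_1 - 1
\end{align*}
```

With h = -1 and r = 0 only the corner (0, 1) lies strictly above the line, while with r = 1 the corners (0, 0) and (1, 1), which are at Hamming distance 1 from (0, 1), also fall on the firing side.
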
CC4 [6]
• The CC4 network maps an input binary vector X to an output vector Y.
• The input and output layers are fully connected.
• The neurons are all binary neurons with a binary step activation function, as follows: [equation missing]
• The number of hidden neurons is equal to the number of training samples, with each hidden neuron representing one training sample.

CC4 Training [6]
• Consider the weight of the connection from input neuron i to hidden neuron j, and the input for the i-th input neuron when the j-th training sample is presented to the network.
• The weights are assigned as follows: [equation missing]

CC4 Training [6]…
• Consider the weight of the connection from the j-th hidden neuron to the k-th output neuron, and the output of the k-th output neuron for the j-th training sample.
• These weights are determined by the following equation: [equation missing]

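A compact sketch of CC4-style training and recall may make the structure concrete. Because the weight equations are missing from the slides, the rules below are assumptions: hidden weights of +1 or -1 per input bit with a bias weight of r - s + 1 (the expression shown in the later spiral example), output weights of +1 for a target of 1 and -1 for a target of 0, and a step activation that fires on a strictly positive sum. `train_cc4` and `predict_cc4` are hypothetical names.

```python
# Sketch of CC4-style training and recall with radius of generalization r.
# Assumed rules (equations are missing from the slides): hidden weight +1/-1
# per input bit, bias weight r - s + 1, output weight +1/-1 per target bit,
# and a step activation that fires when the weighted sum is > 0.


def step(value):
    return 1 if value > 0 else 0


def train_cc4(samples, r):
    """samples: list of (input_bits, output_bits); one hidden neuron each."""
    hidden, output = [], []
    for x, y in samples:
        s = sum(x)  # number of 1 bits in the training input
        hidden.append([1 if b else -1 for b in x] + [r - s + 1])
        output.append([1 if b else -1 for b in y])
    return hidden, output


def predict_cc4(hidden, output, x):
    """Propagate a binary input (bias of 1 appended) through both layers."""
    xb = list(x) + [1]
    h = [step(sum(w * v for w, v in zip(row, xb))) for row in hidden]
    n_out = len(output[0])
    return [step(sum(h[j] * output[j][k] for j in range(len(h))))
            for k in range(n_out)]


xor_samples = [((0, 0), (0,)), ((0, 1), (1,)), ((1, 0), (1,)), ((1, 1), (0,))]
hidden, output = train_cc4(xor_samples, r=0)
```

With these weights a hidden neuron fires exactly when the input lies within Hamming distance r of its stored sample, so r = 0 memorizes the training set and larger r generalizes to neighboring corners.
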
Sample of CC4
• Consider a 16-by-16 area of a spiral pattern that contains 256 binary (black and white) pixels, as in Figure 1.
• We want to train a system with exemplar samples, as in Figure 2, in which a total of 75 points are used for training.
[Figure 1] [Figure 2]

Sample of CC4…
• We can code 16 integer numbers with 4 binary bits.
• Therefore, for location (x, y), we use 4 bits for x, 4 bits for y, and 1 extra bit (always equal to 1) for the bias.
• In total we have 9 inputs.

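The 9-input coding can be sketched as below. The slides do not say how coordinates map to bit patterns, so two choices here are guesses: coordinates run from 1 to 16 and are stored as value minus 1, most significant bit first. That choice is used only because it gives three 1 bits for the corner (5, 6), which is consistent with the s = 3 that appears in the slide's example for that corner.

```python
# Sketch of the 4 + 4 + 1 bit input coding for a location (x, y).
# Assumptions (not stated in the slides): coordinates are 1..16, stored as
# value - 1, most significant bit first; the last input is the bias bit 1.


def encode_location(x, y, bits=4):
    """Return the 9 binary inputs for location (x, y) with x, y in 1..16."""
    size = 2 ** bits
    if not (1 <= x <= size and 1 <= y <= size):
        raise ValueError("coordinates must lie in 1..%d" % size)

    def to_bits(value):
        return [(value >> i) & 1 for i in reversed(range(bits))]

    return to_bits(x - 1) + to_bits(y - 1) + [1]


inputs = encode_location(5, 6)
```
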
Sample of CC4…
[Figure: hidden-neuron weights for the training corner (5, 6); the bias weight is r - s + 1 = r - 3 + 1 = r - 2]

Sample of CC4 result…
[Figure panels: original spiral; training samples; output for r = 1; output for r = 2; output for r = 3; output for r = 4]
• Number of points classified/misclassified in the spiral pattern.

FC motivation
Disadvantages of the CC algorithms:
• Input and output must be discrete.
• Input is best presented in a unary code, which increases the number of input neurons considerably.
• The degree of generalization is the same for all nodes.

Problem
• In reality this degree varies from node to node.
• We need to work on real data.
• An iterative version of the CC algorithm that does provide a varying degree of generalization has been devised.
• Problem: it is not instantaneous.

Fast classification network
What is FC?
• A generalization of the CC network.
• This network can operate on real data directly.
• It learns instantaneously.
It reduces to CC when:
• the data is binary
• the amount of generalization is fixed

Input
X = (x1, x2, …, xk) → F(X) → Y
• All xi and Y are real data.
• k is determined by the nature of the problem.
What to do:
• Define the input and output weights.
• Define the radius of generalization.

Input
Index    Input             Output
1        x1, x2, x3, x4    Y1, Y2
2        x1, x2, x3, x4    Y1, Y2
…

FC network structure
[Figure: FC network structure]

The hidden neurons
[Figure: the hidden neurons]

The rule base
• Rule 1: IF m = 1, THEN assign μi using the single-nearest-neighbor (1NN) heuristic.
• Rule 2: IF m = 0, THEN assign μi using the k-nearest-neighbor (kNN) heuristic.
• m = the number of hi that are equal to 0.
• The value of k is typically a small fraction of the size of the training set.
• Membership grades are normalized.

1NN heuristic
• Used when exactly one element in the distance vector h is 0.

kNN heuristic
• Based on the k nearest neighbors.
• Triangular membership function.

Training of the FC network
Training involves two separate steps:
• Step 1: input and output weights are prescribed simply by inspection of the training input/output pairs.
• Step 2: the radius of generalization for each hidden neuron is determined: r_i = d_min,i / 2.

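Step 2 can be illustrated with a small sketch. It assumes that d_min,i is the Euclidean distance from training sample i to its nearest other training sample, which the slide does not spell out; the three points are hypothetical values chosen only to give radii of the same size as in the worked example further on.

```python
import math

# Sketch of Step 2: radius of generalization r_i = d_min,i / 2.
# Assumption: d_min,i is the Euclidean distance from sample i to its
# nearest other training sample.


def radii_of_generalization(points):
    """Return one radius per training sample."""
    radii = []
    for i, p in enumerate(points):
        d_min = min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        radii.append(d_min / 2)
    return radii


# Hypothetical training inputs (not taken from the slides).
radii = radii_of_generalization([(0.0, 0.0), (5.0, 0.0), (0.0, 10.0)])
```

Halving the nearest-neighbor distance keeps the generalization regions of neighboring hidden neurons from overlapping.
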
Radius of generalization
• Hard generalization with separated decision regions.
• Soft generalization together with interpolation.

Generalization by fuzzy membership
• The output neuron then computes the dot product between the output weight vector and the membership grade vector.

Other consideration
• Other membership functions: e.g. the quadratic function known as S.

Other consideration
• Other distance metrics: e.g. city block distance.
• Result: the performance of the network is not seriously affected by the choice of distance metric and membership function.

Hidden neuron
• As in CC4: the number of hidden neurons is the number of training samples that the network is required to learn.
• Note: training samples are exemplars.

Example
[Figure: three training samples with d23 = 11.27, r1 = 2.5, r2 = 2.5, r3 = 5]
• Output for the given input: Y = 0.372 × 7 + 0.256 × 4 + 0.372 × 9 = 6.976

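The arithmetic in this example is a dot product. The sketch below reads 0.372, 0.256 and 0.372 as normalized membership grades and 7, 4 and 9 as the output weights of the three hidden neurons; that reading is an interpretation of the slide, not something it states explicitly.

```python
# Sketch of the FC output computation for the example above.
# Interpretation (assumed): the first factors are normalized membership
# grades and the second factors are the stored output weights.


def fc_output(memberships, output_weights):
    """Dot product of the membership grade vector and the output weights."""
    return sum(m * w for m, w in zip(memberships, output_weights))


memberships = [0.372, 0.256, 0.372]
y = fc_output(memberships, [7, 4, 9])
```

Because the grades sum to 1, the output is an interpolation between the stored outputs and stays within their range.
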
Experimental result
Time-series prediction applications:
• Electric load demand forecast
• Traffic volume forecast
• Prediction of stock prices, currency, and interest rates
The performance of the FC network is described using two benchmarks with different characteristics:
• Henon map time series
• Mackey–Glass time series

Henon map
Generated points    Training samples    Testing samples    Window size
544                 500 out of 504      50                 4

• One-dimensional Henon map: [equation missing]

Input X                     Output Y
X(1), X(2), X(3), X(4)      X(5)
X(2), X(3), X(4), X(5)      X(6)

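The windowing scheme in the table can be sketched as follows. The map equation is missing from the slide, so the generator uses the commonly cited one-dimensional form x(t+1) = 1 - a·x(t)² + b·x(t-1) with a = 1.4 and b = 0.3; those parameters and the zero initial values are assumptions, as are the function names.

```python
# Sketch of building (window -> next value) samples from a Henon series.
# Assumptions: x(t+1) = 1 - a*x(t)^2 + b*x(t-1) with a = 1.4, b = 0.3 and
# initial values 0, 0; none of these are given in the slides.


def henon_series(n, a=1.4, b=0.3, x0=0.0, x1=0.0):
    """Generate n points (n >= 2) of the one-dimensional Henon map."""
    xs = [x0, x1]
    while len(xs) < n:
        xs.append(1.0 - a * xs[-1] ** 2 + b * xs[-2])
    return xs[:n]


def sliding_windows(series, window=4):
    """Pair each run of `window` values with the value that follows it."""
    return [(series[i:i + window], series[i + window])
            for i in range(len(series) - window)]


series = henon_series(544)
train = sliding_windows(series[:504], window=4)  # X(1..4) -> X(5), ...
```

With a window of 4, the first 504 points yield exactly 500 input/output pairs, which matches the "500 out of 504" entry in the table.
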
Henon map time-series prediction using FC (4-500-1), k = 5. [figure]

Result
• Henon map time-series prediction using the FC network.
• SSE: sum-of-squared error.

Mackey-Glass time series
• A nonlinear time-delay differential equation originally developed for modeling white blood cell production: [equation missing]
• A, B, C: constants.
• D: the time delay parameter.
• Popular case: A = 0.2, B = 0.1, C = 10, D = 30.

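For orientation, the Mackey-Glass equation is usually written in the form below. Matching the slide's constants to the terms is an assumption: it is consistent with D being the delay and with the popular values 0.2, 0.1 and 10, but the slide's own equation is not available to confirm it.

```latex
% Usual form of the Mackey-Glass equation; the assignment of A, B, C, D
% to the terms is assumed, not taken from the slide.
\[
\frac{dx(t)}{dt} = \frac{A\,x(t - D)}{1 + x(t - D)^{C}} - B\,x(t)
\]
```
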
Henon map time-series prediction using FC (4-500-1), k = 5. [figure]

PERFORMANCE SCALABILITY
• The FC network and the RBF network are optimized for a sample size of 500 and a window size of 4.
• Parameters such as the spread constant for RBF are set to their best values.
• Then the window size and the sample size are allowed to change without reoptimization.

Result
• The performance of the FC network remains good and reasonably consistent throughout all window and sample sizes.
• The RBF network is adversely affected by changes in the window size, the sample size, or both.
Conclusion
• The performance of the RBF network can become erratic for certain combinations of these parameters.
• FC is generally applicable to other window sizes and sample sizes.

Pattern recognition
• Pattern in a 32-by-32 area.
• Input: row and column coordinates of the training samples, in [1, 32].
• Two output neurons, one for each class.
• White region: (1, 0); black region: (0, 1).

Result
Two-class spiral pattern classification
[Table columns: input neurons, training samples, output neurons]

Result
Four-class spiral pattern classification
[Table columns: input neurons, training samples, output neurons]

References
[1] S. C. Kak, "On training feedforward neural networks", Pramana - J. of Physics, vol. 40, pp. 35-42, 1993.
[2] G. Mirchandani and W. Cao, "On hidden nodes for neural nets", IEEE Trans. on Circuits and Systems, vol. 36, pp. 661-664, 1989.
[3] S. Kak, "On generalization by neural networks", Information Sciences, vol. 111, pp. 293-302, 1998.
[4] S. Kak, "Better web searches and prediction with instantaneously trained neural networks", IEEE Intelligent Systems, vol. 14(6), pp. 78-81, 1999.
[5] Chapter 7, Results and Discussion.
[6] Bo Shu and Subhash Kak, "A neural network-based intelligent metasearch engine", Information Sciences, vol. 120, pp. 1-11, 1999.
[7] S. Kak, "A class of instantaneously trained neural networks", Information Sciences, vol. 148, pp. 97-102, 2002.
[8] K. W. Tang and S. Kak, "Fast Classification Networks for Signal Processing", Circuits Systems Signal Processing, vol. 21, pp. 207-224, 2002.
[9] S. Kak, "Three languages of the brain: Quantum, reorganizational, and associative", in Learning as Self-Organization, K. Pribram and J. King, eds., Lawrence Erlbaum, Mahwah, N.J., 1996, pp. 185-219.