A Parameter-Based Computational Model for Long-Term Episodic Memory
Yousef Alhwaiti, Mohammad Z. Chowdhury, Abu Kamruzzaman, and Charles C. Tappert
Seidenberg School of CSIS, Pace University, New York
Summary of the Ph.D. dissertation by Yousef Alhwaiti, advisor Dr. Tappert
Presented at the Symposium on Artificial Intelligence (CSCI-ISAI), Dec 2019
Key Ideas
- Human Brain Modeling with Deep Learning
  - Deep Learning has modeled visual and auditory systems
    - Convolutional Neural Networks winning image recognition contests
    - Recurrent Neural Networks in Siri & Alexa answering spoken questions
  - Little computing research work on human memory systems
- Long-Term Declarative Episodic Memory
  - Rosenblatt's never-implemented episodic memory model
- System Design
  - Combine CNN recognition & Rosenblatt's clock memory
  - Develop a one-shot training algorithm to train the weights from the clock system to the hidden layer of the CNN system
Major Research Contributions
- Design and analysis of a probabilistic parameter-based computational model of long-term episodic memory
- Creation of a unique one-shot training algorithm for memory recall over a human lifespan
  - With decreased memory recall for events receding into the past
- Method of estimating memory recall accuracy for various parameter settings over a human lifespan
- Experiments conducted on a variety of popular image databases show the model is not database dependent
Long-Term Episodic Memory
- Atkinson-Shiffrin model of memory
  - Sensory memory
    - Capacity: unlimited
    - Duration: less than one second
    - Acts as a filter
  - Short-term memory
    - Capacity: 5-9 items
    - Duration: less than one minute
  - Long-term memory
    - Capacity: unlimited
    - Duration: lifetime, with forgetting
Long-Term Episodic Memory
- Recalls personal experiences over a lifetime
- Includes information about recent or past events and experiences
- The recall of experiences is contingent on three steps of memory processing:
  - Encoding
  - Storage
  - Retrieval
Proposed Deep Learning Model
- Simulates the human memory system by using
  - A CNN deep learning image recognition system
  - A "clock memory" system that records the sequence of images presented to the recognition system
- A one-shot training algorithm trains the weights from the clock system to the hidden layer of the CNN system
- In memory recall mode, the clock is reset to recall the sequence of images
Typical CNN Deep Learning Recognition System
Rosenblatt’s “Clock Memory” Model
Proposed Deep Learning Model
"Clock-memory" System Details
- The clock memory has n total clock units
  - Each clock neuron can be either ON or OFF
- At any given time, there are a active clock units
  - The active units are chosen randomly
- The recognition system is sent a sequence of s images
- System parameters: n, a, s
- The clock system works as follows: when the recognition system receives a new input, it triggers a clock tick, and a subset of a clock neurons is randomly chosen to be active
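The clock tick above can be sketched in a few lines. This is an illustrative NumPy sketch, not the authors' code; the function name `clock_tick` is mine.

```python
import numpy as np

def clock_tick(n, a, rng):
    """One clock tick: return an activity vector of n units, a of them
    randomly chosen to be ON (1); all others are OFF (0)."""
    v = np.zeros(n, dtype=int)
    v[rng.choice(n, size=a, replace=False)] = 1
    return v

# Each of the s input images triggers one tick, giving s clock vectors.
rng = np.random.default_rng(0)
sequence = [clock_tick(n=10, a=2, rng=rng) for _ in range(3)]  # s = 3
assert all(v.sum() == 2 for v in sequence)  # exactly a units active per tick
```

The sequence of clock vectors is what the one-shot training procedure later associates with the hidden-layer activity of the recognition system.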
One-shot Training Procedure
- A one-shot training procedure adjusts the weights from the active clock units to reproduce the activity of the hidden layer, which, in turn, regenerates the desired output
- Why one-shot training?
  - While most deep learning networks require extensive training, humans can learn from one or a few examples
One-shot Weight Adjustment
1. Compute the input from the clock activity vector to each hidden-layer unit: the clock-unit activity vector times the current weight matrix
2. Compute the differences between the values computed in step 1 and the desired values, the current activity of the hidden-layer units
3. To distribute the differences among the weights from the active clock units, divide the differences by the number of active clock units and add or subtract the divided differences from the corresponding weights to obtain the new weight matrix
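The three steps above can be written directly as a weight update. This is a minimal sketch under my reading of the slide (weights stored as an n_clock × n_hidden matrix); the function name `one_shot_update` is mine.

```python
import numpy as np

def one_shot_update(W, clock_vec, desired_hidden):
    """One-shot adjustment of the weights from the active clock units so
    that clock_vec @ W reproduces desired_hidden.
    W: (n_clock, n_hidden) weight matrix; clock_vec: 0/1 activity vector."""
    computed = clock_vec @ W          # step 1: input to each hidden unit
    diff = desired_hidden - computed  # step 2: difference from desired activity
    a = clock_vec.sum()               # number of active clock units
    W = W.copy()
    W[clock_vec == 1] += diff / a     # step 3: spread the correction evenly
    return W

# After one update, the clock vector exactly regenerates the hidden activity.
rng = np.random.default_rng(1)
W = rng.normal(size=(8, 4))           # n = 8 clock units, 4 hidden units
clock = np.zeros(8); clock[[2, 5]] = 1  # a = 2 active units
target = rng.normal(size=4)           # desired hidden-layer activity
W2 = one_shot_update(W, clock, target)
assert np.allclose(clock @ W2, target)
```

Since the same correction diff/a is added to each of the a active rows, the new dot product is the old one plus diff, which is exactly the desired activity; weights from inactive units are untouched.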
MNIST Dataset
- A set of handwritten digits: 60,000 samples for training and 10,000 for testing
- Samples are gray-scale 28×28 pixel images
- We used the MNIST dataset for the sequence to be remembered in the experiments
Results: Memory Recall Accuracy
Recall accuracy versus a for various n, fixed s=100
Best to have a << n
Example to Explain Effect of Active Units
Parameters: s = 3, n = 3, a = 1
(Clock-system diagram)
Example to Explain Effect of Active Units (continued)
Parameters: s = 3, n = 3, a = 1
(Clock-system diagram)
Experiment Results
Recall accuracy versus s for various n, fixed a=2
Experiments on the Probability of Common Active Units
- Common active units occur when two or more clock vectors in the sequence share the same active units
- Example: suppose n=5, a=2
  - Clock vector 1 = [0, 0, 1]
  - Clock vector 2 = [1, 0, 0, 0, 1]
  - The last unit of both vectors is active, so it is shared
Experiments on the Probability of Common Active Units
Accuracy: Prob(common units) versus sequence length
Parameters: s = {100, 500, 1000}, a=10, n=2000
Experiments on the Probability of Common Active Units
Accuracy: Prob(common units) versus total units
Parameters: a=5, s=100, n = {500, 1000, 2000, 4000}
15 random seeds to smooth results
Experiments on the Probability of Common Active Units
Accuracy: Prob(common units) versus active units
Parameters: a = {1, 2, 5, 10, 20}, s=100, n=200
15 random seeds to smooth results
Experiment: Memories Fading Away
Memory recall decreases as memories recede into the past
Recall accuracy within the 1st tenth, 2nd tenth, …, 10th tenth of the sequence; n=500, a=2, s = {200, 400, 800, 1600}
Estimating human lifespan memory recall Memory recall accuracy depends on s & n: n/s=constant
Estimating Recall for a Human Lifespan
- Assume humans have a memory event every second of their lives: 3.15×10^9 sec in a generous 100-year lifespan
- Because n/s = constant for memory recall accuracy, we can predict recall accuracy for any sequence length using smaller sequences
  - n/s = 64,000/10,000 = 6.4
  - Pick a small sequence such as s = 100
  - Calculate n for the small sequence: n = 100 × 6.4 = 640
  - Now we have the inputs s=100, a=2, n=640
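The scaling arithmetic above is simple enough to check directly. A minimal sketch, assuming (as the slide does) that recall accuracy depends on n and s mainly through the ratio n/s:

```python
# Scale the lifespan-sized problem down to a tractable experiment.
LIFESPAN_SECONDS = 3.15e9   # one memory event per second over ~100 years
ratio = 64000 / 10000       # n/s observed in the experiments -> 6.4

s_small = 100                    # small, fast-to-simulate sequence length
n_small = int(s_small * ratio)   # clock units needed to keep n/s fixed

print(s_small, n_small)  # 100 640 -> run the model with s=100, a=2, n=640
```

Running the model at s=100, n=640 then stands in for the full lifespan-scale sequence, since the ratio n/s is preserved.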
Estimating Recall for a Human Lifespan
- For 24% recall accuracy, n = (3.15×10^9)/4 = 7.8×10^8
- For 42% recall accuracy, n = (3.15×10^9)/2 = 1.5×10^9
- For 55% recall accuracy, n = 3.15×10^9
- For 75% recall accuracy, n = 2×(3.15×10^9) = 6.3×10^9
- For 81% recall accuracy, n = 4×(3.15×10^9) = 1.2×10^10
- For 90% recall accuracy, n = 5×(3.15×10^9) = 1.5×10^10
- An average recall of 24% is good, and recall is higher for recent memories
- There are about 100 billion (10^11) neurons in a brain, so allocating one-hundredth of the brain's cells to memory seems reasonable
Experiments on the A_Z Handwritten Alphabet Dataset
- Contains 372,451 handwritten characters
- Gray-scale images of 28×28 pixels
- 26 categories
- Test inputs: n=200, a=2, and s = {100, 200, 400, 800}
Experiments on the Fashion-MNIST Dataset
- Has 70,000 labeled images
- Gray-scale images of 28×28 pixels
- 10 categories
- Test inputs: n=200, a=2, and s = {100, 200, 400, 800}
Experiments on the CIFAR-10 Dataset
- CIFAR-10 has 60,000 RGB images
- The image size is 32×32 pixels
- 10 categories
- Test inputs: n=200, a=2, and s = {100, 200, 400, 800}
Major Research Contributions
- Design and analysis of a probabilistic parameter-based computational model of long-term episodic memory
- Creation of a unique one-shot training algorithm for memory recall over a human lifespan
  - With decreased memory recall for events receding into the past
- Method of estimating memory recall accuracy for various parameter settings over a human lifespan
- Experiments conducted on a variety of popular image databases show the model is not database dependent
Recommendations for Future Work
- Memories of significant events in one's lifetime are the strongest memories
  - Find a computational method to identify significant events
  - Significant memories are reviewed/reinforced during dream cycles
  - Modify the current algorithm to recall significant events with strong recall accuracy, like recent memories
- The current model recalls a lifespan of memories
  - The recall mechanism recalls from the beginning
  - Modify the current mechanism to allow random-access recall of past events