Learning decomposition WARNING
Goals
• Understand what learning decomposition is
  – And the basic intuition
• See how it was applied to a variety of problems
• Think about how to apply it to your data

Introduction to Reading Tutor
• More free-form than most of the cognitive tutors
• Random interventions
• Kids or tutor can initiate help
• Turn taking
• Never quite sure what student is trying to do
Project LISTEN’s Reading Tutor
What is a practice opportunity? (And are they all equally valuable?)
Before story, tutor teaches ‘elephant’
Student sees word ‘elephant’ in sentence
Student clicks for help on it
Student reads it
‘Elephant’ occurs twice in the next sentence
• How many practice opportunities?
• Did instruction have any benefit?
• Did seeing word immediately afterwards help?

Procedure
• Determine (hopefully motivated) learning decompositions
• Find data that reflect learning
• Solve as a non-linear regression model
  – Fit model to each student
• Interpret model coefficient
Question: Does learner control result in more learning?
• In the Reading Tutor, students pick the story half the time; the tutor picks the other half
• Tutor selects stories much faster than students do
• Suspect motivational benefit from learner control (willing to tolerate system over a school year)
• Is there a cognitive benefit?
• Compare learning of words that occur in student- vs. tutor-selected stories

Find data that reflect learning
• Students perform many actions
• Only want those that indicate “real” learning to count
• Assumptions
  – First opportunity each day is purest marker
    • Albert, Ken, and Joe have all observed difficulties with closely spaced trials
  – Don’t count stories student has already read
• Need outcome measure
  – Fuse accuracy, speed, and help performance
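A minimal sketch of the two filtering assumptions above, to make them concrete. The record layout (`student`, `word`, `day`, `order`, `story_read_before`) is hypothetical and not taken from the Reading Tutor logs. The filter only decides which encounters serve as outcome measures; it is assumed here that every encounter still counts toward the prior-opportunity tallies.

```python
def first_daily_encounters(events):
    """Keep each student's first encounter of a word per day,
    skipping encounters in stories the student has already read.

    `events` is a list of dicts with hypothetical keys:
    student, word, day, order (position within the day), story_read_before.
    """
    seen = set()
    kept = []
    for ev in sorted(events, key=lambda e: (e["student"], e["day"], e["order"])):
        if ev["story_read_before"]:
            # Assumption: a reread story does not use up the "first of day" slot.
            continue
        key = (ev["student"], ev["word"], ev["day"])
        if key in seen:
            continue  # a later, closely spaced encounter on the same day
        seen.add(key)
        kept.append(ev)
    return kept


# Synthetic example records (not real data).
events = [
    {"student": "s1", "word": "elephant", "day": 1, "order": 1, "story_read_before": False},
    {"student": "s1", "word": "elephant", "day": 1, "order": 2, "story_read_before": False},
    {"student": "s1", "word": "elephant", "day": 2, "order": 1, "story_read_before": False},
    {"student": "s1", "word": "cat", "day": 2, "order": 2, "story_read_before": True},
]
kept = first_daily_encounters(events)
kept_keys = [(e["word"], e["day"]) for e in kept]
```

Here `kept_keys` contains one entry per word per day, and the encounter from the previously read story is dropped.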
Approach
[Table of example observations. Columns: Day; Asked for help?; Reading time (sec); Prior opportunities (student selected, tutor selected); Outcome.]
Procedure
• Determine (hopefully motivated) learning decompositions
• Find data that reflect learning
• Solve as a non-linear regression model
  – Fit model to each student
• Interpret model coefficient (B)
Learning curves
Performance = $A e^{-bt}$
Input: number of prior trials ($t$)
Output: expected performance
[Figure: learning curve, with an axis labeled “Better” and “Worse”]

What if all trials aren’t equal?
• Normal model: Performance = $A e^{-bt}$
• Think about student- vs. tutor-chosen story
  – $t_1$ = trials where student chose story
  – $t_2$ = trials where tutor chose story
• Learning decomposition model: Performance = $A e^{-b(t_1 + B t_2)}$
  – $B$ determines relative efficacy of trials of type $t_1$ and $t_2$
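The two models can be written out directly. This is a small illustration only; the parameter values below are hypothetical and chosen to show that a tutor-chosen trial with B = 0.8 counts as 0.8 of a student-chosen trial.

```python
import math


def learning_curve(t, A, b):
    """Normal model: performance after t prior trials."""
    return A * math.exp(-b * t)


def decomposed_curve(t1, t2, A, b, B):
    """Learning decomposition model: trials of type t2 are weighted by B."""
    return A * math.exp(-b * (t1 + B * t2))


# Hypothetical parameter values, for illustration only.
A, b, B = 3.0, 0.2, 0.8

# 2 student-chosen and 3 tutor-chosen trials act like 2 + 0.8 * 3 = 4.4 trials.
decomposed = decomposed_curve(2, 3, A, b, B)
equivalent = learning_curve(2 + B * 3, A, b)

# With B = 1 the decomposition reduces to the normal model on t1 + t2 trials.
reduced = decomposed_curve(2, 3, A, b, 1.0)
plain = learning_curve(5, A, b)
```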
Use regression to find relative weight of tutor-selected prior opportunities
[Table of example observations. Columns: Day; Asked for help?; Reading time (sec); Prior opportunities (student selected, tutor selected); Outcome.]
Fit model to each student’s data

Student           B
Chris Smith       0.3
Pat Johnson       1.2
Sam Jackson       0.5
Jessie Stevens    0.9
Reagan Ronald     0.7
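A rough sketch of fitting one B per student. The slides call for non-linear regression; this sketch substitutes a simpler stand-in. It takes logs, so that log(performance) = log A − b·t1 − (b·B)·t2, and solves that as an ordinary two-predictor least-squares problem, with B recovered as the ratio of the two slopes. That assumes strictly positive outcomes and is not the procedure used in the original analysis. Function names and data are hypothetical and synthetic.

```python
import math


def fit_learning_decomposition(t1, t2, outcome):
    """Estimate (A, b, B) for outcome = A * exp(-b * (t1 + B * t2)).

    Stand-in for non-linear regression: least squares on log(outcome).
    """
    n = len(outcome)
    y = [math.log(v) for v in outcome]  # requires outcome > 0
    m1, m2, my = sum(t1) / n, sum(t2) / n, sum(y) / n
    d1 = [v - m1 for v in t1]
    d2 = [v - m2 for v in t2]
    dy = [v - my for v in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(a * a for a in d2)
    s12 = sum(a * c for a, c in zip(d1, d2))
    s1y = sum(a * c for a, c in zip(d1, dy))
    s2y = sum(a * c for a, c in zip(d2, dy))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("t1 and t2 are collinear; B is not identifiable")
    c1 = (s1y * s22 - s2y * s12) / det  # slope on t1, equals -b
    c2 = (s2y * s11 - s1y * s12) / det  # slope on t2, equals -b * B
    A = math.exp(my - c1 * m1 - c2 * m2)
    return A, -c1, c2 / c1


def synthetic_outcomes(t1, t2, A, b, B):
    """Noise-free synthetic data, used only to check the fit."""
    return [A * math.exp(-b * (x1 + B * x2)) for x1, x2 in zip(t1, t2)]


t1 = [0, 1, 1, 2, 2, 3]  # prior student-selected opportunities
t2 = [0, 0, 1, 1, 2, 2]  # prior tutor-selected opportunities
students = {
    "student_a": synthetic_outcomes(t1, t2, A=3.0, b=0.2, B=0.5),
    "student_b": synthetic_outcomes(t1, t2, A=2.5, b=0.3, B=1.2),
}

# One fit per student, yielding one B estimate per student.
fitted_B = {name: fit_learning_decomposition(t1, t2, ys)[2]
            for name, ys in students.items()}
```

With real, noisy data a proper non-linear least-squares routine would likely be preferable, since fitting in log space changes how errors are weighted.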
Interpret B parameter
• B is a scaling parameter
  – B > 1: students benefit from tutor control
  – B ≈ 1: no benefit either way
  – B < 1: student control is better
• B ≈ 0.8 for tutor-chosen stories (median)
  – Students learn more from student-chosen stories (not my H0)
• What could be other causes of this result?

Which students benefit?
• Top-down approach:
  – Think of plausible subgroups
  – See how/if B varies among them
• E.g. 1st graders had 0.98, 2nd graders 0.89, and 3rd graders 0.49
  – Suggests older kids benefit more from learner control (getting pickier?)
• Many possibilities, want to avoid fishing expedition
Which students benefit? Bottom-up approach
• Use regression results as training labels for classifier
• Predictors:
  – Gender
  – Grade
  – Test score (grade normed)
  – Disability status
• Boys benefit from learner control

Student           B      Benefits from tutor control?
Chris Smith       0.3    No
Pat Johnson       1.2    Yes
Sam Jackson       0.5    No
Jessie Stevens    0.9    ?
Reagan Ronald     0.7    No
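One way to turn fitted B values into training labels like those in the table above. The slides do not state how the “?” case was decided; the width of the undecided band around B = 1 below is a hypothetical choice that happens to reproduce the example labels.

```python
def benefit_label(B, band=0.15):
    """Label a student by fitted B. `band` is a hypothetical tolerance
    around B = 1 inside which the label is left undecided."""
    if B > 1 + band:
        return "Yes"  # benefits from tutor control
    if B < 1 - band:
        return "No"   # student control is better
    return "?"        # too close to 1 to call


# B values from the example table.
example_B = {
    "Chris Smith": 0.3,
    "Pat Johnson": 1.2,
    "Sam Jackson": 0.5,
    "Jessie Stevens": 0.9,
    "Reagan Ronald": 0.7,
}
labels = {name: benefit_label(B) for name, B in example_B.items()}
```

The resulting labels, together with predictors such as gender and grade, would then be handed to whatever classifier is used; students labeled “?” could be left out of training.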
Other learning decompositions: practice effects
• Open debate whether more learning comes from rereading stories or reading new stories
• Generally believed spaced practice is better for long-term retention (but not short-term)
• Results
  – Reading new material is better than rereading old stories (B = 0.5)
  – Later practice opportunities on the same day are ineffective (B = 0.2)

Other learning decompositions: impact of instruction
• Reading Tutor has a bunch of random bits of instruction
• Do they do anything?
  – Solution: model instruction as an encounter and give it a weight
• Impact of instruction (in progress)
  – Spelling intervention worth 0.75 exposures
  – Word ID intervention worth 0.36 exposures
  – Neither is particularly effective
  – (but, first analytic approach to find any effect)
Using learning decomposition to model transfer (Xiaonan Zhang)
• How do students represent words?
  – Naïve model: words are independent
  – What about “cat” vs. “cats”?
• Alternate models:
  – Word roots (cats, cat → CAT)
  – Rimes (bat, cat → AT)
• T1 = # prior times student has read the word
• T2 = # prior times student has read the root
• T3 = # prior times student has read the rime
• Substantial transfer at level of word root
  – 55% as good as seeing the word itself
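By analogy with the two-trial-type model, the three counts can enter a single exponent. The form below is a sketch of that extension, not a formula taken from the slides. The weight names $B_{\text{root}}$ and $B_{\text{rime}}$ are hypothetical labels, and it is assumed that $T_2$ and $T_3$ exclude encounters already counted in $T_1$.

```latex
\text{Performance} = A \, e^{-b \left( T_1 + B_{\text{root}} T_2 + B_{\text{rime}} T_3 \right)}
```

Under this reading, the reported result corresponds to $B_{\text{root}} \approx 0.55$: one prior encounter with a same-root word counts for about 0.55 of an encounter with the word itself.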
Hopefully
• Understand approach
  – Think of two types of learning that may have unequal impact
  – Divide up trials
  – Perform curve fitting
• See that it applies to variety of problems
• But…

Concerns
• We say things like “rereading is not as effective as reading different stories”
  – Is it safe to make causal inference from observational data?
• Wide reading vs. rereading: troublesome
  – What if lower proficiency is true cause?
• Massed vs. distributed practice: ok (?)
• Student vs. tutor control: ok
• Interventions: ok
• What about student-initiated help?

Interesting view (Jack Mostow)
• Each student has a B parameter
• E.g. Chris Smith has B = 0.3 for rereading
  – Chris Smith learns 30% as much from rereading as wide reading
  – Impossible for traits of Chris Smith to be a confound (proficiency, disability, etc.)
  – But, states could still be a problem
    • E.g. Chris only rereads after sleeping poorly
Compare LFA and Learning Decomposition
• Similar:
  – Use learning curves and performance data
  – Insight: a model that better predicts student performance is a better model of student’s mental processes (modulo complexity)
• Different:
  – Bottom-up vs. top-down
  – Each manipulates different aspect of representation

Bottom-up vs. Top-down
• Learning decomposition
  – Start with theory-driven idea
  – Estimate effect (if any)
  – No search
• LFA
  – Start with variety of factors
  – Perform search
  – Might not correspond to higher level construct
    • Not necessarily a bad thing
Consider transfer at level of word roots
• Learning decomp:
  – Student exposure to words of same root is 55% as good as seeing the word
    • i.e. cats, cat = cat, cat
    • i.e. accepts, accept = accept, accept
• LFA
  – Cats and cat are same skill (perfect transfer)
    • i.e. cats, cat > cat, cat
  – Accepts and accept are different skills
    • i.e. accepts, accept < accept, accept
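The contrast can be made concrete by asking how much prior-practice credit a student gets toward “cat” after reading “cats” once. In this sketch learning decomposition applies one estimated weight to every same-root exposure, while an LFA-style skill model effectively assigns either full credit (same skill) or none (different skills). The function and the `root_of` mapping are hypothetical illustrations.

```python
def prior_exposure_credit(history, target, root_of, same_root_weight):
    """Total prior-practice credit toward `target` from words in `history`."""
    credit = 0.0
    for word in history:
        if word == target:
            credit += 1.0               # the word itself always counts fully
        elif root_of[word] == root_of[target]:
            credit += same_root_weight  # a different word sharing the root
    return credit


# Hypothetical root mapping for the example words.
root_of = {"cat": "CAT", "cats": "CAT", "accept": "ACCEPT", "accepts": "ACCEPT"}

# Learning decomposition: a single weight (0.55) for all same-root exposures.
ld_cat = prior_exposure_credit(["cats"], "cat", root_of, 0.55)
ld_accept = prior_exposure_credit(["accepts"], "accept", root_of, 0.55)

# LFA-style: same skill means full credit, different skills means none.
lfa_same_skill = prior_exposure_credit(["cats"], "cat", root_of, 1.0)
lfa_diff_skill = prior_exposure_credit(["accepts"], "accept", root_of, 0.0)
```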
Student learning history

Skill      Prior practice opportunities
Skill 1    0
Skill 1    1
Skill 1    2
Skill 2    0
Skill 3    0
Skill 1    3
Skill 2    1
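The prior-practice column is just a running count per skill. A short sketch of how such a column could be derived from an ordered skill sequence (the function name is illustrative):

```python
from collections import defaultdict


def prior_opportunities(skills):
    """For each step, count how often that step's skill was practiced before."""
    counts = defaultdict(int)
    priors = []
    for skill in skills:
        priors.append(counts[skill])
        counts[skill] += 1
    return priors


history = ["Skill 1", "Skill 1", "Skill 1", "Skill 2", "Skill 3", "Skill 1", "Skill 2"]
priors = prior_opportunities(history)
```

For this sequence `priors` matches the second column of the table. LFA questions the skill labels fed into this count; learning decomposition questions whether each prior opportunity should add exactly 1.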
Learning factors analysis

Skill      Prior practice opportunities
Skill 1    0
Skill 1    1
Skill 1    2
Skill 2    0
Skill 3    0
Skill 1    3
Skill 2    1

Did student utilize skill 1 here? Is it better to think of it as skill 1’?

Learning decomposition

Skill      Prior practice opportunities
Skill 1    0
Skill 1    1
Skill 1    2
Skill 2    0
Skill 3    0
Skill 1    3
Skill 2    1

Did the student really have 3 prior practice opportunities? 1 + 1 + 1 = 3, but is there a better way of counting?
Wrapup
• Why model individual points
• Scope of learning decomposition
• How learning decomp differs from LFA