Data Mining with DataShop

Ken Koedinger
CMU Director of PSLC
Professor of Human-Computer Interaction & Psychology
Carnegie Mellon University

“Knowledge components are the germ of transfer”
Goal of the week: What does Ken mean by this?

Overview
• Motivation for data mining
  - Better understanding of students => better instructional design
• Exploratory Data Analysis
  - DataShop demo, Excel
• Learning curves & Learning Factors Analysis
• Example project from last summer

Data Mining Questions & Methods
• What is going on with student learning & performance?
  - Exploratory data analysis
    - Summary & visualization tools in DataShop
    - Tools in Excel: Auto filter, Pivot Tables, Solver
• How to reliably model student achievement?
  - Item Response Theory (IRT)
    - Basis for standardized tests, SAT, GRE, TIMSS…
    - Version of “logistic regression”
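To make the “version of logistic regression” remark concrete, here is a minimal sketch assuming the simplest one-parameter (Rasch) form of IRT, in which the log-odds of a correct answer is student ability minus item difficulty. The slide does not say which IRT variant is meant, and the function name and parameter values are illustrative.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """P(correct) under a one-parameter (Rasch) IRT model.

    Assumption: logit(p) = ability - difficulty, which is a logistic
    regression with one coefficient per student and one per item.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Illustrative values only: an average student on an easy and a hard item.
p_easy = rasch_probability(ability=0.0, difficulty=-1.0)
p_hard = rasch_probability(ability=0.0, difficulty=1.0)
print(f"easy item: {p_easy:.2f}, hard item: {p_hard:.2f}")
```

When ability equals difficulty the predicted probability is exactly 0.5, which is what places students and items on a common scale.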

Data Mining Questions & Methods 2
• What’s the nature of knowledge students are learning? How can we discover cognitive models of student learning that fit their learning curves?
  - Learning Factors Analysis (LFA)
    - Extends IRT to account for learning
    - Search algorithm: Discover cognitive model(s) that capture how student learning transfers over tasks over time
• What features of a tutor lead to the most learning?
  - Learning Decomposition
    - Extends LFA to explore different rates of learning due to different forms of instruction
• How to extract reliable inferences about causal mechanisms from correlations in data?
  - Causal modeling using Tetrad
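One way to read “extends IRT to account for learning” is the additive factors form commonly used as the statistical model in LFA: the log-odds of success is the student's ability plus, for each knowledge component (KC) on the step, an easiness term and a learning-rate term multiplied by the number of prior practice opportunities. The sketch below assumes that form; the names and numbers are illustrative and not taken from the slides.

```python
import math


def afm_probability(student_ability, kc_easiness, kc_learning_rate,
                    prior_opportunities, kcs_on_step):
    """P(correct) under an additive-factors-style model (assumed form).

    logit(p) = ability + sum over KCs on the step of
               (easiness[kc] + learning_rate[kc] * prior_opportunities[kc])
    """
    logit = student_ability
    for kc in kcs_on_step:
        practice = prior_opportunities.get(kc, 0)
        logit += kc_easiness[kc] + kc_learning_rate[kc] * practice
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical parameters for a single KC.
easiness = {"circle-area": -1.0}
rate = {"circle-area": 0.2}

for n in (0, 5, 10):
    p = afm_probability(0.0, easiness, rate, {"circle-area": n}, ["circle-area"])
    print(f"after {n} prior opportunities: P(correct) = {p:.2f}")
```

Setting every learning rate to zero collapses this to the IRT-style model with no practice effect, which is the sense in which the practice term is the extension.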

Overview
• Motivation for data mining
  - Better understanding of students => better instructional design
• Exploratory Data Analysis
  - [Next] Demo: DataShop, Excel
• Learning curves & Learning Factors Analysis
• Example project from last summer

DataShop Demo …

Before going to DataShop, let’s look at a tutor (1997 version!) that generated the example data set we’ll look at

TWO_CIRCLES_IN_SQUARE problem: Initial screen


TWO_CIRCLES_IN_SQUARE problem: An error a few steps later


TWO_CIRCLES_IN_SQUARE problem: Student follows hint & completes problem

How to get to the DataShop: Go to http://learnlab.org & click …

PSLC’s DataShop
• Researchers get data access, visualizations, statistical tools
• Learning curves track student learning over time
• Discover what concepts & skills students need help with

PSLC’s DataShop
• Learning curves reveal over- and under-practiced knowledge components
• Rectangle-area has an initial low error rate, but is practiced often

Other DataShop Features
• Error Reports
  - Identify misconceptions by looking for common student errors
  - When do students ask for hints?
  - Are there alternative correct strategies?
• Performance Profiler
• Export Data
  - Get all or part of the data in tab-delimited file
  - Use your favorite analysis tools …
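As a rough illustration of working with a tab-delimited export outside Excel, the sketch below reads tab-separated rows and computes the error rate at each practice opportunity for one knowledge component, which is the quantity a learning curve plots. The column names (`student`, `kc`, `opportunity`, `first_attempt`) and the sample rows are made up for the example; a real export will have its own column headers.

```python
import csv
import io
from collections import defaultdict

# Toy stand-in for an exported file; columns and rows are hypothetical.
SAMPLE_EXPORT = (
    "student\tkc\topportunity\tfirst_attempt\n"
    "s1\tcircle-area\t1\tincorrect\n"
    "s1\tcircle-area\t2\tcorrect\n"
    "s2\tcircle-area\t1\tincorrect\n"
    "s2\tcircle-area\t2\tincorrect\n"
    "s1\trectangle-area\t1\tcorrect\n"
)


def error_rate_by_opportunity(tsv_text, kc):
    """Return {opportunity: share of first attempts that were not correct}."""
    errors = defaultdict(int)
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        if row["kc"] != kc:
            continue
        opportunity = int(row["opportunity"])
        totals[opportunity] += 1
        if row["first_attempt"] != "correct":
            errors[opportunity] += 1
    return {opp: errors[opp] / totals[opp] for opp in sorted(totals)}


curve = error_rate_by_opportunity(SAMPLE_EXPORT, "circle-area")
print(curve)
```

To run this against a real file, pass the file's text (for example from `open(path, encoding="utf-8").read()`) and adjust the column names to match the export.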

Exported File Loaded into Excel


Overview
• Motivation for data mining
  - Better understanding of students => better instructional design
• Exploratory Data Analysis
  - DataShop demo, Excel
• [Next] Learning curves & Learning Factors Analysis
• Example project from last summer

Cognitive Model drives behavior of intelligent tutor systems …
• Cognitive Model: expert component of intelligent tutors that models how students solve problems

  Starting equation: 3(2x - 5) = 9
    If goal is solve a(bx+c) = d, then rewrite as abx + ac = d   ->  6x - 15 = 9
    If goal is solve a(bx+c) = d, then rewrite as abx + c = d    ->  6x - 5 = 9  (buggy)
    If goal is solve a(bx+c) = d, then rewrite as bx+c = d/a     ->  2x - 5 = 3

• Model Tracing: Follows student through their individual approach to a problem -> context-sensitive instruction
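The three rules on this slide can be turned into a toy model tracer: compute what each rule would produce for the current equation and check which one matches the student's step. This is only a sketch of the idea, not how any actual tutor is implemented; the rule names and the tuple encoding of an equation are assumptions made for the example.

```python
from fractions import Fraction


def trace_step(a, b, c, d, student_step):
    """Match a student's rewrite of a(bx + c) = d against known rules.

    student_step encodes "px + q = r" as the tuple (p, q, r).
    Returns (rule_name, status); correct rules are tried before the buggy one.
    """
    rules = [
        ("distribute", (a * b, a * c, d), "correct"),
        ("divide-both-sides", (b, c, Fraction(d, a)), "correct"),
        ("distribute-without-multiplying-c", (a * b, c, d), "buggy"),
    ]
    for name, predicted, status in rules:
        if predicted == student_step:
            return name, status
    return None, "unrecognized"


# The slide's example: 3(2x - 5) = 9
print(trace_step(3, 2, -5, 9, (6, -15, 9)))
print(trace_step(3, 2, -5, 9, (2, -5, 3)))
print(trace_step(3, 2, -5, 9, (6, -5, 9)))
```

Because each student step is matched to a specific rule, feedback can be tied to the rule that fired, which is what makes the instruction context-sensitive.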

Cognitive Model drives behavior of intelligent tutor systems …
• Cognitive Model: expert component of intelligent tutors that models how students solve problems

  Starting equation: 3(2x - 5) = 9
    If goal is solve a(bx+c) = d, then rewrite as abx + ac = d   ->  6x - 15 = 9
      Hint message: “Distribute a across the parentheses.”
    If goal is solve a(bx+c) = d, then rewrite as abx + c = d    ->  6x - 5 = 9
      Bug message: “You need to multiply c by a also.”
    Other step shown: 2x - 5 = 3
    Knowledge estimates shown: Known? = 85% chance; Known? = 45%

• Model Tracing: Follows student through their individual approach to a problem -> context-sensitive instruction
• Knowledge Tracing: Assesses student's knowledge growth -> individualized activity selection and pacing
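The slide does not say how the “Known?” percentages are updated. A common formulation is Bayesian Knowledge Tracing, and the sketch below assumes that formulation: after each observed attempt, the estimate that a skill is known is revised by Bayes' rule using guess and slip probabilities, and then increased by a chance of learning on that opportunity. All parameter values are illustrative.

```python
def bkt_update(p_known, correct, guess=0.2, slip=0.1, learn=0.1):
    """One Bayesian Knowledge Tracing step (assumed model, toy parameters).

    p_known: prior probability the skill is known
    correct: whether the observed attempt was correct
    guess:   P(correct | skill not known)
    slip:    P(incorrect | skill known)
    learn:   P(skill becomes known at this opportunity)
    """
    if correct:
        evidence_known = p_known * (1 - slip)
        evidence_unknown = (1 - p_known) * guess
    else:
        evidence_known = p_known * slip
        evidence_unknown = (1 - p_known) * (1 - guess)
    posterior = evidence_known / (evidence_known + evidence_unknown)
    # Allow for learning between this opportunity and the next.
    return posterior + (1 - posterior) * learn


# Starting from a 45% estimate, as in one of the slide's annotations.
after_correct = bkt_update(0.45, correct=True)
after_error = bkt_update(0.45, correct=False)
print(f"after a correct step: {after_correct:.2f}")
print(f"after an error: {after_error:.2f}")
```

A tutor can keep assigning problems that exercise a skill until this estimate passes a mastery threshold, which is one way to get individualized activity selection and pacing.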

Cognitive Modeling Challenge
• Problem: Intelligent Tutoring Systems depend on Cognitive Model, which is hard to get right
  - Hard to program, but more importantly …
  - A high quality cognitive model requires a deep understanding of student thinking
  - Cognitive models created by intuition are often wrong (e.g., Koedinger & Nathan, 2004)

Significance of improving a cognitive model
• A better cognitive model means:
  - better feedback & hints (model tracing)
  - better problem selection & pacing (knowledge tracing)
• Making cognitive models better advances basic cognitive science

How can we use student data to build better cognitive models?
• Cognitive Task Analysis methods
  - Think alouds, Difficulty Factors Assessment
    - General lecture Tuesday
  - Peer collaboration dialog analysis
    - TagHelper track
  - Newer:
    - Data mining of student interactions with on-line tutors

Back to DataShop to illustrate

Use log data to test alternative knowledge representations
• Which “knowledge component” analysis is correct is an empirical question!
• Log data from tutors provides data to compare different KC analyses
  - Find which “germ” accounts for student learning behaviors

Not a smooth learning curve -> this knowledge component model is wrong. Does not capture genuine student difficulties.

This more specific knowledge component (KC) model (2 KCs) is also wrong -- still no smooth drop in error rate.

Ah! Now we are getting a smooth learning curve. This even more specific decomposition (12 KCs) better tracks the nature of student difficulties & transfer from one problem situation to another.
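The reason the curve changes shape under a different KC model is that relabeling steps changes which earlier steps count as prior practice. The sketch below, with made-up step names and two hypothetical KC models, numbers a student's practice opportunities under each model; a learning curve plots error rate against exactly these opportunity numbers.

```python
from collections import defaultdict


def number_opportunities(step_names, kc_model):
    """Label each step, in order, with its KC and opportunity count.

    kc_model maps a step name to the KC that the model says it exercises.
    """
    seen = defaultdict(int)
    numbered = []
    for step in step_names:
        kc = kc_model[step]
        seen[kc] += 1
        numbered.append((step, kc, seen[kc]))
    return numbered


# Hypothetical sequence of steps for one student.
steps = ["circle-area", "square-area", "circle-area-embedded", "square-area"]

# A coarse model treats every step as the same skill ...
coarse = {name: "area" for name in steps}
# ... while a finer model gives each kind of step its own KC.
fine = {name: name for name in steps}

for label, model in (("coarse", coarse), ("fine", fine)):
    print(label, number_opportunities(steps, model))
```

Under the coarse model the embedded circle step is a third opportunity for "area"; under the fine model it is a first opportunity for its own KC, so a high error rate there no longer shows up as a bump late in the curve.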

Overview
• Motivation for data mining
  - Better understanding of students => better instructional design
• Exploratory Data Analysis
  - Demo: DataShop, Excel
• Learning curves & Learning Factors Analysis
• [Next] Example project from last summer

Example project from 2006
• Rafferty (Stanford) & Yudelson (U Pitt)
• Analyzed a data set from Geometry
• Applied Learning Factors Analysis (LFA)
• Driving questions:
  - Are students learning at the same rate as assumed in prior LFA models?
  - Do we need different cognitive models (KC models) to account for low-achieving vs. high-achieving students?

Rafferty & Yudelson Results 1
• Different student learning rates?
• Yes

Rafferty & Yudelson Results 2
• Is it “faster” learning or “different” learning?
  - Fit with a more compact model is better for low pre for high learn
• Students with an apparent faster learning rate are learning a more “compact”, general and transferable domain model
(Became basis of Anna Rafferty’s masters thesis)

Data Mining-DataShop Offerings Tomorrow
Lectures in 3501 Newell-Simon Hall, activities here (Wean 5202)
1. Educational data mining overview & introduction to using the DataShop
  - Follow-up activities:
    - Exercise in using DataShop for exploratory data analysis
    - Use tutor/course that generated target data set. Begin data export, data scrubbing, exploratory data analysis
2. Learning from learning curves: Item Response Theory, Learning Factors Analysis
3. Other data mining techniques: Learning decomposition, causal models with Tetrad
    - Define metrics to address driving question, begin analysis

Questions?


What’s next?
• Tomorrow:
  - Do you know which offerings you will go to tomorrow?
  - Any conflicts -- two you want to go to that are at the same time?

END
