BUILDING A PREDICTIVE MODEL: A BEHIND-THE-SCENES LOOK
Mike Sharkey, Director of Academic Analytics, The Apollo Group
January 9, 2012
THE 50,000 FT. VIEW
We have lots of data; we need to set a good foundation…
…so we can extract information that will help our students succeed
OUR DATA FOUNDATION
INTEGRATED DATA WAREHOUSE
[Diagram: applicant SIS, LMS, CMS, and application databases feed the Integrated Data Repository, which serves reporting tools, analytics tools, and business intelligence]
HOW IS IT WORKING?
Advantages:
§ Continuous flow of integrated data
§ Can drill down to the transaction level
Disadvantages:
§ New data flows require in-demand resources
§ Need skilled staff to understand the data model
BUILDING A PREDICTIVE MODEL
PREDICTING SUCCESS… …BUT WHAT IS SUCCESS?
§ Course completion
§ Student passes class
§ Program persistence
§ Student drops out
§ Learning? Did the students learn what they were supposed to learn?
THE PLAN
§ Use available data to build a model (logistic regression): demographics, schedule, course history, assignments
§ Develop a model to predict course pass/fail, e.g. on a scale of 1-10: 10 will likely pass the course, 1 will most likely fail the course
§ Feed the score to academic counselors who can intervene (phone at-risk students)
THE MODEL
[Chart: predictive coefficient strength from Week 0 through Week 5]
§ Built different models: Associates, Bachelors, Masters; predict at Week 0, Week 1, … to Week (last)
§ Strongest predictive coefficients:
§ Course assignment scores (stronger as course goes on)
§ Financial status (mostly at Week 0)
§ Did the student fail courses in the past
§ Credits earned in the program (tenure)
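To make the approach concrete, here is a minimal sketch of training one pass/fail logistic regression per course week, as the slides describe. It assumes scikit-learn and hypothetical column names (assign_score_wk1, passed, etc.); it is an illustration, not Apollo's actual pipeline.

```python
# Illustrative sketch: one logistic-regression model per course week.
# All column names are hypothetical placeholders.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

def train_weekly_models(df: pd.DataFrame, last_week: int):
    """Train a separate pass/fail model for each week,
    using only the features available by that week."""
    base = ["credits_earned", "past_failures", "financial_status"]
    models = {}
    for week in range(last_week + 1):
        feats = base + [f"assign_score_wk{w}" for w in range(1, week + 1)]
        X, y = df[feats].values, df["passed"].values
        models[week] = (feats, LogisticRegression(max_iter=1000).fit(X, y))
    return models

def risk_score(models, week: int, student: pd.DataFrame) -> int:
    """Map P(pass) onto the 1-10 scale counselors receive (10 = likely pass)."""
    feats, model = models[week]
    p_pass = model.predict_proba(student[feats].values)[:, 1][0]
    return int(np.ceil(p_pass * 10)) or 1
```

Training one model per week, rather than one model with a "week" feature, matches the slide's observation that coefficient strengths shift as the course progresses.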
WHERE WE ARE TODAY
§ Validation: the statistics are sound, but we need to field test the intervention plan to validate the model scores
§ What we learned: the strongest parameters are the most obvious (assignments); weak parameters include gender, age, and weekly attendance
§ Add future parameters as available: class activity, participation, faculty alerts, inactive time between courses, interaction with faculty, orientation participation, late assignments
THANK YOU!
Mike Sharkey | mike.sharkey@phoenix.edu | 602-557-3532
5 CHALLENGES IN BUILDING & DEPLOYING LEARNING ANALYTICS SOLUTIONS
Christopher Brooks (cab938@mail.usask.ca)
MY BIASES
§ A domain of higher education
§ Scalable and broad solutions
§ The grey areas between research and production
QUESTION: YOUR BIASES. WHAT DO YOU THINK THE PRINCIPAL GOAL OF LEARNING ANALYTICS SHOULD BE?
§ Enabling human intervention
§ Computer assisted instruction (dynamic content recommendation, tutoring, quizzing)
§ Conducting educational research
§ Administrative intelligence, transparency, competitiveness
§ Other (write in chat)
CHALLENGE 1: WHAT ARE YOU BUILDING?
Exploring data:
§ Intuition and domain expertise are useful
§ Multiple perspectives from people familiar with the data
§ More data types (diversity) is better; smaller datasets (fewer instances) are ok
§ Imprecision in data is ok
§ Visualization techniques
Answering a question:
§ Data should be cleaned and rigorous, with error recognized explicitly
§ The quantity of data in the datasets (instances) strengthens the result
§ Decision makers must guide the process (are the questions worth answering?)
§ Statistical techniques
CASE 1: HOW HEALTHY IS YOUR CLASSROOM COMMUNITY? (SNA)
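For flavour, here is a minimal sketch of what an SNA "community health" check can look like, using networkx with discussion replies as edges. The reply data and the choice of metrics are assumptions for illustration, not the case study's actual analysis.

```python
# Minimal SNA sketch: who replied to whom in a course discussion forum.
import networkx as nx

replies = [("alice", "bob"), ("bob", "alice"), ("carol", "alice"),
           ("dave", "alice")]  # hypothetical (replier, replied-to) pairs
G = nx.DiGraph(replies)

print(nx.density(G))                       # overall interconnectedness
print(nx.degree_centrality(G))             # who holds the community together
print([n for n in G if G.degree(n) <= 1])  # peripheral students to check on
```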
CASE 2: APPLYING UNSUPERVISED LEARNING TECHNIQUES (CLUSTERING)
RESULTS VALIDATED, QUANTIFIED, AND ENCOURAGED MORE INVESTIGATION
Hypotheses:
§ H1: There will be a group of minimal activity learners…
§ H2: There will be a group of high activity learners…
§ H3: There will be a group of disillusioned learners…
§ H4: There will be a group of deferred learners…
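A minimal sketch of clustering learners by activity, in the spirit of this case: k-means with k=4, one cluster per hypothesized group. The feature choice and toy data are assumptions, not the study's actual setup.

```python
# Cluster learners by activity profile and compare clusters against H1-H4.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical per-learner features: logins, pages viewed, posts written,
# and activity in the final weeks of the course.
X = np.array([
    [3, 40, 1, 0],      # minimal activity
    [55, 900, 40, 30],  # high activity
    [30, 500, 20, 0],   # active early, then drops off ("disillusioned")
    [5, 60, 2, 25],     # quiet early, active late ("deferred")
    # ... one row per learner
])

labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(
    StandardScaler().fit_transform(X)
)
print(labels)  # cluster assignment per learner
```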
CHALLENGE 2: WHAT TO COLLECT
Too much versus too little:
§ Make a choice based on end goals
§ Think in terms of events instead of the "click stream" (see the sketch below)
§ Collecting "everything" comes with upfront development costs and analysis costs; the risk is the project never gets off the ground
§ Make hypotheses explicit in your team so they can decide how best to collect that data
§ Follow agile software development techniques (iterate & get constant feedback)
§ Build institutional will with small targeted gains
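A sketch of what event-based collection might look like, in contrast to a raw click stream: each record is a named pedagogical event tied to a hypothesis. The event names and fields are illustrative assumptions.

```python
# Each record is a deliberate, named event rather than a raw click.
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json

@dataclass
class LearningEvent:
    user_id: str
    event_type: str   # e.g. "quiz_submitted", "video_paused"
    course_id: str
    occurred_at: str
    detail: dict

def log_event(user_id: str, event_type: str, course_id: str, **detail):
    evt = LearningEvent(user_id, event_type, course_id,
                        datetime.now(timezone.utc).isoformat(), detail)
    print(json.dumps(asdict(evt)))  # in practice: append to a store or queue

log_event("u123", "quiz_submitted", "CS100", score=0.8, attempt=2)
```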
CHALLENGE 3: UNDERSTAND YOUR USER (breadth of context)
§ Administrator: rates for degree completion, retention, re-enrolment, number of active students… (abbreviated statistics)
§ Instructional designer/researcher: educational research, what works and what doesn't, how tools and processes should change… (sophisticated statistics & visualizations)
§ Instructor: evaluation of students, of a cohort of students, and identifying immediate remediation… (visualization, abbreviated statistics)
§ Student: evaluation… (visualization)
WITH GREAT POWER COMES GREAT RESPONSIBILITY…
Some potential abuses of student tracking data:
§ Changing pedagogical technique to the detriment of some students
§ Denying help to those who "aren't really trying"
§ A failure of instructors to acknowledge the challenges that face students
Is it ethical to give instructors access to student analytics data? Yes / No / Sometimes (write your thoughts in the chat)
CHALLENGE 4: ACKNOWLEDGE CAVEATS
Analytics shows you only part of the picture:
§ Dead-tree learning, in-person social constructivism, shoulder surfing/account sharing
§ Anonymization tools, javascript/flash blockers
§ False positives (incorrect Amazon recommendations)
§ Misleading actions (incorrect self-assessment, or gaming the system (Baker))
Solutions (sketched below):
§ Aggregation & anonymization
§ Make error values explicit
§ Use broad categories for actionable analytics
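A minimal sketch of two of the mitigations above: anonymizing identifiers and reporting broad categories instead of exact counts. The salt handling and category cut-offs are assumptions for illustration only.

```python
import hashlib

SALT = b"rotate-and-store-securely"  # assumption: a per-deployment secret

def anonymize(student_id: str) -> str:
    """One-way pseudonym so analysts never see raw identifiers."""
    return hashlib.sha256(SALT + student_id.encode()).hexdigest()[:12]

def engagement_band(weekly_logins: float) -> str:
    """Report broad, actionable categories rather than exact counts."""
    if weekly_logins < 1:
        return "low"
    return "medium" if weekly_logins < 5 else "high"

print(anonymize("u123"), engagement_band(3.5))  # pseudonym + band
```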
DOES LEARNER MODELLING OFFER SOLUTIONS?
The learner modelling community blends with analytics:
§ Open learner modelling (students can see their completed model)
§ Scrutable learner modelling (students can see how the system's model of them is formed)
Question: I believe the student should have the right to view where analytics data about themselves has come from and who it has been made available to. Yes / No / Sometimes (and what are the implications of doing this? write in chat)
CHALLENGE 5: CROSS APPLICATION BOUNDARIES
Data from different applications (clickers, LCMS, lecture capture, SIS/CIS, publisher quizzes, etc.) doesn't play well together:
§ Requires cleaning
§ Requires normalizing on semantics (see the sketch below)
§ Requires access
§ Data warehousing activities
Is there a light on the horizon?
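A sketch of normalizing records from two applications onto a common schema, one of the warehousing chores named above. Both source record formats are invented for illustration.

```python
# Map heterogeneous source records onto one unified event schema.
def from_clicker(rec: dict) -> dict:
    return {"user": rec["StudentNumber"], "tool": "clicker",
            "action": "answered", "item": rec["QuestionID"],
            "when": rec["Timestamp"]}

def from_lms(rec: dict) -> dict:
    return {"user": rec["user_id"], "tool": "lms",
            "action": rec["verb"], "item": rec["object"],
            "when": rec["ts"]}

unified = [from_clicker({"StudentNumber": "s1", "QuestionID": "q7",
                         "Timestamp": "2012-01-09T10:00:00Z"}),
           from_lms({"user_id": "s1", "verb": "viewed",
                     "object": "week3.pdf", "ts": "2012-01-09T10:05:00Z"})]
```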
QUICK CONCLUSIONS
Thus far I've learned it's important to:
§ Know your goals
§ Know your user
§ Capture what you know you need and don't worry about the rest
§ Acknowledge limitations of your approach
§ Iterate, iterate
Christopher Brooks, Department of Computer Science, University of Saskatchewan, cab938@mail.usask.ca
LEARNING ANALYTICS FOR C21 DISPOSITIONS & SKILLS
Simon Buckingham Shum, Knowledge Media Institute, Open U. UK
simon.buckinghamshum.net @sbskmi
L.A. FRAMEWORK TO THINK WITH…
Discipline knowledge:
§ Educator owns and manages a single dataset
§ Educator owns and manages multiple datasets
§ Learners add their own datasets
§ Hybrid closed + open datasets
§ Hybrid closed + open analytics
The focus of most LA effort is beginning to move towards these more complex spaces (http://solaresearch.org/OpenLearningAnalytics.pdf)
C21 Learning Capacities: critical for learner engagement and authentic learning
LEARNING ANALYTICS FOR THIS?
"We are preparing students for jobs that do not exist yet, that will use technologies that have not been invented yet, in order to solve problems that are not even problems yet."
"Shift Happens" http://shifthappens.wikispaces.com
LEARNING ANALYTICS FOR THIS? “The test of successful education is not the amount of knowledge that pupils take away from school, but their appetite to know and their capacity to learn. ” Sir Richard Livingstone, 1941
ANALYTICS FOR… C21 SKILLS? LEARNING HOW TO LEARN? AUTHENTIC ENQUIRY?
social capital, critical questioning, argumentation, citizenship, habits of mind, resilience, collaboration, creativity, metacognition, identity, readiness, sensemaking, engagement, motivation, emotional intelligence
L.A. FRAMEWORK TO THINK WITH…
§ Discipline knowledge (single and multiple educator datasets, learner datasets, hybrid closed + open analytics): focus of most LA effort, beginning to move towards these more complex spaces
§ C21 Learning Capacities: more LA effort needed, e.g. (1) Disposition Analytics, (2) Discourse Analytics
ANALYTICS FOR LEARNING DISPOSITIONS
ELLI: EFFECTIVE LIFELONG LEARNING INVENTORY
Web questionnaire, 72 items (children and adult versions; used in schools, universities, and the workplace)
Buckingham Shum, S. and Deakin Crick, R. (2012). Learning Dispositions and Transferable Competencies: Pedagogy, Modelling, and Learning Analytics. Accepted to 2nd International Conference on Learning Analytics & Knowledge (Vancouver, 29 Apr-2 May, 2012).
VALIDATED AS LOADING ONTO 7 DIMENSIONS OF "LEARNING POWER" (negative pole ↔ dimension)
§ Being Stuck & Static ↔ Changing & Learning
§ Data Accumulation ↔ Meaning Making
§ Passivity ↔ Critical Curiosity
§ Being Rule Bound ↔ Creativity
§ Isolation & Dependence ↔ Learning Relationships
§ Being Robotic ↔ Strategic Awareness
§ Fragility & Dependence ↔ Resilience
ELLI GENERATES A 7-DIMENSIONAL SPIDER DIAGRAM OF HOW LEARNERS SEE THEMSELVES
Basis for a mentored discussion on how the learner sees him/herself, and strategies for strengthening the profile. Bristol and the Open University are now embedding ELLI in learning software.
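As a minimal sketch of the spider diagram itself, here is a matplotlib radar plot over the seven dimensions; the learner scores below are invented for illustration, and the rendering is not the ELLI platform's actual visualisation.

```python
import numpy as np
import matplotlib.pyplot as plt

dims = ["Changing & Learning", "Meaning Making", "Critical Curiosity",
        "Creativity", "Learning Relationships", "Strategic Awareness",
        "Resilience"]
scores = [0.7, 0.5, 0.8, 0.4, 0.6, 0.55, 0.65]  # hypothetical learner profile

angles = np.linspace(0, 2 * np.pi, len(dims), endpoint=False).tolist()
angles += angles[:1]           # repeat first point to close the polygon
values = scores + scores[:1]

ax = plt.subplot(polar=True)
ax.plot(angles, values)
ax.fill(angles, values, alpha=0.25)
ax.set_xticks(angles[:-1])
ax.set_xticklabels(dims, fontsize=8)
plt.show()
```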
ADDING IMAGERY TO ELLI DIMENSIONS TO CONNECT WITH LEARNER IDENTITY
ELLI GENERATES COHORT DATA FOR EACH DIMENSION
…DRILLING DOWN ON A SPECIFIC DIMENSION
ENQUIRYBLOGGER: TUNING WORDPRESS AS AN ELLI-BASED LEARNING JOURNAL
[Screenshot callouts: standard WordPress editor; categories from ELLI; a plugin visualizes blog categories, mirroring the ELLI spider]
ENQUIRYBLOGGER: COHORT DASHBOARD
LEARNINGEMERGENCE.NET: more on analytics for learning to learn and authentic enquiry
ANALYTICS FOR LEARNING CONVERSATIONS
DISCOURSE LEARNING ANALYTICS
§ Effective learning conversations display some typical characteristics, which learners can and should be helped to master
§ Learners' written, online conversations can be analysed computationally for patterns signifying weaker and stronger forms of contribution
SOCIO-CULTURAL DISCOURSE ANALYSIS (MERCER ET AL, OU)
• Disputational talk, characterised by disagreement and individualised decision making.
• Cumulative talk, in which speakers build positively but uncritically on what the others have said.
• Exploratory talk, in which partners engage critically but constructively with each other's ideas.
Mercer, N. (2004). Sociocultural discourse analysis: analysing classroom talk as a social mode of thinking. Journal of Applied Linguistics, 1(2), 137-168.
SOCIO-CULTURAL DISCOURSE ANALYSIS (MERCER ET AL, OU)
• Exploratory talk, in which partners engage critically but constructively with each other's ideas.
• Statements and suggestions are offered for joint consideration.
• These may be challenged and counter-challenged, but challenges are justified and alternative hypotheses are offered.
• Partners all actively participate, and opinions are sought and considered before decisions are jointly made.
• Compared with the other two types, in exploratory talk knowledge is made more publicly accountable and reasoning is more visible in the talk.
Mercer, N. (2004). Sociocultural discourse analysis: analysing classroom talk as a social mode of thinking. Journal of Applied Linguistics, 1(2), 137-168.
ANALYTICS FOR IDENTIFYING EXPLORATORY TALK
Elluminate sessions can be very long, lasting for hours or even covering days of a conference. It would be useful if we could identify where quality learning conversations seem to be taking place, so we can recommend those sessions and not have to sit through online chat about virtual biscuits.
Ferguson, R. and Buckingham Shum, S. (2011). Learning Analytics to Identify Exploratory Dialogue within Synchronous Text Chat. 1st International Conference on Learning Analytics & Knowledge (Banff, Canada, 27 Mar-1 Apr, 2011).
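A toy sketch of the underlying idea: flag chat turns containing cue phrases typical of exploratory dialogue. The phrase list here is a small illustrative stand-in, not the paper's actual indicator set or classification method.

```python
# Score chat turns by counting exploratory-dialogue cue phrases.
EXPLORATORY_CUES = ["because", "i think", "what if", "do you agree",
                    "for example", "but why", "my view", "alternatively"]

def exploratory_score(turn: str) -> int:
    """Count cue phrases in one chat turn (higher = more exploratory)."""
    text = turn.lower()
    return sum(cue in text for cue in EXPLORATORY_CUES)

chat = ["lol those virtual biscuits",
        "I think this works because the groups negotiate meaning, do you agree?"]
for turn in chat:
    print(exploratory_score(turn), turn)
```

Aggregating such scores over a sliding window is one plausible way to surface the stretches of a long session where exploratory talk concentrates.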
KMI'S COHERE: A WEB DELIBERATION PLATFORM ENABLING SEMANTIC SOCIAL NETWORK AND DISCOURSE NETWORK ANALYTICS
Rebecca is playing the role of broker, connecting 2 peers' contributions in meaningful ways.
De Liddo, A., Buckingham Shum, S., Quinto, I., Bachler, M. and Cannavacciuolo, L. (2011). Discourse-Centric Learning Analytics. 1st International Conference on Learning Analytics & Knowledge (Banff, 27 Mar-1 Apr, 2011).
DISCOURSE ANALYSIS
Xerox's parser can detect the presence of 'knowledge-level' moves in text:
§ BACKGROUND KNOWLEDGE: "Recent studies indicate…", "…the previously proposed…", "…is universally accepted…"
§ NOVELTY: "…new insights provide direct evidence…", "…we suggest a new…approach…", "…results define a novel role…"
§ CONTRASTING IDEAS: "…unorthodox view resolves…paradoxes", "In contrast with previous hypotheses…", "…inconsistent with past findings…"
§ SIGNIFICANCE: "…studies…have provided important advances", "Knowledge…is crucial for…understanding…", "…valuable information…from studies"
§ SUMMARIZING: "The goal of this study…", "Here, we show…", "Altogether, our results…indicate…"
§ OPEN QUESTION: "…little is known…", "…role…has been elusive…", "Current data is insufficient…"
§ GENERALIZING: "…emerging as a promising approach", "Our understanding…has grown exponentially…", "…growing recognition of the importance…"
§ SURPRISE: "We have recently observed…surprisingly", "We have identified…unusual", "The recent discovery…suggests intriguing roles"
Ágnes Sándor & OLnet Project: http://olnet.org/node/512
De Liddo, A., Sándor, Á. and Buckingham Shum, S. (In Press). Contested Collective Intelligence: Rationale, Technologies, and a Human-Machine Annotation Study. Computer Supported Cooperative Work Journal.
NEXT STEPS
§ Social Learning Analytics: develop this framework to integrate social, discourse, disposition and other process-centric analytics
§ Disposition Analytics: extend the capabilities of the ELLI 'learning power' platform using real-time analytics data from online learner activity
§ Discourse Analytics: human+machine annotation of written discourse and argument maps
IN MORE DETAIL…
Social Learning Analytics
§ Buckingham Shum, S. and Ferguson, R. (2011). Social Learning Analytics. Technical Report KMI-11-01, Knowledge Media Institute, The Open University, UK. http://kmi.open.ac.uk/publications/techreport/kmi-11-01
Discourse Analytics
§ De Liddo, A., Buckingham Shum, S., Quinto, I., Bachler, M. and Cannavacciuolo, L. (2011). Discourse-Centric Learning Analytics. 1st International Conference on Learning Analytics & Knowledge (Banff, 27 Mar-1 Apr, 2011). Eprint: http://oro.open.ac.uk/25829
§ Ferguson, R. and Buckingham Shum, S. (2011). Learning Analytics to Identify Exploratory Dialogue within Synchronous Text Chat. 1st International Conference on Learning Analytics & Knowledge (Banff, Canada, 27 Mar-1 Apr, 2011). Eprint: http://oro.open.ac.uk/28955
§ De Liddo, A., Sándor, Á. and Buckingham Shum, S. (2012, in press). Contested Collective Intelligence: Rationale, Technologies, and a Human-Machine Annotation Study. Computer Supported Cooperative Work. DOI: 10.1007/s10606-011-9155-x. http://www.springerlink.com/content/23n1408l9g06v062
Disposition Analytics
§ Ferguson, R., Buckingham Shum, S. and Deakin Crick, R. (2011). EnquiryBlogger: Using Widgets to Support Awareness and Reflection in a PLE Setting. 1st Workshop on Awareness and Reflection in Personal Learning Environments, PLE Conference 2011, 11-13 July 2011, Southampton, UK. Eprint: http://oro.open.ac.uk/30598
§ Buckingham Shum, S. and Deakin Crick, R. (2012). Learning Dispositions and Transferable Competencies: Pedagogy, Modelling, and Learning Analytics. Accepted to 2nd International Conference on Learning Analytics & Knowledge (Vancouver, 29 Apr-2 May, 2012). Working draft under revision: http://projects.kmi.open.ac.uk/hyperdiscourse/docs/SBS-RDC-review.pdf
SUMMARY
§ Discipline knowledge (focus of most LA effort): mastery of core knowledge and skills in training is vital, but no longer sufficient
§ C21 Learning Capacities (more LA effort needed): we need analytics tuned to generic capacities which equip learners for novel challenges