CS 598 CXZ (CS 510) Advanced Topics in Information Retrieval (Fall 2016)
Instructor: ChengXiang (“Cheng”) Zhai
Teaching Assistants: Rongda Zhu (full time), Chase Geigle (part time)
Department of Computer Science
University of Illinois, Urbana-Champaign
Course Goal
• Advanced (graduate-level) introduction to the field of information retrieval (IR), broadly including text mining
• Goal
  – Provide an in-depth introduction to advanced IR algorithms based on statistical language models
  – Provide an opportunity for students to explore frontier topics via course projects (customized toward the interests of students)
  – Give students enough training to do research in IR or to apply advanced IR techniques in applications
  – Tangible outcome: research paper, open source code, and application system
Prerequisites
• Basic concepts in CS 410 Text Info Systems
• Programming skills: CS 225 or equivalent level
• A good knowledge of basic probability and statistics
• Knowledge of one or more of the following areas is a plus, but not required: Information Retrieval, Machine Learning, Data Mining, Natural Language Processing
• Contact the instructor if you aren’t sure
Format
• Mixture of
  – Lectures by instructor (about 50%)
  – Presentations by students (project-based) (about 50%)
• Assignments: ensure solid mastery of concepts and implementation skills
  – Written assignments + programming
• Midterm (75 min, in class): mostly to verify your mastery of main concepts and algorithms
• Course project: multiple options
  – In-depth study of a topic → publication/submission
  – Implementation of a major algorithm → open source
  – Development of a novel application → useful application
Office Hours
• Instructor:
  – Tue. 1:30 pm-2:30 pm; Fri. 11:00 am-12 pm
  – 2116 SC
• TA (0207 SC?)
  – Rongda Zhu: Mon & Thur, 10-11 am
  – Chase Geigle: TBD to accommodate online students
• Email us at any time
Grading
• Assignments: 30%
• Midterm: 30%
• Project: 40%
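The grading weights above amount to a simple weighted sum. The following is a minimal sketch of that arithmetic, assuming each component is scored as a percentage from 0 to 100; the names `WEIGHTS` and `overall_score` and the sample scores are hypothetical and not part of the course materials.

```python
# Weights come from the Grading slide; all names and sample scores
# below are illustrative assumptions.
WEIGHTS = {"assignments": 0.30, "midterm": 0.30, "project": 0.40}


def overall_score(scores):
    """Combine per-component percentages (0-100) into a weighted total."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student: 90 on assignments, 80 on the midterm, 85 on the project.
example = {"assignments": 90.0, "midterm": 80.0, "project": 85.0}
print(round(overall_score(example), 2))
```

Because the project carries the largest weight (40%), a given change in the project score moves the total more than the same change in either of the other two components.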
Schedule
• Part I: Background (lectures by instructors): overview of IR research, relevant math
• Part II: IR: frameworks and models (lectures by instructors)
  – Covering the major algorithms for optimizing ranking
• Part III: Text analysis: topic models & neural language models (lectures by instructors)
  – Covering topic models and word embedding for text analysis
• Part IV: Frontier topics and applications (project-based workshop; presentations by students + discussions)
  – Covering project-related frontier topics, system implementation, and applications
Your Work Load
[Timeline figure: Aug 24 through Dec 6 (last day of instruction), with Thanksgiving marked in Nov. Rows: Readings; Assignments; Midterm; Paper/project presentation & discussion; Project]
Reference Book
ChengXiang Zhai, Chase Geigle, Statistical Language Models for Text Data Retrieval and Analysis, forthcoming. Draft will be available online.
Other readings: mostly research papers, survey articles, and book chapters
– Synthesis Lectures Digital Library: http://www.morganclaypool.com/
– Foundations & Trends in IR: http://www.nowpublishers.com/ir/
– Recent papers from SIGIR, CIKM, WWW, WSDM, KDD, ACL, ICML, …
Questions?
Course website: http://times.cs.uiuc.edu/course/598f16
Piazza: https://piazza.com/class/irvce5pdgfz71d