Mining Massive Datasets: Course Overview

Wu-Jun Li
Department of Computer Science and Engineering
Shanghai Jiao Tong University
Lecture 0: Course Overview

General Information
• Instructor: Wu-Jun Li (李武军)
  • Email: liwujun@cs.sjtu.edu.cn
  • Homepage: http://www.cs.sjtu.edu.cn/~liwujun
  • Office: Rm 3-537, SEIEE Building
  • Office Hours: Tue 14:00 - 15:00
• Course web site: http://www.cs.sjtu.edu.cn/~liwujun/course/mmds.html
• Teaching Assistant: Zhi-Qin Yu (余志琴)
  • Email: xiaoyu199175@gmail.com
  • Office Hours: TBD; Rm 3-503, SEIEE Building
• Time and Venue: Mon 14:00 - 15:40; Wed 10:00 - 11:40; Fri 08:00 - 09:40; Rm 105, Dong Shang Yuan (东上院 105)

Textbook
• Anand Rajaraman and Jeffrey D. Ullman. Mining of Massive Datasets. Cambridge University Press, 2011. You can download it from the book website (http://i.stanford.edu/~ullman/mmds.html).

Reference Books
• Jiawei Han and Micheline Kamber. Data Mining: Concepts and Techniques. Morgan Kaufmann, Second Edition, 2006. (The English reprint edition can be bought through China-Pub.)
• Christopher M. Bishop. Pattern Recognition and Machine Learning. Springer, 2006.
• Chuck Lam. Hadoop in Action. Manning Publications, First Edition, 2010.
• Jingyu Zhou (周憬宇), Wu-Jun Li (李武军), and Minyi Guo (过敏意). 《飞天开放平台编程指南-阿里云计算的实践》 (Apsara Open Platform Programming Guide: Alibaba Cloud Computing in Practice). Publishing House of Electronics Industry (电子工业出版社), March 2013.

Course Topics
• Data-Intensive Scalable Computing (DISC)
  • Cloud Computing
  • MapReduce and Hadoop
• Data Mining and Machine Learning
  • Basics: supervised learning; unsupervised learning; matrix factorization
  • Large-scale (distributed) implementations with Hadoop
• Data-Intensive Applications
  • Search, link analysis, recommender systems, mining data streams, advertising on the Web

Prerequisites
• Data structures
• Design and analysis of algorithms
• Linear algebra
• Probability theory
• Programming languages: Java, C++

Grading Scheme
• Class attendance (10%)
• Homework (20%)
• Exam (40%): Final (40%)
• Project (30%)
  • 3 students / group

Late Assignments
• Assignments turned in late will be penalized 20% per late day.

Academic Honor Code
• Honesty and integrity are central to academic work.
• All your submitted assignments must be entirely your own (or your own group's).
• Any student found cheating or plagiarizing will receive a final score of zero for this course.

Questions?