Data Analytics (CS 40003) Dr. Debasis Samanta Associate Professor Department of Computer Science & Engineering
Today’s discussion… • Semester organization • Syllabus • Course objective • Course plan • Reference and study materials • Course web page • Contact details
…Course organization • Title: Data Analytics • Code: CS 40003 • Credit: 3-0-0 = 3 • Slot: F • Timing: Wednesday 10:00-10:55, Thursday 09:00-09:55, Friday 11:00-11:55 (+12:55) • Venue: V 2, Vikramshila Complex
…Course objective This course will cover fundamental algorithms and techniques used in Data Analytics. The statistical foundations will be covered first, followed by various machine learning and data mining algorithms. Technological aspects like data management, scalable computation and visualization will also be covered. In summary, this course will provide exposure to theory as well as practical systems and software used in data analytics. After completing this course, you will learn how to: • Find a meaningful pattern in data • Graphically interpret data • Implement the analytic algorithms • Handle large-scale analytics projects from various domains • Develop intelligent decision support systems
…Syllabus • Data definition – Concept of data – Data vs. Information – Data categorization • Descriptive Statistics – Measure of central tendency – Measure of location of dispersion • Basic Analysis Techniques – Statistical hypothesis generation and testing – Chi-Square test – t-Test – Analysis of variance – Correlation analysis – Maximum likelihood test
…Syllabus • Data Analysis Techniques – Regression analysis – Classification techniques – Clustering techniques – Association rule analysis • Case Studies and Projects – Understanding a few business scenarios – Feature engineering and visualization – Scalable and parallel computing with Hadoop and MapReduce – Sensitivity analysis
…Study materials 1. Probability & Statistics for Engineers & Scientists (9th Edn.), Ronald E. Walpole, Raymond H. Myers, Sharon L. Myers and Keying Ye, Prentice Hall Inc. 2. The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2nd Edn.), Trevor Hastie, Robert Tibshirani and Jerome Friedman, Springer, 2014. 3. An Introduction to Statistical Learning: with Applications in R, G. James, D. Witten, T. Hastie and R. Tibshirani, Springer, 2013. 4. Software for Data Analysis: Programming with R (Statistics and Computing), John M. Chambers, Springer, 2012. 5. Mining Massive Data Sets, A. Rajaraman and J. Ullman, Cambridge University Press, 2012. 6. Advances in Complex Data Modeling and Computational Methods in Statistics, Anna Maria Paganoni and Piercesare Secchi, Springer, 2013.
…Study materials 7. Data Mining and Analysis, Mohammed J. Zaki and Wagner Meira, Cambridge University Press, 2012. 8. Hadoop: The Definitive Guide (2nd Edn.), Tom White, O'Reilly, 2014. 9. MapReduce Design Patterns: Building Effective Algorithms and Analytics for Hadoop and Other Systems, Donald Miner and Adam Shook, O'Reilly, 2014. 10. Beginning R: The Statistical Programming Language, Mark Gardener, Wiley, 2013. Lecture slides and other materials are available at http://cse.iitkgp.ac.in/~dsamanta/
…Evaluation plan • Minimum attendance required: 75% of the total classes • Mid-Semester evaluation: 30% • End-Semester evaluation: 40% • Project-based evaluation: 30% (in four phases) Note: Minimum attendance and presence in all evaluations are mandatory (except on medical or emergency grounds). No compensatory test or extended submission will be allowed.
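The arithmetic behind the evaluation plan can be sketched as follows. This is only an illustration, not the official grading procedure: the weights and the 75% threshold come from the slide, while the function names, the 0-100 score scale, and the sample numbers are assumptions made for the example.

```python
# Weights and attendance threshold are taken from the evaluation plan slide.
# Function names and the 0-100 score scale are hypothetical.
WEIGHTS = {"mid_sem": 0.30, "end_sem": 0.40, "project": 0.30}
MIN_ATTENDANCE = 0.75


def meets_attendance(classes_attended: int, classes_held: int) -> bool:
    """Return True if the attended fraction is at least 75%."""
    if classes_held <= 0:
        raise ValueError("classes_held must be positive")
    return classes_attended / classes_held >= MIN_ATTENDANCE


def weighted_total(scores: dict) -> float:
    """Combine component scores (each assumed to be on a 0-100 scale)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


if __name__ == "__main__":
    sample = {"mid_sem": 70, "end_sem": 80, "project": 90}  # made-up scores
    print(meets_attendance(30, 40))
    print(weighted_total(sample))
```

For the made-up scores above, the total works out as 0.30 × 70 + 0.40 × 80 + 0.30 × 90 = 21 + 32 + 27 = 80. How the project's four phases are weighted is not stated on the slide, so the sketch treats the project as a single component.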
Automated Attendance Marking …
Device Requirement • Android Lollipop (5.0) or later • Unknown Sources – To install non-Google Play Store applications • Internet connectivity – Use VPN for proxy-enabled Wi-Fi connection • GPS enabled
Installation and Setup • Download the application • Install the application – Turn install from “unknown sources” on • Open the application • Click Sign Up button
Click the Student button
Enter your details… Enter details carefully. Note: Roll number is your UID. Caution: Information once entered cannot be modified. Please review before submitting. • Password should be at least 8 characters • Press the Sign Up button • Wait until the process completes
Sign In • Enter your registered email address • Enter password
Enroll Course • Enter Course details – Course ID: CS 40003 – Teacher ID: DSM – Semester: Autumn – Slot: F • Click Request Approval
Mark your attendance • Wait till announcement and note down Unique Id • Sign in • Click “Attendance” • Course ID: CS 40003 • Semester: Autumn • Slot: F • Enter Unique Code • Click “Attend Class” button
…Submissions of projects and assignments • Moodle Course Management System: https://10.5.18.110/moodle/ • Steps to enroll in the course at Moodle – Create your account (if it was not created earlier) with User Id: <Roll Number>, Password, Email account, and Enrolment Key: CS 40003 – Verify your account by checking the registered mailbox – Log in to Moodle with “User Id” and “Password” – Select the course “Data Analytics” from the list of courses at the link “My Course”
…Doubt clearance and discussions • Please use “Discussion Forum” at the link “Moodle” in the course web page at http://cse.iitkgp.ac.in/~dsamanta/courses/da/index.html
While you are in the class…
Happy Learning!