Mass Data Processing Technology on Large Scale Clusters

Summer 2007, Tsinghua University

All course material (slides, labs, etc.) is licensed under the Creative Commons Attribution 2.5 License. Many thanks to Aaron Kimball & Sierra Michels-Slettvet for their original version.

Staff

  • Kang Chen – Instructor: ck99@mails.tsinghua.edu.cn
  • Dahai Li – Project Lead: dahaili@google.com
  • Kai Wang – Project Lead: kaiwang@google.com
  • Yubing Yin – Teaching Assistant: burningice9@gmail.com

Thanks to

  • Kuang Chen: Without his help, this course could not have happened.
  • Christophe Bisciglia: He is the man who brought the course here.
  • Albert Wong: Helped us set up the cluster and prepare the lab material.
  • Hannah Tang: Helped us prepare various materials, including the lectures, homework, and discussions.

Goals and Expectations

  • 5 Lectures
  • Readings
  • Homework + discussion
  • 4 Labs
  • Final project proposal
  • Project reports
  • Project review @ Google

Deliverables

  • Homework
  • 3 Lab reports
  • Final project proposal
  • Final project report
  • Final project presentation

Key Information

Key: Lecture (8-203), Lab (9-225), Homework

Hours:
  • 8:50 – 12:00 AM
  • 2:00 – 5:00 PM

Grading:
  • Lab: 30%
  • HW: 20%
  • Final Project: 50%

Timetable (Monday to Friday, by week and location)

Week of 13-Aug
  • Lecture: lecture 1, lecture 2, lecture 3, lecture 4, discussion
  • Lab: lab 0, lab 1, lab 2, lab 0/1 due
  • HW: readings, hw due

Week of 20-Aug
  • Lecture: lecture 5, FP discussion
  • Lab: lab 3, FP proposal, lab 2/3 due
  • HW: readings, hw due

Week of 27-Aug
  • lecture/lab: FP begin, guest lecture

Week of 3-Sep
  • lecture/lab: guest lecture

Week of 10-Sep
  • lecture/lab: guest lecture
  • Google: FP review

* HW: homework; FP: final project

Lectures and Labs

  • Lecture 1: Introduction to Networking and Distributed Systems
  • Lecture 2: Map/Reduce Theory and Implementation
  • Lecture 3: Distributed File System and the Google File System
  • Lecture 4: Distributed Graph Algorithms and PageRank
  • Lecture 5: Clustering – an Overview and Sample MapReduce Implementation
  • Lab 0: Setup
  • Lab 1: Simple Inverted Index
  • Lab 2: PageRank
  • Lab 3: Clustering

And… Let us start.