Map / Reduce

Map / Reduce is used mainly for:
  ◦ Log processing.
  ◦ Web index building.
  ◦ Data mining and machine learning.
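To make the log-processing use case concrete, here is a minimal single-process sketch of the map, shuffle, and reduce steps in plain Python. It does not use Hadoop. The log lines are made up for illustration, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `level_mapper`, `sum_reducer`, `count_levels`) are hypothetical, not part of any Hadoop API.

```python
from collections import defaultdict

# Made-up log lines in a "date time LEVEL source: message" layout (an assumption).
LOG_LINES = [
    "09/04/07 19:00:49 INFO mapred.JobClient: Running job",
    "09/04/07 19:00:50 WARN dfs.DataNode: slow block",
    "09/04/07 19:02:41 INFO mapred.JobClient: map 20% reduce 0%",
    "09/04/07 19:04:34 ERROR mapred.TaskTracker: task failed",
]


def map_phase(records, mapper):
    """Apply the mapper to every record and collect the emitted (key, value) pairs."""
    pairs = []
    for record in records:
        pairs.extend(mapper(record))
    return pairs


def shuffle(pairs):
    """Group values by key, which a real framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Run the reducer once per key over all values collected for that key."""
    return {key: reducer(key, values) for key, values in groups.items()}


def level_mapper(line):
    """Emit (log level, 1) for each line; the level is assumed to be the third field."""
    fields = line.split()
    if len(fields) >= 3:
        yield (fields[2], 1)


def sum_reducer(key, values):
    """Add up the counts emitted for one log level."""
    return sum(values)


def count_levels(lines):
    """Count log lines per level using the three phases above."""
    return reduce_phase(shuffle(map_phase(lines, level_mapper)), sum_reducer)


if __name__ == "__main__":
    print(count_levels(LOG_LINES))
```

In a real cluster the map and reduce calls run on different machines and the shuffle moves data over the network; the sketch only shows the data flow.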

Configuration (hadoop-site.xml)

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>localhost:9000</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
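The configuration file is a flat list of name/value properties. As a rough illustration of that structure, the following Python sketch parses the three properties from this slide into a dictionary using the standard library. The helper name `read_properties` and the constant `HADOOP_SITE_XML` are hypothetical; Hadoop itself reads this file through its own Java configuration classes, not through code like this.

```python
import xml.etree.ElementTree as ET

# The three properties from the slide, embedded as a string so the example is self-contained.
HADOOP_SITE_XML = """<configuration>
  <property>
    <name>fs.default.name</name>
    <value>localhost:9000</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>"""


def read_properties(xml_text):
    """Return a {name: value} dict built from the <property> elements."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        # Skip malformed entries that lack either child element.
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


if __name__ == "__main__":
    for name, value in read_properties(HADOOP_SITE_XML).items():
        print(f"{name} = {value}")
```

A check like this can catch a property that is missing its opening or closing tag before the daemons are started, since the parser rejects malformed XML.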

Configuration

{path_prefix}/bin/hadoop namenode -format
  ◦ Formats a new DFS on the namenode
{path_prefix}/bin/start-all.sh
  ◦ Starts the Hadoop daemons
…
{path_prefix}/bin/stop-all.sh
  ◦ Stops the daemons

RandomTextWriter

Execution: {path_prefix}/bin/hadoop jar hadoop-${version}-examples.jar randomtextwriter <out-dir> [conf_file]
  ◦ The grep job is submitted to the JobTracker.
  ◦ The JobTracker contacts the NameNode to find the appropriate TaskTrackers.
  ◦ The JobTracker submits the work to the various TaskTrackers.

Running RandomTextWriter

[orestis:hadoop] bin/hadoop jar hadoop-*-examples.jar randomwriter rand_output rtw.conf
Running 10 maps.
Job started: Tue Apr 07 19:00:48 EEST 2009
09/04/07 19:00:49 INFO mapred.JobClient: Running job: job_200904071847_0001
09/04/07 19:00:50 INFO mapred.JobClient: map 0% reduce 0%
09/04/07 19:02:41 INFO mapred.JobClient: map 20% reduce 0%
09/04/07 19:04:34 INFO mapred.JobClient: map 30% reduce 0%
09/04/07 19:04:35 INFO mapred.JobClient: map 40% reduce 0%
09/04/07 19:06:21 INFO mapred.JobClient: map 50% reduce 0%
09/04/07 19:06:24 INFO mapred.JobClient: map 60% reduce 0%
09/04/07 19:08:07 INFO mapred.JobClient: map 70% reduce 0%
09/04/07 19:08:09 INFO mapred.JobClient: map 80% reduce 0%
09/04/07 19:09:56 INFO mapred.JobClient: map 90% reduce 0%
...
Job ended: Tue Apr 07 19:09:57 EEST 2009
The job took 548 seconds.

Running RandomTextWriter

[orestis:hadoop] bin/hadoop dfs -ls
Found 1 items
drwxr-xr-x   - orestis supergroup          0 2009-04-07 19:09 /user/orestis/rand_output
[orestis:hadoop] bin/hadoop dfs -ls rand_output/
Found 11 items
drwxr-xr-x   - orestis supergroup          0 2009-04-07 19:00 /user/orestis/rand_output/_logs
-rw-r--r--   1 orestis supergroup 1077289470 2009-04-07 19:00 /user/orestis/rand_output/part-00000
-rw-r--r--   1 orestis supergroup 1077275793 2009-04-07 19:00 /user/orestis/rand_output/part-00001
-rw-r--r--   1 orestis supergroup 1077283821 2009-04-07 19:02 /user/orestis/rand_output/part-00002
-rw-r--r--   1 orestis supergroup 1077298379 2009-04-07 19:02 /user/orestis/rand_output/part-00003
-rw-r--r--   1 orestis supergroup 1077292822 2009-04-07 19:04 /user/orestis/rand_output/part-00004
-rw-r--r--   1 orestis supergroup 1077286019 2009-04-07 19:04 /user/orestis/rand_output/part-00005
-rw-r--r--   1 orestis supergroup 1077287527 2009-04-07 19:06 /user/orestis/rand_output/part-00006
-rw-r--r--   1 orestis supergroup 1077287446 2009-04-07 19:06 /user/orestis/rand_output/part-00007

Running Grep

[orestis:hadoop] bin/hadoop jar hadoop-*-examples.jar grep rand_output grep_output 'here'
09/04/07 19:12:36 INFO mapred.FileInputFormat: Total input paths to process : 10
09/04/07 19:12:37 INFO mapred.JobClient: Running job: job_200904071847_0002
09/04/07 19:12:38 INFO mapred.JobClient: map 0% reduce 0%
09/04/07 19:12:52 INFO mapred.JobClient: map 1% reduce 0%
...
09/04/07 19:31:46 INFO mapred.JobClient: HDFS bytes read=107
09/04/07 19:31:46 INFO mapred.JobClient: Local bytes read=29
[orestis:hadoop] bin/hadoop dfs -ls
Found 2 items
drwxr-xr-x   - orestis supergroup          0 2009-04-07 19:31 /user/orestis/grep_output
drwxr-xr-x   - orestis supergroup          0 2009-04-07 19:09 /user/orestis/rand_output
[orestis:hadoop] bin/hadoop dfs -ls grep_output
Found 2 items
drwxr-xr-x   - orestis supergroup          0 2009-04-07 19:31 /user/orestis/grep_output/_logs
-rw-r--r--   1 orestis supergroup          7 2009-04-07 19:31 /user/orestis/grep_output/part-00000
[orestis:hadoop] bin/hadoop dfs -cat grep_output/part-00000
4	here
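The output file pairs each matched string with the number of times it occurred. As far as I understand the bundled example, it counts regex matches in a first pass and then orders the results by count; the Python sketch below imitates that on a single machine under that assumption. The function name `grep_count` and the sample lines are made up for illustration and are not taken from the Hadoop source or from the rand_output data.

```python
import re
from collections import Counter

# Made-up input lines; the real job above read the files in rand_output.
SAMPLE_LINES = [
    "come here",
    "nothing to see",
    "here and here again",
    "over here",
]


def grep_count(lines, pattern):
    """Count every regex match in the input and return (count, match) pairs,
    ordered from most to least frequent."""
    regex = re.compile(pattern)
    counts = Counter()
    for line in lines:
        # "Map" step: each match contributes one occurrence of its text.
        for match in regex.finditer(line):
            # "Reduce" step folded in: occurrences of the same text are summed.
            counts[match.group(0)] += 1
    # Second stage: order by descending count (ties keep first-seen order).
    return sorted(((n, text) for text, n in counts.items()), key=lambda pair: -pair[0])


if __name__ == "__main__":
    for count, text in grep_count(SAMPLE_LINES, "here"):
        print(f"{count}\t{text}")
```

With a literal pattern such as 'here' there is only one distinct match, so the result is a single line; a pattern with alternatives or character classes yields one line per distinct matched string.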

Bibliography

http://wiki.apache.org/hadoop/
http://en.wikipedia.org/wiki/Hadoop
http://en.wikipedia.org/wiki/MapReduce