Hadoop MapReduce Framework
Mr. Sriram
Email: hadoopsrirama@gmail.com
Objectives
• MapReduce Concepts
• MapReduce Job
• MapReduce Data Flow
• Analyze different use cases where MapReduce is used
• Differentiate between the traditional way and the MapReduce way
• Learn about the Hadoop 2.X MapReduce architecture and components
• Understand the execution flow of a YARN MapReduce application
• Implement basic MapReduce concepts
• Run a MapReduce program
• Understand input split concepts in MapReduce
• Understand the MapReduce job submission flow
• Implement a Combiner and a Partitioner in MapReduce
MapReduce Concepts
• Introduction to MapReduce
• Functional Programming Concepts
• Mapper
• Reducer
• Driver
Introduction to MapReduce
Hadoop MapReduce is a software framework for easily writing applications that process vast amounts of data in parallel on large clusters of commodity hardware in a reliable, fault-tolerant manner. A MapReduce job usually splits the input data set into independent chunks, which are processed by map tasks in a completely parallel manner. The framework sorts the outputs of the maps, which then become the input to the reduce tasks. Typically both the input and the output of a job are stored in a file system. The framework takes care of scheduling tasks, monitoring them, and re-executing failed tasks. The MapReduce framework and HDFS run on the same set of nodes, which lets the framework schedule tasks on the nodes where the data is already present, resulting in very high aggregate bandwidth across the cluster. The framework consists of a single master (JobTracker in Hadoop 1.x, ResourceManager in Hadoop 2.x) and one slave (TaskTracker or NodeManager, respectively) per cluster node.
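The split, map, sort, and reduce steps described above can be illustrated with a minimal single-process sketch. This is not Hadoop code: the function names (map_words, shuffle_and_sort, reduce_counts, run_word_count) are hypothetical, and each list in chunks merely stands in for one independent input chunk handled by one map task.

```python
from collections import defaultdict

def map_words(line):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle_and_sort(pairs):
    """Group values by key and return the groups in sorted key order,
    mimicking the sort the framework performs between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())

def reduce_counts(key, values):
    """Reduce step: sum all counts emitted for one word."""
    return key, sum(values)

def run_word_count(chunks):
    """Run map over every chunk, then sort and reduce the combined output."""
    mapped = []
    for chunk in chunks:  # each chunk stands in for one input split
        for line in chunk:
            mapped.extend(map_words(line))
    return [reduce_counts(key, values) for key, values in shuffle_and_sort(mapped)]

chunks = [["the quick brown fox"], ["the lazy dog", "the end"]]
print(run_word_count(chunks))
```

In a real cluster the chunks would be processed on different nodes and the grouped keys would be spread across several reduce tasks; the sketch only shows the order of the steps.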
Functional Programming Concepts
Mapper
Reducer
Driver
Where is MapReduce used?
Traditional Way
MapReduce Way
Why MapReduce?
Solving the problem with MapReduce
Hadoop 2.X MapReduce Architecture
Hadoop 2.X MapReduce Components
Anatomy of a MapReduce Program
MapReduce Paradigm
Physical Flow of a MapReduce Program
Physical Flow of a MapReduce Program (continued)
Life Cycle of a MapReduce Job
• Map function
• Reduce function
• Run this program as a MapReduce job
Input Splits
Relation between input splits and HDFS Blocks
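As a rough illustration of how splits relate to blocks, the sketch below applies the split-size rule used by Hadoop's FileInputFormat, max(minSize, min(maxSize, blockSize)), so that with default settings one split corresponds to one HDFS block. The helper names and the 128 MB block size (the Hadoop 2.x default) are assumptions made for the example, and the split count ignores the small slack Hadoop allows on the last split.

```python
MB = 1024 * 1024

def compute_split_size(block_size, min_size=1, max_size=2**63 - 1):
    """Split-size rule: max(min_size, min(max_size, block_size)).
    With the defaults, the split size equals the block size."""
    return max(min_size, min(max_size, block_size))

def count_splits(file_size, split_size):
    """Approximate number of splits via ceiling division.
    Simplification: ignores the slack allowed on the final split."""
    return -(-file_size // split_size)

block = 128 * MB  # assumed HDFS block size

# A 1 GB file with default settings: one split (and one map task) per block.
print(count_splits(1024 * MB, compute_split_size(block)))

# Raising the minimum split size to 256 MB makes each split span two blocks.
print(count_splits(1024 * MB, compute_split_size(block, min_size=256 * MB)))
```

A split is a logical range of the input, while a block is a physical unit of storage; changing the minimum or maximum split size changes how many map tasks run without changing how the file is stored.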
MapReduce Job Submission Flow
Overview of MapReduce
Combiners
Combiner
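A combiner pre-aggregates each map task's output locally, so fewer records have to travel to the reducers. The sketch below is a plain-Python illustration of that idea for word counts, not Hadoop API code; map_task, combine, and reduce_all are hypothetical names. It relies on summation being associative and commutative, which is what makes a combiner safe here.

```python
from collections import Counter

def map_task(lines):
    """Emit a (word, 1) pair for every word in this task's input split."""
    return [(word, 1) for line in lines for word in line.split()]

def combine(pairs):
    """Combiner: sum counts per word within ONE map task's output."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())

def reduce_all(pairs):
    """Reducer: sum counts per word across all map tasks."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

map_a = map_task(["a b a a"])  # output of the first map task
map_b = map_task(["b a b"])    # output of the second map task

without_combiner = map_a + map_b
with_combiner = combine(map_a) + combine(map_b)

print(len(without_combiner), len(with_combiner))  # records sent to the reducer
print(reduce_all(without_combiner) == reduce_all(with_combiner))
```

The final counts are identical either way; only the number of intermediate records changes. An operation such as a plain average would not be safe to use as a combiner without carrying the sum and the count separately.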
Partitioner - Redirecting output from Mapper
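A partitioner decides which reduce task receives each key emitted by the mappers. The sketch below follows the formula of Hadoop's default hash partitioner, (hash & Integer.MAX_VALUE) % numReduceTasks. The hash function is a stand-in: it reproduces Java's String.hashCode for ASCII text so the result is deterministic, whereas real Hadoop key types compute their own hash codes. Both function names are hypothetical.

```python
def java_string_hash(text):
    """Stand-in hash: Java-style String.hashCode kept as an unsigned
    32-bit value (matches Java for ASCII input)."""
    h = 0
    for ch in text:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h

def partition(key, num_reducers):
    """Clear the sign bit, then take the remainder to pick a reducer index."""
    return (java_string_hash(key) & 0x7FFFFFFF) % num_reducers

keys = ["a", "b", "c", "ab", "a"]
for key in keys:
    print(key, "-> reducer", partition(key, 3))
```

Because the partition depends only on the key, every occurrence of the same key lands on the same reducer, which is what allows a reducer to see all values for that key. A custom partitioner replaces this rule when the output needs to be routed differently, for example by a field inside the key.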
Revisit – De-Identification Architecture
Demo 1 – Word Count Program
Demo 2 – Word Size Word Count Program
Demo 3 – Weather Data Program
Demo 4 – Patent Data Program
Demo 5 – Max Temp Data Program
Demo 6 – Average Salary Program
Demo 7 – De-Identify Healthcare Program
Demo 8 – Music Track Program
Demo 9 – Call Center Data Analytics Program
MapReduce Job
• Introduction
• Job Submission
• Job Initialization
• Task Assignment
• Task Execution
• Progress and Status Updates
• Job Completion
Introduction
Job Submission
Job Initialization
Task Assignment
Task Execution
Progress Measure
Progress and Status Updates
Progress and Status Updates (continued)
Job Completion
MapReduce Data Flow
• Input Files
• Input Format
• Input Splits
• Record Reader
• Mapper
• Partition and Shuffle
• Sort
• Reduce
• Output Format
• Record Writer
• Output Files
MapReduce Data Flow Diagram
Input Files
Input Format
Input Splits
Record Reader
Mapper
Partition and Shuffle
Sort
Reduce
Output Format
Record Writer
Thank You !!!!!!