Ideas for training – Spark and/or Hadoop 2017
Hadoop User Forum, December 2016
Luca Canali, IT-DB

Training Offer for Spark/Hadoop
• Opportunity for change/new ideas:
  • Contacted by the IT Training Officer (currently Michal Kwiatek) and technical training (Maria Fiascaris) to propose ideas and content for courses in the area of Spark/Hadoop
• One thing that has gone wrong recently:
  • The old Hadoop for developers course is still in the catalog
  • Program unchanged for 3-4 years

Needs
• Projects started/starting on Hadoop/Spark: need/demand for training
  • Mostly training for developers
• What level?
  • Many are just starting: beginner level?
  • Also experienced level

Challenges
• Define the program of the course/courses
  • What topics does the community need/want training on?
  • General course on the Hadoop ecosystem?
  • Focused on one component, for example Spark?
• Formats
  • Tutorials, MOOC, self-training, classroom training
• How to find good teachers?
  • Trusted companies or freelancers?

Overview of Available Components (Dec 2016)
[Diagram of available components; the only label recovered from the slide is Kafka (streaming/ingestion)]

Apache Spark
• Spark evolved from MapReduce ideas
• Powerful engine, in particular for data science and streaming
  • Aims to be a “unified engine for big data processing”

Ideas for Beginners’ Training
• Deliver training for beginners as a course
  • Use IT engineers to deliver a basic course
  • ½ day course? Based on the topics and presentations of the 2016 tutorials, but without the hands-on
    • See also: https://indico.cern.ch/event/546000/
• Benefit
  • For people interested in getting a general overview
  • Optimize classroom training by skipping the basic introduction

Ideas for Classroom Training
• This is expensive, so we want to get it right
• Program for an overview of the Hadoop ecosystem?
  • Example: from the Cloudera course catalog, “developer training for Spark and Hadoop”
  • http://www.cloudera.com/content/dam/www/static/documents/datasheets/developer-training-for-spark-and-hadoop.pdf
• Program targeted at Spark development?
  • Which part(s)? Core, SQL, streaming, machine learning
  • Language? Python/Scala

Discussion
Notes added from discussion:
• A course on Spark, a general overview for developers similar to the course in Fall 2015, is of interest
  • In particular, the material from Databricks was found to be of good quality
  • It is important that the material is up-to-date, as the technology evolves quickly
  • Important to find a good teacher
• A course that covers end-to-end workflows is also of interest, for example for the experiments
  • Python is popular with the experiments