Programming For Big Data
Darren Redmond
Programming Languages

Programming Languages
• Python
• R
• Java
• C, C++
• Ruby

Why, Why
• History of Python
  • Guido van Rossum, bored over Christmas 1989, started writing Python
• Why Python
  • Easy to learn
  • Powerful data structures
  • Modular
  • Embedding
  • Map Reduce / Lambda / Yield
  • Interactive Shell
  • http://www.python-course.eu/index.php
• The End Game
  • http://www.michael-noll.com/tutorials/writing-an-hadoop-mapreduce-program-in-python/
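To make the "Map Reduce / Lambda / Yield" bullet concrete, here is a minimal sketch of the four features working together. The function name `squares` and the sample numbers are illustrative assumptions, not taken from the slides; `reduce` is imported from `functools`, which is where Python 3 keeps it.

```python
from functools import reduce


def squares(limit):
    """Generator: yields 1*1, 2*2, ... up to limit*limit, one value at a time."""
    for n in range(1, limit + 1):
        yield n * n


# map applies a function to every element; lambda defines that function inline.
doubled = list(map(lambda x: x * 2, [1, 2, 3]))

# reduce folds a sequence into a single value, here by adding as it goes.
total = reduce(lambda acc, x: acc + x, squares(4))  # 1 + 4 + 9 + 16

print(doubled, total)
```

The same map-then-reduce shape is what a Hadoop streaming job expresses at a larger scale, which is where the "End Game" link is heading.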

Interactive Interpreter
• python
• >>> print("Hello World")
• easier?
  • "Hello World"
  • 12 / 7
  • 12.0 / 7
  • 3 + 2 * 4 # 11
  • _ # the most recent value
  • _ * 3 # 33
• Ctrl-D
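The interpreter session above can be approximated in a script, as sketched below. Two points are worth knowing: in Python 2, `12 / 7` on two integers gives the integer 1, while Python 3 gives a floating-point result, so the sketch uses `//` to get floor division on either version. The `_` shortcut only exists in the interactive shell, so the variable `last_value` is a hypothetical stand-in for it.

```python
# Script version of the interpreter session; values are stored in variables
# because `_` is only available in the interactive shell.
floor_quotient = 12 // 7    # floor division
float_quotient = 12.0 / 7   # floating-point division
last_value = 3 + 2 * 4      # multiplication binds tighter than addition
tripled = last_value * 3    # stands in for `_ * 3` in the shell

print(floor_quotient, float_quotient, last_value, tripled)
```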

Execute Script
• Multiple ways to execute a script; below are 4 ways for a script called script-name.py:
• From the command prompt, uncompiled
  • python script-name.py
• From the Python interpreter (for now, start python from the script's directory)
  • import py_compile
  • py_compile.compile('script-name.py')
• Compile from the command line
  • python -m py_compile script-name.py
  • python -m compileall .
• .py and .pyc files are then available
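The interpreter route can be tried end to end with the sketch below. It writes a throwaway script to a temporary directory and byte-compiles it with `py_compile.compile`. The helper name `compile_script` and the file name `script_name.py` are assumptions made for illustration; the explicit `cfile` argument is used only so the location of the .pyc file is predictable.

```python
import os
import py_compile
import tempfile


def compile_script(source_text):
    """Write source_text to a temporary .py file and byte-compile it."""
    workdir = tempfile.mkdtemp()
    script_path = os.path.join(workdir, "script_name.py")
    with open(script_path, "w") as handle:
        handle.write(source_text)
    compiled_path = os.path.join(workdir, "script_name.pyc")
    # doraise=True turns a compilation problem into an exception
    # instead of a message printed to stderr.
    py_compile.compile(script_path, cfile=compiled_path, doraise=True)
    return script_path, compiled_path


script_path, compiled_path = compile_script('print("Hello World")\n')
print(os.path.exists(script_path), os.path.exists(compiled_path))
```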

Indentation
• Scope is achieved through indentation, not brackets
• Automatic creation and interpretation of variables
  • i = 42
  • i = i + 1 # 43
  • print(id(i))
• Types
  • Numbers -> integers, long integers, floating point numbers, complex
  • Strings -> functions: concat (+), slicing [2:4]
  • Input functions: input, raw_input
  • Casting to list etc.
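A short sketch of the points on this slide: indentation defining blocks, variables created on assignment, string concatenation and slicing, and casting to a list. The function name `describe` and the sample strings are illustrative assumptions.

```python
def describe(value):
    # The indented lines form the function body; the deeper indentation
    # forms the bodies of the if and else branches. No brackets are needed.
    if isinstance(value, str):
        kind = "string"
    else:
        kind = "number"
    return kind


i = 42                               # the variable is created on assignment
i = i + 1                            # i now refers to 43
word = "Python"
greeting = "Hello" + " " + "World"   # concatenation with +
middle = word[2:4]                   # characters at index 2 and 3
letters = list(word)                 # casting a string to a list of characters

print(i, describe(i), describe(word), greeting, middle, letters)
```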

Conditional
• if
• else
• In C, max = (a > b) ? a : b; is an abbreviation for the following C code
  • if (a > b)
  •     max = a;
  • else
  •     max = b;
• C programmers have to get used to a different notation in Python
  • max = a if (a > b) else b
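The two forms can be compared side by side in Python, as in this sketch. The function names are illustrative assumptions, and the result is called `larger` rather than `max` so that Python's built-in `max` function is not shadowed.

```python
def larger_with_statement(a, b):
    """The long form: an if/else statement."""
    if a > b:
        larger = a
    else:
        larger = b
    return larger


def larger_with_expression(a, b):
    """The short form: a conditional expression on one line."""
    return a if a > b else b


print(larger_with_statement(3, 7), larger_with_expression(3, 7))
```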

Looping
• #!/usr/bin/env python
• n = 100
• sum = 0
• i = 1
• while i <= n:
•     sum = sum + i
•     i = i + 1
• print("Sum of 1 until %d: %d" % (n, sum))
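The same loop can be wrapped in a function so the result is easy to check, as sketched here. The name `sum_with_while` is an assumption, and the accumulator is called `total` to avoid shadowing the built-in `sum`. The closed form n(n+1)/2 gives an independent check on the answer.

```python
def sum_with_while(n):
    """Add the integers 1..n using a while loop."""
    total = 0
    i = 1
    while i <= n:
        total = total + i
        i = i + 1
    return total


n = 100
result = sum_with_while(n)
print("Sum of 1 until %d: %d" % (n, result))  # Sum of 1 until 100: 5050
```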

Bibliography
• Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. Wes McKinney, O'Reilly, 2012
• Programming Python, 4th Edition: Powerful Object-Oriented Programming. Mark Lutz, O'Reilly, 2010
• Agile Data Science: Building Data Analytics Applications with Hadoop. Russell Jurney, O'Reilly, 2013
• Functional Python Programming. Steven Lott, Packt Publishing, 2015

Practice, Practice
• From Lecture 1 you should be able to write a Python script file that does calculations and prints them to the screen
• Write a program to print 'Hello World' to the screen
• Write a program to sum the first 100 numbers
• Write a program to multiply the first 10 numbers

Summary
• Programming Languages for Big Data
• Why Python
• Hello World
• Executing a Script
• Indentation
• Conditional Programming
• Looping
• Bibliography
• Practice