Programming For Big Data Darren Redmond Programming Languages
Programming Languages
• Python
• R
• Java
• C, C++
• Ruby
Why, Why
• History of Python
  • Guido van Rossum: was bored at Christmas, 1989
• Why Python
  • Easy to learn
  • Powerful data structures
  • Modular
  • Embedding
  • Map Reduce / Lambda / Yield
  • Interactive shell
  • http://www.python-course.eu/index.php
• The End Game
  • http://www.michael-noll.com/tutorials/writing-an-hadoop-mapreduce-program-in-python/
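To make the "Map Reduce / Lambda / Yield" bullet concrete, here is a minimal sketch of a word count written with a generator (`yield`), `map` with a `lambda`, and `reduce`. It is written for Python 3, where `reduce` lives in `functools`. The sample `lines` and the helper names `words` and `add_count` are hypothetical, chosen only for illustration; this is not the code from the tutorial linked above.

```python
from functools import reduce  # in Python 2, reduce is a built-in


def words(lines):
    """Generator: yield one lower-cased word at a time."""
    for line in lines:
        for word in line.split():
            yield word.lower()


def add_count(counts, pair):
    """Reduce step: fold one (word, 1) pair into the running dictionary."""
    word, one = pair
    counts[word] = counts.get(word, 0) + one
    return counts


# Hypothetical sample input, standing in for lines read from a file.
lines = ["big data", "Big Python"]

# Map step: turn every word into a (word, 1) pair using a lambda.
pairs = map(lambda word: (word, 1), words(lines))

# Reduce step: combine the pairs into per-word totals.
word_counts = reduce(add_count, pairs, {})

print(word_counts)
```

Nothing is computed until `reduce` starts pulling pairs, because both the generator and `map` are lazy in Python 3.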
Interactive Interpreter
• python
• >>> print "Hello World"
• easier?
  • "Hello World"
  • 12 / 7
  • 12.0 / 7
  • 3 + 2 * 4 # 11
  • _ # the most recent value
  • _ * 3 # 33
  • Ctrl-D
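The arithmetic from the interpreter session can also be run as a script. The sketch below assumes Python 3, where `12 / 7` is true division; the slides use Python 2, where `12 / 7` between two integers gives `1`. The variable names are illustrative, and `_` is replaced by an ordinary variable because `_` only holds the most recent value inside the interactive shell.

```python
# Integer (floor) division: what Python 2 returns for 12 / 7.
floor_quotient = 12 // 7      # 1

# Floating-point division: one float operand is enough.
true_quotient = 12.0 / 7      # about 1.714

# Multiplication binds tighter than addition.
precedence = 3 + 2 * 4        # 11

# In the shell, `_ * 3` typed right after `3 + 2 * 4` gives the same value.
tripled = precedence * 3      # 33

print(floor_quotient)
print(true_quotient)
print(precedence)
print(tripled)
```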
Execute Script
• There are multiple ways to execute a script; below are 4 ways for a script called script-name.py:
• From the command prompt, uncompiled
  • python script-name.py
• From the Python interpreter (for now, make sure to start python from the directory of the script)
  • import py_compile
  • py_compile.compile('script-name.py')
• Compile from the command line
  • python -m py_compile script-name.py
  • python -m compileall .
• .py and .pyc files are then available
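As a rough illustration of compiling from inside Python, the sketch below writes a throwaway script to a temporary directory and byte-compiles it with `py_compile.compile`. It assumes Python 3, where the `.pyc` file is placed in a `__pycache__` subdirectory rather than next to the source. The file name `script_name.py` and its contents are hypothetical.

```python
import os
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    # Hypothetical one-line script, created only for this demonstration.
    script_path = os.path.join(workdir, "script_name.py")
    with open(script_path, "w") as handle:
        handle.write("print('Hello World')\n")

    # doraise=True turns compile errors into exceptions instead of messages.
    compiled_path = py_compile.compile(script_path, doraise=True)

    # Record the result before the temporary directory is cleaned up.
    compiled_exists = os.path.exists(compiled_path)
    print(compiled_path)
```

The printed path shows where the interpreter stored the byte code for the script.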
Indentation
• Scope is expressed through indentation, not curly brackets
• Variables are created automatically on assignment and their type is inferred from the value
  • i = 42
  • i = i + 1 # 43
  • print id(i)
• Types
  • Numbers: integers, long integers, floating point numbers, complex
  • Strings: operations include concatenation (+) and slicing [2:4]
  • Input functions: input, raw_input
  • Casting to list etc.
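The points on variables and types can be sketched in a few lines. This assumes Python 3, which has a single `int` type (no separate long integers) and where `input()` returns a string, as `raw_input()` does in the Python 2 of the slides. All variable names and sample values are illustrative.

```python
# A variable is created by assignment; no declaration is needed.
i = 42
i = i + 1                    # i is rebound to a new integer object, 43
identity = id(i)             # id() returns an integer identifying that object

# Number types.
whole = 7                    # int
fraction = 7.5               # float
z = 3 + 4j                   # complex

# Strings: concatenation with + and slicing with [start:stop].
text = "Py" + "thon"         # "Python"
piece = text[2:4]            # characters at index 2 and 3: "th"

# Casting: a string to a list of characters, and a string to an int.
letters = list("abc")        # ['a', 'b', 'c']
number = int("42")           # typical for text that came from input()

print(i, piece, letters, number)
```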
Conditional
• if
• else
• In C, max = (a > b) ? a : b; is an abbreviation for the following C code
  • if (a > b)
  •     max = a;
  • else
  •     max = b;
• C programmers have to get used to a different notation in Python
  • max = a if (a > b) else b
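The two forms of the conditional can be compared side by side. This is a minimal sketch; the function names `larger` and `larger_verbose` are hypothetical, and the result is deliberately not called `max`, since that would hide Python's built-in `max` function.

```python
def larger(a, b):
    """Return the larger value using a conditional expression."""
    return a if a > b else b


def larger_verbose(a, b):
    """Same logic written as a full if/else statement."""
    if a > b:
        result = a
    else:
        result = b
    return result


print(larger(3, 7))
print(larger_verbose(7, 3))
```

The conditional expression reads in the order value, condition, alternative, whereas C's `?:` puts the condition first.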
Looping
• #!/usr/bin/env python
• n = 100
• sum = 0
• i = 1
• while i <= n:
•     sum = sum + i
•     i = i + 1
• print "Sum of 1 until %d: %d" % (n, sum)
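The same sum can be written with a `for` loop over `range`, and checked against the closed form n(n + 1)/2. This sketch assumes Python 3, so `print` is a function. The accumulator is named `total` rather than `sum` so that the built-in `sum` stays usable for the cross-check.

```python
n = 100

# for loop: range(1, n + 1) yields 1, 2, ..., n.
total = 0
for i in range(1, n + 1):
    total += i

# Two independent cross-checks of the loop result.
closed_form = n * (n + 1) // 2
builtin_total = sum(range(1, n + 1))

print("Sum of 1 until %d: %d" % (n, total))
```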
Bibliography
• Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython
  • Wes McKinney, O'Reilly, 2012
• Programming Python, 4th Edition: Powerful Object-Oriented Programming
  • Mark Lutz, O'Reilly, 2010
• Agile Data Science: Building Data Analytics Applications with Hadoop
  • Russell Jurney, O'Reilly, 2013
• Functional Python Programming
  • Steven Lott, Packt Publishing, 2015
Practice, Practice
• From Lecture 1 you should be able to write a Python script file that does calculations and prints the results to the screen
• Write a program to print 'Hello World' to the screen
• Write a program to sum the first 100 numbers
• Write a program to multiply the first 10 numbers
Summary
• Programming Languages for Big Data
• Why Python
• Hello World
• Executing a Script
• Indentation
• Conditional Programming
• Looping
• Bibliography
• Practice