A Brief Introduction to Scientific Programming with Python
TCD, 26/08/2015
Karsten Hokamp, PhD
TCD Bioinformatics Support Team
Trinity College Dublin, The University of Dublin
Overview
• Programming
• First Python script/program
• Why Python?
• Bioinformatics examples
• Additional resources
• Outlook
What is programming and why bother?
§ Data processing
§ Automation
§ Combination of programs for analysis pipelines
§ More control and flexibility
§ Better understanding of how programs work
Programming Concepts
§ Turn into a very meticulous problem solver
§ Break problems into small details
§ Keep it variable
§ Give very precise instructions
Programming Concepts: "human" recipe
Programming Concepts: "computerised" recipe
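The recipe images from the slides are not reproduced in this text. As a rough illustration of the concepts listed above (small steps, variable quantities, precise instructions), a "computerised" recipe might be sketched in Python as follows. The function name, ingredients and amounts are invented for this example and are not taken from the slides.

```python
# Hypothetical example: ingredient names and amounts are made up for illustration.
def pancake_recipe(servings):
    """Return a list of precise, ordered instructions for the given number of servings."""
    # "Keep it variable": quantities scale with the number of servings.
    flour_g = 50 * servings
    milk_ml = 100 * servings
    eggs = max(1, servings // 2)

    # "Break problems into small details": one small step per instruction.
    steps = [
        f"Weigh {flour_g} g of flour into a bowl",
        f"Add {eggs} egg(s)",
        f"Pour in {milk_ml} ml of milk",
        "Whisk until smooth",
    ]
    # "Give very precise instructions": one explicit frying step per serving.
    for number in range(1, servings + 1):
        steps.append(f"Fry pancake {number} for 2 minutes on each side")
    return steps


for line in pancake_recipe(2):
    print(line)
```

Changing the argument, for example `pancake_recipe(4)`, rescales every quantity without rewriting the instructions, which is the point of keeping a program variable.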
Mac for Windows users
The main differences:
§ cmd instead of ctrl (e.g. cmd-C for copying)
§ right-click mouse: ctrl-click
§ # character: alt-3
§ switch between applications: cmd-tab
§ Spotlight (top right) for finding files/programs
§ Apple symbol (top left) for logging out
IDLE: Integrated DeveLopment Environment
§ open through Spotlight
§ alternatively: open through Finder
§ interactive Python console
§ simple Python statement
§ user input and output
§ try a few simple numeric operations
§ repeat/combine previous commands by clicking into them and hitting return (use left/right arrows and delete to edit them)
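The console screenshots are not available in this text, so the following is only a sketch of the kind of simple numeric operations one might try. The numbers are arbitrary, and the division result assumes Python 3 (in Python 2, `7 / 2` gives `3`).

```python
# Hypothetical console-style examples; the numbers are arbitrary.
total = 3 + 4          # addition
difference = 10 - 2.5  # subtraction with a decimal number
product = 6 * 7        # multiplication
quotient = 7 / 2       # true division (Python 3)
whole = 7 // 2         # floor division
remainder = 7 % 2      # remainder (modulo)
power = 2 ** 10        # exponentiation

print(total, difference, product)
print(quotient, whole, remainder, power)
```

In the interactive console the variable names and `print` calls are optional: typing `3 + 4` and pressing return shows the result directly.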
IDLE: Integrated DeveLopment Environment
Console vs Editor
Console:
§ interactive
§ great for trying out code
§ not suited for long scripts
§ no saving of code
Editor:
§ requires extra click for running
§ additional IDLE functionality
§ allows saving code
IDLE: Writing Python Scripts
§ open a new file
§ write some code
§ run your code (shortcut: F5)
§ save file first
§ specify a file name
§ write more code (IDLE provides help)
§ save and run: cmd-S then F5
§ make it personal
§ keep going
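The code shown on these slides is not preserved in the text. A first script along these lines might look like the sketch below; the function name `greeting`, the placeholder name and the messages are assumptions made for illustration, not the original slide code.

```python
# Hypothetical first script; the name and messages are placeholders.
def greeting(name):
    """Build a personalised greeting."""
    return "Hello, " + name + "!"


name = "Ada"  # "make it personal": put your own name here
print(greeting(name))

# "keep going": add further statements, e.g. report the length of the name.
print("Your name has", len(name), "letters.")
```

Replacing the assignment with `name = input("What is your name? ")` would ask for the name each time the script is run with F5.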
Python vs Perl: the equivalent in Perl
Python vs Perl
Python:
• fewer special characters
• indentation enforced
• more user-friendly functions
Why Python?
§ easy to learn: great for beginners
§ enforces clean coding: great for teachers
§ comes with IDE: avoids command-line usage
§ object-orientated: code reuse and recycling
§ very popular: many peers
§ BioPython: many bioinformatics modules
Simple Bioinformatics Example
§ built-in function 'len'
§ built-in function 'set'
§ built-in functions 'sorted' and 'set'
§ string method 'count'
§ string method 'upper'
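The example code itself was shown as screenshots and is missing here. The sketch below applies each of the named functions and methods to a short DNA string; the sequence is made up for illustration and is not the one used in the slides.

```python
# Hypothetical DNA sequence, chosen only for illustration.
dna = "atgcgtacgttagcatgc"

# Built-in function len: number of characters (bases) in the sequence.
length = len(dna)

# Built-in function set: the distinct characters that occur in the sequence.
alphabet = set(dna)

# sorted combined with set: the distinct characters in alphabetical order.
ordered_alphabet = sorted(set(dna))

# String method count: how often a (non-overlapping) substring occurs.
g_count = dna.count("g")

# String method upper: an upper-case copy; the original string is unchanged.
dna_upper = dna.upper()

print(length)
print(ordered_alphabet)
print(g_count)
print(dna_upper)
```

A set has no defined order, which is why `sorted` is wrapped around it when a reproducible listing of the bases is wanted.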
§ Basic sequence manipulation
§ Fetch records from databases
§ Multiple sequence alignment (Clustal, Muscle)
§ Sequence similarity search (Blast)
§ Working with motifs: MEME, Jaspar, Transfac
§ Phylogenetics
§ Clustering
§ Visualisation
§ Parsing GenBank records:
    >>> from Bio import SeqIO
    >>> record = SeqIO.read("AE014613.1.gb", "genbank")
    >>> record.description
    'Salmonella enterica subsp. enterica serovar Typhi Ty2, complete genome.'
    >>> len(record.features)
    9086
§ Parsing sequence records:
    from Bio import SeqIO
    for entry in SeqIO.parse("tlr4_protein.fa", "fasta"):
        print(entry.description)
        print(len(entry), 'bp')

    gi|765368240|gb|AJR32867.1| TLR4 [Gallus gallus]
    843 bp
    gi|111414439|gb|ABH09759.1| toll-like receptor 4 [Bos taurus]
    841 bp
    gi|6175873|gb|AAF05316.1|AF177765_1 toll-like receptor 4 [Homo sapiens]
    839 bp
    …
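To give an idea of what parsing a FASTA file involves, here is a minimal pure-Python sketch that needs no extra modules. It is not how BioPython implements `SeqIO.parse`; the function name `read_fasta` and the two example records are hypothetical and stand in for a real file.

```python
import io


def read_fasta(handle):
    """Yield (description, sequence) pairs from an open FASTA file handle."""
    description = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header line ends the previous record, if there was one.
            if description is not None:
                yield description, "".join(chunks)
            description = line[1:]
            chunks = []
        elif description is not None:
            # Sequence lines of one record are collected and joined later.
            chunks.append(line)
    # Emit the final record once the input is exhausted.
    if description is not None:
        yield description, "".join(chunks)


# Made-up records standing in for a real FASTA file.
example = io.StringIO(
    ">seq1 first example protein\nMKT\nLLV\n>seq2 second example protein\nMAGW\n"
)
records = list(read_fasta(example))
for description, sequence in records:
    print(description)
    print(len(sequence), "residues")
```

For real work a tested parser such as `SeqIO.parse` is the better choice, since it also handles other formats and returns record objects rather than plain tuples.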
§ Graphics: Chromosomes colour-coded by GC content (Bioinformatics with Python Cookbook)
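The figure is not reproduced here. The quantity behind it, GC content, is simple to compute; the sketch below is an illustration with hypothetical function names and a made-up sequence, not the Cookbook's code, and it leaves out the plotting step.

```python
def gc_content(sequence):
    """Return the fraction of G and C bases in a DNA sequence (0.0 if empty)."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    gc = sequence.count("G") + sequence.count("C")
    return gc / len(sequence)


def windowed_gc(sequence, window):
    """Return the GC fraction of each consecutive, non-overlapping window."""
    return [
        gc_content(sequence[start:start + window])
        for start in range(0, len(sequence), window)
    ]


# Made-up sequence for illustration only.
dna = "ATGCGCGCATATATTAGGCC"
print(gc_content(dna))
print(windowed_gc(dna, 5))
```

Per-window values like these are what a plotting library could then map onto a colour scale along each chromosome.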
§ Graphics: Coloured phylogenetic tree from Ebola sequences (Bioinformatics with Python Cookbook)
Additional Resources
https://store.continuum.io/cshop/anaconda/
Visualisations with Matplotlib
http://matplotlib.org/gallery.html
Examples
http://scikit-learn.org
Scikit-learn – Machine Learning in Python
• Machine Learning: PCA of Iris data set
http://scikit-learn.org/stable/auto_examples/decomposition/plot_pca_iris.html
Python Help
Online courses
§ http://biopython.org/DIST/docs/tutorial/Tutorial.html
§ http://dowell.colorado.edu/education-python.html
§ http://www.pasteur.fr/formation/infobio/python
§ https://www.codecademy.com/tracks/python
§ http://anh.cs.luc.edu/python/hands-on/
§ https://www.coursera.org
Books
Conclusions
• You have been briefly introduced to Python and IDLE.
• You have learnt about programming concepts.
• You have seen examples of what can be accomplished through Python.
• Topics of an extensive Python course:
  • Coding in Python – variables, scope, functions…
  • Bioinformatics with BioPython
  • Automated biological data analysis – your interests!
Thank You!
http://bioinf.gen.tcd.ie/workshops/python
Don't forget to log out!