Introduction to Python
BCHB 524, Lecture 5
BCHB 524 - Edwards

Outline

- Homework #2 Solutions
- Homework #1 Notes
- DNA as a string
    - Extracting codons in DNA
    - Counting in-frame codons in DNA
    - Reverse complement

Homework #1 Notes

- Python programs:
    - Upload .py files
    - Don't paste into comment box
    - Don't paste into your writeup
- Writeup:
    - Upload .txt files
    - Don't paste into comment box
    - Text document preferred

Homework #1 Notes

- Multiple submissions:
    - OK, but…
    - …I'll ignore all except the last one
    - Make each (re-)submission complete
- Grading:
    - Random grading order
    - Comments
    - Grading "curve"

Review

- Printing and execution
- Variables and basic data-types:
    - integers, floats, strings
    - Arithmetic with, conversion between
    - String characters and chunks, string methods
- Functions, using/calling and defining:
    - Use in any expression
    - Parameters as input, return for output
- Control Flow:
    - if statements: conditional execution
    - for statements: iterative execution

DNA as a string

    seq = "gcatgacgttattacgactctgtgtggcgtctgctgggg"
    seqlen = len(seq)

    # set i to 0, 3, 6, 9, ..., 36
    for i in range(0, seqlen, 3):
        # extract the codon as a string
        codon = seq[i:i+3]
        print(codon)

    print("Number of Met. amino-acids", seq.count("atg"))
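The example sequence has 39 letters, so it splits evenly into 13 codons. A small sketch of what the same slicing loop does when the length is not a multiple of 3 follows. The helper name `codons` and the short test sequence are illustrative assumptions, not part of the lecture, and the sketch collects the chunks in a list rather than printing them.

```python
def codons(seq):
    """Return the chunks of seq in reading frame 0, three letters at a time."""
    chunks = []
    for i in range(0, len(seq), 3):
        # Slicing past the end of a string is safe in Python:
        # the final chunk may simply have only 1 or 2 letters.
        chunks.append(seq[i:i+3])
    return chunks

print(codons("gcatgacg"))  # ['gca', 'tga', 'cg']
```

Whether a trailing partial codon should be kept or dropped depends on the analysis; the sketch keeps it so that nothing is silently lost.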

DNA as a string

- What about upper and lower case?
    - ATG vs atg?
- Differences between DNA and RNA sequence?
    - Substitute U for each T?
- How about ambiguous nucleotide symbols?
    - What should we do with ‘N’ and other ambiguity codes (R, Y, W, S, M, K, H, B, V, D)?
- Strings don’t know any biology!
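Because strings don't know any biology, each of these questions has to be answered in code. The sketch below shows one possible answer to each. The function names `normalize`, `to_rna`, and `ambiguous_positions` are hypothetical, and the sketch only locates ambiguity codes; what to do with them is left open, as on the slide.

```python
def normalize(seq):
    """Upper-case the sequence so that 'atg' and 'ATG' compare equal."""
    return seq.upper()

def to_rna(seq):
    """Substitute U for each T (assumes seq is the coding strand)."""
    return normalize(seq).replace("T", "U")

def ambiguous_positions(seq):
    """Return the positions whose symbol is not one of A, C, G, T."""
    seq = normalize(seq)
    positions = []
    for i in range(len(seq)):
        if seq[i] not in "ACGT":
            positions.append(i)
    return positions

print(to_rna("atgacN"))               # AUGACN
print(ambiguous_positions("atgRcN"))  # [3, 5]
```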

DNA as a string

    seq = "gcatgacgttattacgactctgtgtggcgtctgctgggg"

    def inFrameMet(seq):
        seqlen = len(seq)
        count = 0
        for i in range(0, seqlen, 3):
            codon = seq[i:i+3]
            if codon.upper() == "ATG":
                count = count + 1
        return count

    print("Number of Met. amino-acids", inFrameMet(seq))
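For this sequence, `seq.count("atg")` and the in-frame count disagree, because the only `atg` starts at position 2. The sketch below generalizes the same loop to any codon and any of the three forward reading frames. The name `inFrameCount` and its `frame` parameter are assumptions added for illustration.

```python
def inFrameCount(seq, codon, frame=0):
    """Count occurrences of codon in the reading frame starting at offset frame (0, 1, or 2)."""
    target = codon.upper()
    count = 0
    for i in range(frame, len(seq), 3):
        if seq[i:i+3].upper() == target:
            count = count + 1
    return count

seq = "gcatgacgttattacgactctgtgtggcgtctgctgggg"
print(seq.count("atg"))             # 1: matches at any position
print(inFrameCount(seq, "atg", 0))  # 0: no ATG in frame 0
print(inFrameCount(seq, "atg", 2))  # 1: the ATG is in frame 2
```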

DNA as a string

    input_seq = "catgacgttattacgactctgtgtggcgtctgctgggg"

    def complement(nuc):
        nucleotides = 'ACGT'
        complements = 'TGCA'
        i = nucleotides.find(nuc)
        comp = complements[i]
        return comp

    def reverseComplement(seq):
        newseq = ""
        for nuc in seq:
            newseq = complement(nuc) + newseq
        return newseq

    print("Reverse complement: ", reverseComplement(input_seq))
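One thing to watch in the code above: `input_seq` is lower case, but `complement` looks each symbol up in the upper-case string `'ACGT'`. The short sketch below isolates what happens for a single lower-case letter. It uses only `str.find` and negative indexing, both standard Python behavior.

```python
nucleotides = 'ACGT'
complements = 'TGCA'

i = nucleotides.find('c')  # lower-case 'c' does not occur in 'ACGT'
print(i)                   # -1: find() signals "not found" with -1
print(complements[i])      # A: index -1 means the last character
```

So every symbol that is not found, including every lower-case letter, is silently "complemented" to A rather than raising an error.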

DNA as a string

    input_seq = "catgacgttattacgactctgtgtggcgtctgctgggg"

    def complement(nuc):
        nucleotides = 'ACGT'
        complements = 'TGCA'
        i = nucleotides.find(nuc)
        if i >= 0:
            comp = complements[i]
        else:
            comp = nuc
        return comp

    def reverseComplement(seq):
        seq = seq.upper()
        newseq = ""
        for nuc in seq:
            newseq = complement(nuc) + newseq
        return newseq

    print("Reverse complement: ", reverseComplement(input_seq))
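For comparison, here is a sketch of a different technique for the same task, using Python's built-in `str.maketrans` and `str.translate` instead of the `find`-based lookup. This is not the lecture's method, and the names `COMPLEMENT_TABLE` and `reverse_complement` are assumptions. Unlike the version above, it preserves the case of the input rather than upper-casing it. Like the version above, it leaves unrecognized symbols such as N unchanged.

```python
# Translation table: each nucleotide maps to its complement, in both cases.
COMPLEMENT_TABLE = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Complement each nucleotide, then reverse the result."""
    return seq.translate(COMPLEMENT_TABLE)[::-1]

print(reverse_complement("catgN"))  # Ncatg
```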