Viking database Family tree of Gorm den Gamle

Viking database!
• Family tree of Gorm den Gamle, Harald Blåtand, Svend Tveskæg, ...
• Build new class for representing this tree...?

No!
tree.py
We already have a generic tree class: Phylogeny_node

general_tree.py
• Create copy with more general name
• Build Royal class as a subclass of this class:
  – Needs same attributes and methods, plus perhaps more

royal_vikings.py (part 1)
Overrides __str__ method of Node class
• A Royal viking in a family tree has a father/parent node, a name, and a string representing the reigning period (if the viking was queen/king)
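A minimal sketch of how such a subclass might look. The slide's actual listing is not reproduced here, so the `Node` constructor, the `sons` list, and the `reign` attribute name are assumptions made for illustration only.

```python
class Node:
    """Hypothetical stand-in for the generic node class in general_tree.py."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.sons = []
        if parent is not None:
            parent.sons.append(self)  # register with the father node

    def __str__(self):
        return self.name


class Royal(Node):
    """A royal viking: a Node plus an optional reign string (assumed attribute)."""

    def __init__(self, name, parent=None, reign=None):
        super().__init__(name, parent)
        self.reign = reign  # e.g. "958-987"; None if never queen/king

    def __str__(self):
        # Overrides Node.__str__: append the reign only when one was given.
        if self.reign is None:
            return self.name
        return f"{self.name} ({self.reign})"


gorm = Royal("Gorm den Gamle", reign="?-958")
harald = Royal("Harald Blåtand", gorm, "958-987")
estrid = Royal("Estrid")  # not queen/king: no reign given
print(harald)
print(estrid)
```

Because `Royal` only adds one attribute and overrides `__str__`, everything else (parent link, list of sons) is inherited unchanged from the generic class.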

royal_vikings.py (part 2)
Test program
Not queen/king: no reign given

royal_vikings.py (part 3)
Navigating the family tree, starting with Niels

    [...]
    Name: Svend Estridsen
    Parent: Estrid
    Siblings:
    Sons: Harald Hen, Knud den Hellige, Oluf Hunger, Erik Ejegod, Niels
    (f)ather, (s)on, si(b)ling, (p)rint, (q)uit? f

    Name: Estrid
    Parent: Svend Tveskæg
    Siblings: Harald 2., Knud den Store
    Sons: Svend Estridsen
    (f)ather, (s)on, si(b)ling, (p)rint, (q)uit? b
    Number of sibling (0-1)? 1

    Name: Knud den Store
    Parent: Svend Tveskæg
    Siblings: Harald 2., Estrid
    Sons: Knud 3. Hardeknud
    (f)ather, (s)on, si(b)ling, (p)rint, (q)uit? p
    Knud den Store (1014-1035) - Svend Tveskæg (987-1014) - Harald Blåtand (958-987) - Gorm den Gamle (?-958)
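The sibling list and the (p)rint command can be sketched with two small helpers. This is an illustration under assumptions, not the program from the slides: the `Royal` class below is a hypothetical minimal version, and `siblings` and `ancestor_line` are invented helper names.

```python
class Royal:
    """Hypothetical minimal royal node: name, parent, sons, optional reign."""

    def __init__(self, name, parent=None, reign=None):
        self.name = name
        self.parent = parent
        self.reign = reign
        self.sons = []
        if parent is not None:
            parent.sons.append(self)

    def __str__(self):
        return self.name if self.reign is None else f"{self.name} ({self.reign})"


def siblings(node):
    """All other sons of the node's parent (empty list for the root)."""
    if node.parent is None:
        return []
    return [s for s in node.parent.sons if s is not node]


def ancestor_line(node):
    """Walk the parent links up to the root and join the nodes with ' - '."""
    parts = []
    while node is not None:
        parts.append(str(node))
        node = node.parent
    return " - ".join(parts)


gorm = Royal("Gorm den Gamle", reign="?-958")
harald = Royal("Harald Blåtand", gorm, "958-987")
svend = Royal("Svend Tveskæg", harald, "987-1014")
harald2 = Royal("Harald 2.", svend)  # reign left out: not shown in the session
knud = Royal("Knud den Store", svend, "1014-1035")
estrid = Royal("Estrid", svend)

print([s.name for s in siblings(estrid)])
print(ancestor_line(knud))
```

Walking upward through `parent` needs no recursion, since each node has exactly one father.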

Another kind of tree: Newick trees
((monkey:100.85,cat:47.14):20.59);
[Figure: the corresponding tree, with branch lengths 20.59, 100.85 (monkey) and 47.14 (cat)]
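To make the format concrete, here is a small recursive-descent parser for strings like the one above. It is a sketch under assumptions: it handles only names, branch lengths and nested parentheses (no quoted labels or comments), and the `parse_newick` name and the dict layout are invented for illustration.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested dicts (name, length, sons)."""
    text = text.strip().rstrip(";")
    pos = 0

    def parse_node():
        nonlocal pos
        sons = []
        if pos < len(text) and text[pos] == "(":
            pos += 1  # skip "("
            sons.append(parse_node())
            while pos < len(text) and text[pos] == ",":
                pos += 1  # skip ","
                sons.append(parse_node())
            if pos >= len(text) or text[pos] != ")":
                raise ValueError(f"expected ')' at position {pos}")
            pos += 1  # skip ")"
        # Optional node name: everything up to the next delimiter.
        start = pos
        while pos < len(text) and text[pos] not in ":,()":
            pos += 1
        name = text[start:pos].strip()
        # Optional distance to the father, introduced by ":".
        length = None
        if pos < len(text) and text[pos] == ":":
            pos += 1
            start = pos
            while pos < len(text) and text[pos] not in ",()":
                pos += 1
            length = float(text[start:pos])
        return {"name": name, "length": length, "sons": sons}

    root = parse_node()
    if pos != len(text):
        raise ValueError(f"unexpected text at position {pos}")
    return root


tree = parse_newick("((monkey:100.85,cat:47.14):20.59);")
inner = tree["sons"][0]
print(inner["length"], [(s["name"], s["length"]) for s in inner["sons"]])
```

The outer pair of parentheses yields an unnamed root with a single unnamed son, which carries the 20.59 branch length and the two leaves.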

Project: Newick trees
• Load and parse newick tree file
  – Need newick class
    • Newick node has name, list of sons, distance to father, sequence
    • Inherit from general_tree's Node class!
  – Need parser
• Check that loaded tree corresponds to “current sequences”
  – Create (ID, sequence) dictionary from current seqs (efficient!)
  – After parsing tree file, traverse tree and look up sequence from each node ID, store in node
  – Give error message if ID not found
• Calculate “Average Hamming error”
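The dictionary lookup step could look roughly like this. It is a sketch, not the project solution: the `Node` base class is a hypothetical stand-in for the one in general_tree.py, `attach_sequences` is an invented helper name, and raising `KeyError` is just one possible way to give the error message.

```python
class Node:
    """Hypothetical stand-in for general_tree's Node class."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.sons = []
        if parent is not None:
            parent.sons.append(self)


class Newick_node(Node):
    """Adds distance to father and a sequence to the generic node."""

    def __init__(self, name, parent=None, distance=None):
        super().__init__(name, parent)
        self.distance = distance
        self.sequence = None  # filled in after parsing


def attach_sequences(node, seq_by_id):
    """Traverse the tree and store each node's sequence, looked up by node ID."""
    if node.name not in seq_by_id:
        raise KeyError(f"no current sequence for node ID {node.name!r}")
    node.sequence = seq_by_id[node.name]
    for son in node.sons:
        attach_sequences(son, seq_by_id)


# Made-up IDs and sequences, for illustration only.
current_seqs = [("n1", "CATAT"), ("n2", "CGTAT"), ("n3", "CGATAT")]
seq_by_id = dict(current_seqs)  # each lookup is a constant-time dict access

root = Newick_node("n1")
left = Newick_node("n2", root, 1.0)
right = Newick_node("n3", root, 2.0)
attach_sequences(root, seq_by_id)
print(left.sequence, right.sequence)
```

Building the dictionary once avoids scanning the whole list of current sequences for every node in the tree.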


Average Hamming Error in tree
[Figure: tree of sequences CATAT, CGATAT, CGTAT, GTAT, CGAGAT with edge labels 1/5, 2/5, 1/4, 1/6]
• Average number of mismatches per alignment position over all alignments in tree
• (2+1+1+1)/(5+6+5+4) = 5/20 = 0.25 errors per alignment position
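One way to organise the calculation is a recursive method that returns the running totals (mismatches, alignment length) for a subtree, so that a leaf contributes 0/0. The sketch below rests on assumptions: the slides do not say how two sequences of different length are aligned, so this version simply compares position by position over the shorter one, and it uses made-up sequences rather than the ones in the figure. The class layout and method names are hypothetical.

```python
class Newick_node:
    """Hypothetical minimal node: a sequence plus a list of sons."""

    def __init__(self, sequence, sons=()):
        self.sequence = sequence
        self.sons = list(sons)

    def hamming_totals(self):
        """Return (mismatches, alignment_length) summed over all edges below."""
        mismatches = 0
        length = 0
        for son in self.sons:
            # Assumed convention: zip stops at the shorter sequence.
            pairs = list(zip(self.sequence, son.sequence))
            mismatches += sum(a != b for a, b in pairs)
            length += len(pairs)
            # Add the totals from the son's own subtree (0/0 for a leaf).
            sub_mismatches, sub_length = son.hamming_totals()
            mismatches += sub_mismatches
            length += sub_length
        return mismatches, length


def average_hamming_error(root):
    """Mismatches per alignment position over all father/son pairs."""
    mismatches, length = root.hamming_totals()
    return mismatches / length if length else 0.0


# Edges: ACGT/ACGA = 1/4, ACGA/ACGA = 0/4, ACGT/TCGA = 2/4, total 3/12.
root = Newick_node("ACGT", [
    Newick_node("ACGA", [Newick_node("ACGA")]),
    Newick_node("TCGA"),
])
print(f"Average Hamming error: {average_hamming_error(root):.3f}")
```

Summing numerators and denominators separately, and dividing only once at the root, gives the average per alignment position rather than an average of per-edge ratios.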

Newick_node derives from Node
hamming.py (part 1)
Exercise: Newick_node method
[Figure: tree of sequences CATAT, CGATAT, CGTAT, CGAGAT]

hamming.py (part 1)
[Figure sequence: the same tree (CATAT, CGATAT, CGTAT, GTAT, CGAGAT), annotated step by step as the recursive calculation proceeds]
• Start: mismatches = 0, alignmentlength = 0; edge labels shown: 0/0, 2/5
• Next: mismatches = 2, alignmentlength = 5; edge labels shown: 0/0, 2/5, 1/6, 0/0
• Next: mismatches = 3, alignmentlength = 11; edge labels shown: 1/6, 0/0
• First subtree done: 3/11
• Second subtree: 3/11, 1/5, 1/4, 0/0
• Total at the root: 5/20 (from 3/11, 1/5, 1/4, 0/0)

Average Hamming Error
hamming.py (part 2)
[Figure: tree of sequences CATAT, CGATAT, CGTAT, CGAGAT]
Average Hamming error: 0.250

...on to the exercises