CSCI 6900/4900 Special Topics in Computer Science: Automata and Formal Grammars for Bioinformatics
- Slides: 16
CSCI 6900/4900 Special Topics in Computer Science: Automata and Formal Grammars for Bioinformatics
Bioinformatics problems
• sequence comparison
• pattern/structure search
• pattern/structure recognition
• relationship of sequences
Algorithm design
• optimal algorithms
• heuristic algorithms
• parallel algorithms
Probabilistic models
• stochastic finite state automata (HMMs)
• stochastic regular grammars
• stochastic context-free grammars
• more complex grammar models
Probabilistic modeling and algorithms
M: a model of a family of sequences (e.g. RNA) that captures certain properties Q_1, Q_2, ...
(1) Each sequence x possesses a property Q_k(x) with probability P_k(x).
(2) For each given sequence x, the probabilities form a distribution over the properties, i.e., ∑_k P_k(x) = 1.
(3) The most likely property Q*(x) is the one with the highest probability, i.e., Q*(x) = arg max_k { P_k(x) }.
(4) Algorithms are designed to find the most likely property for given sequences.
But how?
[Figure: D (sample, training data) feeds a modeling mechanism that produces M, which assigns the probabilities.]
Computational linguistic systems can describe desired properties of biological sequences.
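Points (2) and (3) can be illustrated with a minimal Python sketch. The property names and raw scores below are made up for illustration; in the course they would come from a trained model M (for example an HMM or an SCFG), which this sketch does not implement.

```python
from typing import Dict, Tuple


def property_distribution(scores: Dict[str, float]) -> Dict[str, float]:
    """Normalize non-negative scores so that sum_k P_k(x) = 1."""
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("scores must have a positive sum")
    return {name: score / total for name, score in scores.items()}


def most_likely_property(probs: Dict[str, float]) -> Tuple[str, float]:
    """Return Q*(x) = arg max_k P_k(x) together with its probability."""
    best = max(probs, key=probs.get)
    return best, probs[best]


# Hypothetical unnormalized scores for one sequence x (assumed values).
scores = {"Q1": 2.0, "Q2": 6.0, "Q3": 2.0}
probs = property_distribution(scores)
best, p_best = most_likely_property(probs)
print(best, round(p_best, 3))
```

The hard part, which the rest of the course addresses, is computing P_k(x) efficiently when the properties are alignments or structures and there are exponentially many of them.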
Outline for the course
• Part 0: molecular biology basics and review of probability theory
• Part 1: pairwise alignment, HMMs, profile-HMMs, gene finding, and multiple alignment (chapters 1-6)
  Potential research projects: efficient HMM algorithms, gene finding
• Part 2: RNA stem-loops, SCFG, secondary structure prediction, structural homology search (chapters 9-10)
  Potential research projects: efficient SCFG algorithms, pseudoknot prediction, protein secondary structure prediction
• Part 3: phylogeny reconstruction, probabilistic approaches (chapters 7-8)
  Potential research projects: grammar modeling of evolution
The ways this course is to be conducted
• To learn new concepts and techniques: lectures (by the instructor and students)
• To apply learned knowledge to research: research discussions (led by students and the instructor)
• To demonstrate learning effectiveness: presentations of research results (by students)
The central dogma of molecular biology
Building blocks of DNA
Nucleotides
• Purines: Adenine, Guanine
• Pyrimidines: Cytosine, Thymine
Double helix of DNA
DNA replication
Genetic code
Mutations
(1) synonymous
(2) missense
(3) nonsense
(4) frame-shift
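The four mutation types can be told apart by comparing the translated proteins. The following Python sketch is one possible way to do so, not a method given on the slides. It assumes both inputs are coding DNA sequences read from the first base, uses the standard genetic code, and the function names are hypothetical. In-frame insertions or deletions fall outside the four listed types and are reported separately.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate complete codons from position 0; trailing bases are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def classify_mutation(ref: str, alt: str) -> str:
    """Classify alt relative to ref as one of the mutation types on the slide."""
    diff = len(alt) - len(ref)
    if diff % 3 != 0:
        return "frame-shift"          # indel that shifts the reading frame
    if diff != 0:
        return "in-frame indel"       # not one of the four types listed
    ref_protein, alt_protein = translate(ref), translate(alt)
    if ref_protein == alt_protein:
        return "synonymous"           # same amino acids despite a DNA change
    ref_stop, alt_stop = ref_protein.find("*"), alt_protein.find("*")
    if alt_stop != -1 and (ref_stop == -1 or alt_stop < ref_stop):
        return "nonsense"             # a premature stop codon appears
    return "missense"                 # an amino acid is replaced


ref = "ATGGAAGGTTAA"                  # Met-Glu-Gly-stop (made-up example)
for alt in ("ATGGAGGGTTAA", "ATGGATGGTTAA", "ATGTAAGGTTAA", "ATGGAGGTTAA"):
    print(alt, classify_mutation(ref, alt))
```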
RNA synthesis
RNA synthesis (cont'd)
RNA can fold onto itself
Protein synthesis
Biological information flow
Genome: AGACGCTGGTATCGCAT TAACGGGTTACTC GGATATTACCTTACTAT AGGGCGCTATCGCGCGT TAATCTGGTATC
[Figure labels: introns, exons; gene sequence; regulatory DNA sequence; protein-DNA interactions; gene expression; protein abundance; gene regulation; protein sequence; protein structure; sequence family; structure family; protein-protein interactions; protein function; cellular role]
What bioinformatics is NOT:
• Not just using a computer to speed up biology
• Not just applying computer algorithms to biology
• Not just the accountant of genomic data
What bioinformatics is, then:
• The creative use of computers to define and solve central biological puzzles
• The computer becomes a hypothesis machine, making predictions to be tested at the bench.