USAGE OF BLAST AND IMPORTANCE OF BIOINFORMATICS

i. How to use BLAST
ii. Bioinformatics career discussion
Carmelina Charalambous

OVERVIEW
• What is Bioinformatics
• Introduction to BLAST
• Live BLAST demo
• BLAST and Bioinformatics real-life application
• Bioinformatics as a career choice
• Personal academic and career pathway

WHAT IS BIOINFORMATICS?
• An interdisciplinary field that combines computer science, biology, information engineering, mathematics and statistics
• It develops methods and software tools for understanding complex biological data and large datasets

Fields in Bioinformatics:
• 1. Sequence analysis (DNA sequence analysis, sequence assembly, genome annotation, computational evolutionary biology, comparative genomics)
• 2. Gene and protein expression
• 3. Structural bioinformatics
• 4. Network and systems biology
• 5. Software and tools development
• 6. Developing databases
• 8. Drug design and development
• 9. Analysis of cellular organization
• 10. Clinical bioinformatics
• 11. Pharmacogenomics

BLAST (Basic Local Alignment Search Tool)
• Allows rapid comparison of a query sequence against a database of sequences
• The BLAST algorithm is fast, accurate, and web-accessible

BLASTn
• Nucleotide BLAST: compares one or more nucleotide query sequences to a nucleotide database
Application: determining the evolutionary relationships among different organisms

BLASTp
• Protein BLAST: compares one or more protein query sequences to a protein database
Application: identifying a novel protein

tBLASTn
• Protein sequence searched against translated nucleotide sequences
Application: used for expressed sequence tags (ESTs)

BLASTx
• Translated nucleotide sequence searched against protein sequences
Application: the first analysis performed with a newly determined nucleotide sequence
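The choice between the four programs above comes down to the type of the query and the type of the database. A minimal sketch of that lookup follows; the function name `choose_blast_program` and the string labels `"nucleotide"` / `"protein"` are illustrative assumptions, not part of BLAST itself.

```python
# Illustrative lookup: (query type, database type) -> BLAST program.
# Only the four programs covered on the slide are listed here.
BLAST_PROGRAMS = {
    ("nucleotide", "nucleotide"): "blastn",   # nucleotide query vs nucleotide database
    ("protein", "protein"): "blastp",         # protein query vs protein database
    ("protein", "nucleotide"): "tblastn",     # protein query vs translated nucleotide database
    ("nucleotide", "protein"): "blastx",      # translated nucleotide query vs protein database
}


def choose_blast_program(query_type: str, database_type: str) -> str:
    """Return the BLAST program matching the query and database types."""
    key = (query_type.lower(), database_type.lower())
    if key not in BLAST_PROGRAMS:
        raise ValueError(
            f"Unsupported combination: query={query_type!r}, database={database_type!r}"
        )
    return BLAST_PROGRAMS[key]


if __name__ == "__main__":
    # A newly determined nucleotide sequence searched against proteins.
    print(choose_blast_program("nucleotide", "protein"))
```

For example, a newly sequenced piece of DNA compared against known proteins maps to BLASTx, while a protein searched against an EST collection maps to tBLASTn.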

FOUR COMPONENTS TO A BLAST SEARCH
(1) Choose the sequence (query)
(2) Select the BLAST program
(3) Choose the database to search
(4) Choose optional parameters
Then click “BLAST”
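The same four components also appear when BLAST is run from the command line rather than the web page. The sketch below only assembles a command for the NCBI BLAST+ tools, assuming they are installed; the file names, the `nt` database choice, and the default E-value threshold are placeholders chosen for illustration.

```python
import shlex

SUPPORTED_PROGRAMS = {"blastn", "blastp", "blastx", "tblastn"}


def build_blast_command(
    query_file: str,                 # (1) the query sequence, in a FASTA file
    program: str = "blastn",         # (2) the BLAST program
    database: str = "nt",            # (3) the database to search (placeholder)
    evalue: float = 1e-5,            # (4) an optional parameter: E-value cutoff
    out_file: str = "results.tsv",   # hypothetical output file name
    remote: bool = False,            # search on NCBI servers instead of locally
) -> list[str]:
    """Assemble a BLAST+ command line as a list of arguments."""
    if program not in SUPPORTED_PROGRAMS:
        raise ValueError(f"Unsupported BLAST program: {program!r}")
    command = [
        program,
        "-query", query_file,
        "-db", database,
        "-evalue", str(evalue),
        "-outfmt", "6",              # tabular output
        "-out", out_file,
    ]
    if remote:
        command.append("-remote")
    return command


if __name__ == "__main__":
    # Print the command instead of running it, so nothing is sent anywhere.
    print(shlex.join(build_blast_command("my_sequence.fasta", remote=True)))
```

Printing the command rather than executing it keeps the sketch safe to try; passing the list to `subprocess.run` would be the equivalent of clicking “BLAST”.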

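BLAST Output
• E-value: the number of hits with an equal or better score that one can "expect" to see by chance when searching a database of a given size. The smaller the E-value, the less likely the match arose by chance
• Bit score: a normalized alignment score used in conjunction with the E-value. Unlike the E-value, it does not depend on query length or database size; the higher the bit score, the better the alignment
• Identity %: the percentage of aligned positions at which the query and target sequences are identical
• Gaps %: the percentage of alignment positions that are gaps, introduced to account for insertions and deletions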
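To make these output values concrete, the sketch below computes identity % and gaps % from a pair of already-aligned sequences, and an E-value from a bit score using the standard relationship E = m × n × 2^(−S), where S is the bit score and m and n are the query and database lengths. The function names and the toy sequences are illustrative; BLAST itself uses length-adjusted ("effective") values for m and n, which this sketch does not attempt to reproduce.

```python
def _check_aligned(query_aln: str, subject_aln: str) -> None:
    """Aligned sequences must be non-empty and of equal length ('-' marks a gap)."""
    if not query_aln or len(query_aln) != len(subject_aln):
        raise ValueError("Aligned sequences must be non-empty and of equal length")


def percent_identity(query_aln: str, subject_aln: str) -> float:
    """Percentage of alignment columns where both sequences carry the same residue."""
    _check_aligned(query_aln, subject_aln)
    matches = sum(q == s and q != "-" for q, s in zip(query_aln, subject_aln))
    return 100.0 * matches / len(query_aln)


def percent_gaps(query_aln: str, subject_aln: str) -> float:
    """Percentage of alignment columns where either sequence has a gap."""
    _check_aligned(query_aln, subject_aln)
    gaps = sum(q == "-" or s == "-" for q, s in zip(query_aln, subject_aln))
    return 100.0 * gaps / len(query_aln)


def expected_hits(bit_score: float, query_len: int, db_len: int) -> float:
    """E-value from a bit score: E = m * n * 2**(-S)."""
    return query_len * db_len * 2.0 ** (-bit_score)


if __name__ == "__main__":
    query = "ACGT-ACGT"    # toy alignment with one gap in the query
    subject = "ACGTTACGA"
    print(f"Identity: {percent_identity(query, subject):.1f}%")
    print(f"Gaps:     {percent_gaps(query, subject):.1f}%")
    # The same bit score is less significant in a larger database.
    print(expected_hits(40, query_len=100, db_len=1_000_000))
    print(expected_hits(40, query_len=100, db_len=100_000_000))
```

The last two lines illustrate why the E-value and bit score are read together: the bit score stays fixed for a given alignment, while the E-value grows with the size of the database being searched.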

LIVE BLAST DEMONSTRATION
A step-by-step guide document is available so that you can repeat this demo in your own time. We will start by clicking the link to the main National Center for Biotechnology Information (NCBI) website: https://www.ncbi.nlm.nih.gov

BLAST USAGE
1. Identifying species
2. Locating domains
3. Establishing phylogeny
4. DNA mapping
5. Comparing species
6. Drug discovery

IMPORTANCE OF BIOINFORMATICS AND BLAST APPLICATION I: DRUG DISCOVERY
https://www.researchgate.net/publication/2330725_What_is_bioinformatics_An_introduction_and_overview

IMPORTANCE OF BIOINFORMATICS APPLICATION II: PERSONALIZED MEDICINE IN CANCER
[Workflow diagram]
• Tumor and normal samples
• Lab prepares data
• HiSeq sequencer
• Post-sequencing quality control
• Alignment to reference genome
• Variant calling
• Variant filtering
• Variant annotation
• Bioinformagicians turn massive files into human-friendly results
• Clinicians turn findings into therapeutic decisions
• Results back to scientists
~100K variants → ~10 driver variants
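The variant filtering step in the workflow above can be pictured as a simple pass over the called variants. The sketch below is a toy illustration only: the `Variant` record, the quality and depth thresholds, and the rule of discarding tumor variants that also appear in the normal sample are all assumptions made for this example, not the pipeline used in the clinic, where dedicated tools handle this step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Toy variant record (hypothetical fields, not a real file format)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    quality: float  # caller-reported quality score
    depth: int      # number of reads covering the position

    @property
    def key(self) -> tuple[str, int, str, str]:
        return (self.chrom, self.pos, self.ref, self.alt)


def filter_variants(
    tumor: list[Variant],
    normal: list[Variant],
    min_quality: float = 30.0,  # hypothetical threshold
    min_depth: int = 10,        # hypothetical threshold
) -> list[Variant]:
    """Keep well-supported tumor variants that are absent from the normal sample."""
    normal_keys = {v.key for v in normal}
    return [
        v for v in tumor
        if v.quality >= min_quality
        and v.depth >= min_depth
        and v.key not in normal_keys
    ]


if __name__ == "__main__":
    tumor = [
        Variant("chr1", 100, "A", "T", quality=50.0, depth=40),
        Variant("chr1", 200, "G", "C", quality=12.0, depth=35),  # low quality
        Variant("chr2", 300, "C", "T", quality=60.0, depth=50),  # also in normal
    ]
    normal = [Variant("chr2", 300, "C", "T", quality=55.0, depth=45)]
    for variant in filter_variants(tumor, normal):
        print(variant)
```

Each filter removes a different kind of noise, which is how a list of roughly 100K called variants can shrink toward the handful that matter for a patient.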

My academic/career path
• 2015: Graduated from High School
• 2015-2018: BSc Biochemistry with Genetics, Lancaster University, UK
• 2018-2019: MPhil Genomic Medicine and Bioinformatics, Cambridge University, UK
• 2019: Bioinformatics Trainee Assistant at Cambridge University
• 2020-present: Bioinformatician at Addenbrooke's Hospital NHS foundation

A CAREER IN BIOINFORMATICS
Bioinformatics scientists tend to share distinct personality traits:
• Investigative
• Intellectual
• Curious
• Methodical
• Rational
• Analytical
• Logical
http://www.bioinformaticscareerguide.com/

QUESTIONS? Thank you for having me