Public data and tool repositories
  Section 2: Genome Browsers
Problems from last section
  1. Query Entrez Gene with the following two queries separately, then explain the differences between the two results using a logical NOT operation:
     a) tyrosine kinase[Gene Ontology] AND human[Organism]
     b) cd00192[Domain Name] AND human[Organism]
  2. Retrieve the APP gene record from NCBI and use the Display dropdown menu to display Conserved Domain Links. Use the ids of the listed domains to query Entrez Gene for records with the same domains.
  3. Use the SNP GeneView link at NCBI to identify coding SNPs in the APP gene. Which SNP that was present in the Ensembl APP protein record is missing from this display?
  4. Use the HomoloGene link at NCBI to identify possible functional orthologs for human APP. How does this list compare to the Ensembl list of orthologs that we reviewed previously?
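For problem 1, the difference between the two result sets can be requested directly by combining the queries with NOT. The following is a minimal sketch that only builds the query strings and does not contact NCBI; the helper name `entrez_not` is hypothetical, not part of any NCBI tool.

```python
def entrez_not(include: str, exclude: str) -> str:
    """Build a query matching records found by `include` but not by `exclude`."""
    # Parentheses keep each sub-query grouped before NOT is applied.
    return f"({include}) NOT ({exclude})"


query_a = "tyrosine kinase[Gene Ontology] AND human[Organism]"
query_b = "cd00192[Domain Name] AND human[Organism]"

# Records annotated as tyrosine kinases that lack the cd00192 domain.
a_not_b = entrez_not(query_a, query_b)
# Records with the cd00192 domain that lack the Gene Ontology annotation.
b_not_a = entrez_not(query_b, query_a)

print(a_not_b)
print(b_not_a)
```

Pasting either printed string into the Entrez Gene search box should list the records unique to one query, which is what the explanation in problem 1 asks for.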
Review of last section (example: human APP gene)
  1. NCBI Entrez databases
     a) Constructing queries
     b) Gene, Nucleotide and Protein
     c) RefSeq
  2. EBI/Ensembl
     a) Finding genes
     b) Viewing Genes, Transcripts, Exons, Proteins and SNPs
  3. Common id and data formats
This section
  1. Genome assembly and genome browsers
  2. Promoter/enhancer analysis example
  3. More information
Genome Build Process
  1. Organism sequence data is assembled into contiguous pieces (contigs)
  2. Contigs are mapped to genomic features and the coordinate system is assigned
  3. Unmapped sequence data may be assigned to artificial chromosomes
  4. Assembly is improved as more sequence data becomes available
  See: Entrez Genome Project
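Step 1 of the build process can be illustrated with a toy example. The sketch below merges reads greedily by their longest suffix/prefix overlap; it is a deliberately simplified illustration, not the method used by any real assembler, and the reads and function names are made up for the example.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(reads)
    while True:
        best = (0, -1, -1)  # (overlap length, index of left, index of right)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        if k == 0:
            # No remaining pair overlaps by at least min_len bases.
            return contigs
        merged = contigs[i] + contigs[j][k:]
        contigs = [c for n, c in enumerate(contigs) if n not in (i, j)] + [merged]


# Toy reads: three overlap into one contig, one cannot be placed.
reads = ["ATGCGT", "CGTTAC", "TACGGA", "GGGGGG"]
contigs = greedy_assemble(reads)
print(contigs)
```

The read that overlaps nothing is returned on its own, loosely analogous to the unmapped sequence mentioned in step 3.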
Genome Browsers
  1. Make millions of sequences available through easily accessible, user-friendly interfaces
  2. Provide genomic sequence, exon structure, mRNA sequence, EST and SNP data via web-based text search interfaces
  3. Options available for local installs
Commonly Used Browsers
  1. The Entrez Map Viewer
  2. The EBI/Ensembl browser
  3. The UCSC genome browser
NCBI Map Viewer
  1. Integrates feature identity information with a whole-genome view
  2. Allows one to view and search an organism's complete genome
  3. Displays chromosome maps
  4. User can zoom into progressively greater levels of detail, down to the sequence data for a region of interest
  5. Focuses more on individual sequences
  Ex: Looking at the APP gene in the NCBI Map Viewer
EBI/Ensembl Browser
  1. Provides access to sequence data from ~40 organisms
  2. Includes the human genome sequence and data from all the commonly used experimental organisms
  3. Displays the location of genes, variations and other sequence features within genomes
  4. Greatest strengths:
     a) Browsing of large genomic contigs
     b) Comparative genomic features
  Ex: Looking at the APP gene in the EBI/Ensembl Browser
UCSC Genome Browser
  Strength is genome position-based data aggregation:
  1. Data positioned on “best” genome build and organised into “tracks”
  2. Outside data tracks
     1. Genome builds
     2. Genes, known and predicted
     3. mRNA
     4. Expression and regulation
     5. Variations and repeats
  3. Inside data tracks
     1. Known Genes
     2. Comparative genomics
  4. Custom tracks
  Ex: Looking at the APP gene in the UCSC Genome Browser
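Custom tracks are commonly uploaded as plain text in BED format: a `track` header line followed by one feature per line (chromosome, 0-based start, exclusive end, name). The sketch below writes such text; the function name, track name and coordinates are placeholders chosen for illustration and are not real APP annotations.

```python
def bed_custom_track(name, description, features):
    """Return custom-track text: a track header plus 4-column BED lines.

    `features` is an iterable of (chrom, start, end, label) tuples, with
    0-based start and exclusive end, as BED expects.
    """
    lines = [f'track name="{name}" description="{description}"']
    for chrom, start, end, label in features:
        if not 0 <= start < end:
            raise ValueError(f"invalid interval for {label}: {start}-{end}")
        lines.append(f"{chrom}\t{start}\t{end}\t{label}")
    return "\n".join(lines) + "\n"


# Placeholder coordinates only; substitute positions from the genome build in use.
track_text = bed_custom_track(
    "demo_regions",
    "Regions of interest (illustrative)",
    [("chr21", 1000, 1500, "region_1"), ("chr21", 3000, 3200, "region_2")],
)
print(track_text, end="")
```

Because a browser places every track by position on one genome build, the coordinates in a custom track must come from the same build that is being viewed.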
APP Upstream Region (15 kb)
  Ex: Extracting and aligning human and mouse APP upstream regions
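When extracting an upstream region, which side of the gene counts as "upstream" depends on the strand. The sketch below computes the coordinates of a 15 kb upstream window, assuming 1-based inclusive coordinates with start <= end on both strands; the function name and the gene coordinates in the usage lines are hypothetical.

```python
def upstream_region(gene_start: int, gene_end: int, strand: str,
                    length: int = 15_000) -> tuple[int, int]:
    """Return (start, end) of the `length` bp immediately upstream of a gene.

    Coordinates are 1-based and inclusive, with gene_start <= gene_end
    regardless of strand.
    """
    if strand == "+":
        # Transcription runs left to right, so upstream lies before gene_start.
        end = gene_start - 1
        if end < 1:
            raise ValueError("gene starts at the chromosome edge; no upstream sequence")
        return max(1, end - length + 1), end
    if strand == "-":
        # Transcription runs right to left, so upstream lies after gene_end.
        # The caller should clip the end to the chromosome length if needed.
        return gene_end + 1, gene_end + length
    raise ValueError("strand must be '+' or '-'")


# Hypothetical gene at 100000-200000, evaluated on each strand.
print(upstream_region(100_000, 200_000, "+"))
print(upstream_region(100_000, 200_000, "-"))
```

For a minus-strand gene, the extracted sequence also needs to be reverse-complemented before alignment so that both species are compared in the direction of transcription.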
Promoter/enhancer analysis approaches
  1. Same gene, multiple species
     a) Assumed evolutionary conservation of non-coding regions
     b) Can use pairwise or multiple alignment method
     c) Examples:
        i. Precomputed: UCSC conservation tracks
        ii. Dynamic: e.g., rVista
  2. Different genes, same species
     a) Typical input is co-expressed clusters from microarray data
     b) Looking for over-represented, small binding sites
     c) Much better results if looking for a pattern or clustering of multiple sites
     d) Motif-finding algorithm, e.g., MEME
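The idea of an over-represented short site in approach 2 can be sketched by counting k-mers. The example below compares how often each k-mer occurs in a co-expressed cluster versus a background set; it is a simple enrichment count under stated assumptions, not the MEME algorithm, and the sequences, function names and threshold are invented for illustration.

```python
from collections import Counter


def kmer_presence(seqs, k):
    """Count in how many sequences each k-mer occurs at least once."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        # A set ensures each k-mer is counted once per sequence.
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    return counts


def enriched_kmers(cluster, background, k=6, min_ratio=2.0):
    """Return (k-mer, ratio) pairs enriched in `cluster` relative to `background`."""
    fg = kmer_presence(cluster, k)
    bg = kmer_presence(background, k)
    results = []
    for kmer, n in fg.items():
        fg_frac = n / len(cluster)
        # Add-one pseudocount avoids division by zero for k-mers absent from background.
        bg_frac = (bg[kmer] + 1) / (len(background) + 1)
        ratio = fg_frac / bg_frac
        if ratio >= min_ratio:
            results.append((kmer, ratio))
    return sorted(results, key=lambda item: (-item[1], item[0]))


# Toy upstream sequences (made up): the cluster shares one 7-mer.
cluster = ["AATGACTCAGG", "CCTGACTCATT", "GTGACTCAAC"]
background = ["ACGTACGTACGT", "GGGCCCAAATTT", "TTTTAAAACCCC"]
hits = enriched_kmers(cluster, background, k=7)
print(hits)
```

Exact k-mer counting misses degenerate sites and ignores site clustering, which is one reason dedicated motif finders are preferred in practice.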
Tutorials
  1. NCBI
     • Field Guide
     • Information and tutorials
     • Science Primer
  2. EBI
     • 2Can Tutorials
  3. UCSC
     • Genome Browser User’s Guide
  4. Bulk Downloads
     • Bulk Downloads Tutorial
IN CLASS EXERCISE
  1. Do all three browsers show the same number of transcript variants for APP, EGFR and TP53?
  2. How many SNPs appear in the 5’ UTR of APP?
  3. What is the lowest conservation score in APP exon 2?