UCSC Genome Browser Tutorial
http://genome.ucsc.edu/
The UCSC Toolset & Portal to the Human Genome
• Genome Browser
• Table Browser
“I was blind and now I can see”
http://cs273a.stanford.edu
UCSC Genome Browser [version 9a] http://www.openhelix.com/downloads/ucsc_home.shtml
Genome Browser helps visualize genome annotation
• A bare genome sequence is of limited use without functional annotation.
  GCTCTGAGATCTCCGGCTCCTTGGCCCGGGACTTTCTGCGCCCTGA (slide label: Exon)
• The genome browser is a tool for visualizing genome annotation.
The UCSC Homepage: http://genome.ucsc.edu
• Navigate
• General information
• Specific information: new features, current status, etc.
The Genome Browser Gateway: start page choices, December 2006
Make your Gateway choices:
1. Select clade
2. Select species: search 1 species at a time
3. Assembly: the official backbone DNA sequence
Practically speaking, there is no such thing as a genome; there is only a genome assembly. Assemblies are updated frequently. Think moving target...
Everything in Genomics is a Moving Target
• The genomes
• Their annotations
• The portals
• Our understanding of biology
Conclusion: write code that can be run... and rerun
The Genome Browser Gateway start page, basic search
The Genome Browser Gateway: start page choices, December 2006
Make your Gateway choices:
1. Select clade
2. Select species: search 1 species at a time
3. Assembly: the official backbone DNA sequence
4. Position: location in the genome to examine
5. Image width: how many pixels in the display window; 5000 max
6. Configure: make fonts bigger, plus other choices
The Genome Browser Gateway start page, basic search
Text/ID searches (helpful search examples and suggestions appear below the search box)
Use this Gateway to search by:
• Gene names, symbols
• Chromosome number: chr7, or region: chr11:1038475-1075482
• Keywords: kinase, receptor
• IDs: NP, NM, OMIM, and more…
See the lower part of the page for help with format.
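In the spirit of "write code that can be run... and rerun", a region string such as the one above can be parsed before it is used in a script. This is a minimal sketch: the names `Region` and `parse_position` are illustrative, and the accepted pattern (a `chr` name, a colon, two integers with optional commas) is an assumption about typical input, not the browser's own parser.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive, as typed into the position box
    end: int


# Hypothetical pattern: chromosome name, then start-end with optional commas.
_POSITION = re.compile(r"^(chr[0-9A-Za-z_]+):([\d,]+)-([\d,]+)$")


def parse_position(text: str) -> Region:
    """Parse a string such as 'chr11:1038475-1075482' into a Region."""
    match = _POSITION.match(text.replace(" ", ""))
    if match is None:
        raise ValueError(f"not a chrom:start-end position: {text!r}")
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start > end:
        raise ValueError("start must not exceed end")
    return Region(match.group(1), start, end)


region = parse_position("chr11:1038475-1075482")
print(region.chrom, region.end - region.start + 1)  # chromosome and span in bases
```

Rejecting malformed input early keeps a rerun against a newer assembly from silently querying the wrong place.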
The Genome Browser Gateway: sample search for human TP53
• Sample search: human, March 2006 assembly, tp53
• Select from the results list
• An ID search may go right to a viewer page, if unique
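The same search can be written down as a link, which makes the assembly choice explicit and repeatable. A minimal sketch, assuming that `hg18` is the database name for the March 2006 human assembly and that the viewer accepts `db` and `position` query parameters on `cgi-bin/hgTracks`; the helper name `browser_url` is made up for illustration.

```python
from urllib.parse import urlencode

# Assumed entry point of the genome viewer.
BASE = "http://genome.ucsc.edu/cgi-bin/hgTracks"


def browser_url(db: str, position: str) -> str:
    """Build a viewer link for one assembly and one search term or region."""
    return f"{BASE}?{urlencode({'db': db, 'position': position})}"


print(browser_url("hg18", "TP53"))
```

Recording the assembly name next to every position is what lets the query be rerun, or deliberately moved, when the assembly is updated.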
Overview of the whole Genome Browser page (mature release)
Genome viewer section
Groups of data:
• Mapping and Sequencing Tracks
• Genes and Gene Prediction Tracks
• mRNA and EST Tracks
• Expression and Regulation
• Comparative Genomics
• Variation and Repeats
• ENCODE Tracks
Different species, different tracks, same software
• Species may have different data tracks
• Layout, software, functions the same
Sample Genome Viewer image, TP53 region
Tracks shown: base position, STS markers, Known Genes, RefSeq genes, GenBank seqs, 17 species compared, single species compared, SNPs, repeats
Visual Cues on the Genome Browser
• Tick marks: a single location (STS, SNP)
• Gene structure: 3' UTR, exon, intron, 5' UTR; arrows (<<< or >>>) in the intron show the direction of transcription
• Track colors may have meaning. For example, in the Known Gene track:
  • If there is a corresponding PDB entry: black
  • If there is a corresponding NCBI Reviewed seq: dark blue
  • If there is a corresponding NCBI Provisional seq: light blue
• For some tracks, the height of a bar indicates increased likelihood of an evolutionary relationship (conservation track)
Options for Changing Images: Upper Section
Controls: walk left or right; click to zoom 3x and re-center; zoom in; zoom out; specify a position; configure fonts, window, more
• Change your view or location with controls at the top
• Use “base” to get right down to the nucleotides
• Configure: to change font, window size, more…
Annotation Track display options
Callouts: links to info and/or filters; change track view; enforce changes
• Some data is ON or OFF by default
• Menu links to info about the tracks: content, methods
• You change the view with pulldown menus
• After making changes, REFRESH to enforce the change
Annotation Track options, defined
• Hide: removes a track from view
• Dense: all items collapsed into a single line
• Squish: each item on a separate line, but 50% height and packed
• Pack: each item separate, but efficiently stacked (full height)
• Full: each item on a separate line
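The difference between the modes is easiest to see as a row assignment. The sketch below only illustrates the layout idea in the definitions above; it is not the browser's actual drawing code, and the function name `assign_rows` and the greedy stacking rule are assumptions made for illustration.

```python
from typing import List, Tuple

Item = Tuple[int, int]  # (start, end) of one annotation item; end is exclusive


def assign_rows(items: List[Item], mode: str) -> List[int]:
    """Return the display row of each item; an empty list means nothing is drawn."""
    if mode == "hide":
        return []
    if mode == "dense":
        return [0] * len(items)           # everything collapsed onto one line
    if mode == "full":
        return list(range(len(items)))    # one line per item
    if mode in ("pack", "squish"):        # squish stacks the same way at half height
        row_ends: List[int] = []          # rightmost end placed on each row so far
        rows = [0] * len(items)
        for index in sorted(range(len(items)), key=lambda i: items[i][0]):
            start, end = items[index]
            for row, row_end in enumerate(row_ends):
                if start >= row_end:      # fits to the right of that row's last item
                    row_ends[row] = end
                    rows[index] = row
                    break
            else:                         # no existing row has room: open a new one
                row_ends.append(end)
                rows[index] = len(row_ends) - 1
        return rows
    raise ValueError(f"unknown display mode: {mode!r}")


toy_items = [(0, 100), (50, 150), (120, 200), (300, 400)]  # made-up coordinates
for mode in ("dense", "pack", "full"):
    print(mode, assign_rows(toy_items, mode))
```

With these toy items, pack needs two rows where full needs four, which is the "efficiently stacked" behaviour the slide describes.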
Reset, Hide, Configure or Refresh to change settings
Callouts: reset, back to defaults (start from scratch); enforce any changes (hide, full, squish…)
• You control the views
• Use pulldown menus
• Configure options page
Annotation Track options, if altered… important point: the browser remembers!
• Session information (the position you were examining)
• Track choices (squish, pack, full, etc.)
• Filter parameters (if you changed the colors of any items, or the subset to be displayed)
…are all saved on your computer. When you come back in a couple of days to use it again, these will still be set. You may, or may not, intend this. To clear your “cart” or parameters, click default tracks OR
Saved Sessions
Click Any Viewer Object for Details
• Click the item: a new web page opens
• Example: click your mouse anywhere on the TP53 line
• The new page has many details and links to more data about TP53
Click annotation track item for details pages
Callouts: informative description; other resource links; links to sequences; microarray data; mRNA secondary structure; protein domains/structure; homologs in other species; Gene Ontology™ descriptions; mRNA descriptions; pathways
• Not all genes have this much detail.
• Different annotation tracks carry different data.
Get DNA, with Extended Case/Color Options
• Use the DNA link at the top
• Plain or Extended options
• Change colors, fonts, etc.
Get Sequence from Details Pages
Click a track, then go to the Sequence section of the details page
Callouts: click the line; click the item; sequence section on detail page
Accessing the BLAT tool
BLAT = BLAST-like Alignment Tool
• Rapid searches by INDEXING the entire genome
• Works best with high similarity matches
• See documentation and publication for details
  • Kent, WJ. Genome Res. 2002. 12:656
BLAT tool overview: www.openhelix.com/sampleseqs.html
Make choices:
• Paste one or more sequences, or upload
• DNA limit: 25000 bases
• Protein limit: 10000 aa
• 25 total sequences
• Submit
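The limits above can be checked before pasting, which saves a failed submission. A minimal sketch, assuming the base and amino-acid limits apply to each sequence individually (the slide does not say whether they are per sequence or per submission); `read_fasta` and `check_submission` are hypothetical helper names.

```python
from typing import List, Tuple

# Limits as stated on the slide.
MAX_DNA_BASES = 25_000
MAX_PROTEIN_AA = 10_000
MAX_SEQUENCES = 25


def read_fasta(text: str) -> List[Tuple[str, str]]:
    """Split pasted FASTA text into (name, sequence) pairs."""
    records: List[Tuple[str, List[str]]] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            records.append((line[1:].strip(), []))
        else:
            if not records:  # bare sequence pasted without a header line
                records.append(("query", []))
            records[-1][1].append(line)
    return [(name, "".join(parts)) for name, parts in records]


def check_submission(text: str, protein: bool = False) -> List[str]:
    """Return human-readable problems; an empty list means within the limits."""
    records = read_fasta(text)
    limit = MAX_PROTEIN_AA if protein else MAX_DNA_BASES
    problems = []
    if len(records) > MAX_SEQUENCES:
        problems.append(f"{len(records)} sequences submitted; limit is {MAX_SEQUENCES}")
    for name, sequence in records:
        if len(sequence) > limit:
            problems.append(f"{name}: {len(sequence)} residues; limit is {limit}")
    return problems


print(check_submission(">demo\nACGTACGT\n"))
```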
BLAT results, with links
Results with demo sequences, default settings; sort = Query, Score
Callouts: sorting; go to browser/viewer; go to alignment detail
• Score is a count of matches: higher number, better match
• Click browser to go to the Genome Browser image location (next slide)
• Click details to see the alignment to genomic sequence (2nd slide)
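The "Query, Score" ordering can be reproduced on downloaded results. A minimal sketch with made-up hits: the `Hit` fields are illustrative rather than the real output columns, and it assumes the higher score is listed first within each query.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    query: str     # name of the submitted sequence
    score: int     # match count reported for the hit
    location: str  # where the hit landed (placeholder values below)


def sort_query_score(hits: List[Hit]) -> List[Hit]:
    """Order hits by query name, then by descending score within each query."""
    return sorted(hits, key=lambda hit: (hit.query, -hit.score))


# Made-up results for illustration only.
toy_hits = [
    Hit("seqB", 120, "chr2"),
    Hit("seqA", 85, "chr9"),
    Hit("seqA", 410, "chr17"),
]
for hit in sort_query_score(toy_hits):
    print(hit.query, hit.score, hit.location)
```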
BLAT results, browser link
Callouts: query; click to flip frame
• From the BLAT results, click browser
• A new line, Your Sequence from BLAT Search, appears!
• Set Base position = full and zoom in enough to see amino acids
• Watch out for reading frame! Click - - - > to flip frame
BLAT results, alignment details
• Your query
• Genomic match, with color cues
• Side-by-side alignment: yours and genomic
Understand BLAT's Limitations
• BLAT was designed to rapidly align sequence from one genome back to itself (e.g., EST/cDNA data)
• It can, and at times does, miss clear hits
• BLAT actually allows for a single mismatch, but it also removes k-mers with excessive counts for efficiency
• Not suitable for cross-species mapping
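Why dropping over-represented k-mers can cost hits is easy to see in a toy index. This is a deliberately simplified sketch, not BLAT's implementation: it uses exact k-mer matches only (no single-mismatch seeds), and `k`, `max_count`, the function names, and the sequence are all made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def build_index(genome: str, k: int, max_count: int) -> Dict[str, List[int]]:
    """Index non-overlapping k-mers, dropping those seen more than max_count times."""
    positions = defaultdict(list)
    for start in range(0, len(genome) - k + 1, k):
        positions[genome[start:start + k]].append(start)
    return {kmer: where for kmer, where in positions.items() if len(where) <= max_count}


def seed_hits(query: str, index: Dict[str, List[int]], k: int) -> List[Tuple[int, int]]:
    """Return (query_offset, genome_offset) pairs for every exact k-mer match."""
    hits = []
    for offset in range(len(query) - k + 1):
        for genome_offset in index.get(query[offset:offset + k], []):
            hits.append((offset, genome_offset))
    return hits


toy_genome = "AAAA" * 5 + "ACGTTGCA"    # made-up sequence with a repetitive stretch
index = build_index(toy_genome, k=4, max_count=2)
print(seed_hits("CACGTT", index, k=4))  # the ACGT seed is found
print(seed_hits("AAAAAA", index, k=4))  # AAAA was dropped as over-represented
```

The second query matches the genome perfectly, yet produces no seed, which is the kind of miss the slide warns about.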
Bunch More Goodies – Click Around
Bibliography: http://genome.ucsc.edu/goldenPath/pubs.html
• The UCSC Genome Browser Database: update 2008, update 2007, and earlier.
• UCSC Genome Browser Tutorial
• UCSC Genome Browser: Deep support for molecular biomedical research
• The UCSC Known Genes, 2006.
• The UCSC Gene Sorter, 2007.
• Piloting the Zebrafish Genome Browser, 2006.
UCSC Genome Browser [version 9a]
Genome Browser Database
Callouts: visualize; search & download
Underlying database (MySQL):
• Primary table: positions, names, etc.
• Auxiliary table: related data
The Table Browser
Open browser: http://genome.ucsc.edu/
Table Browser: Choose Genome
Task: in the Human genome (hg16), search for simple repeats in a chromosome 4 region with copy number greater than 10, and download the sequence.
Table Browser: Choose Table to Search
Choose data table
Table Browser: Describe Table
Table Browser: Choose Region to Search
Table Browser: Upload Locations to Search
Paste or upload
Table Browser: Filter to Refine Search
Create filter, then submit filter
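The running task (simple repeats in a chromosome 4 region with copy number greater than 10) amounts to a filter on a table. The sketch below mimics it with an in-memory SQLite table so it can be run anywhere; the underlying database is MySQL, the column names (`chrom`, `chromStart`, `chromEnd`, `name`, `copyNum`) are assumed, and the rows and region bounds are invented purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE simpleRepeat ("
    "chrom TEXT, chromStart INTEGER, chromEnd INTEGER, name TEXT, copyNum REAL)"
)
# Made-up rows for illustration only; they are not real annotations.
conn.executemany(
    "INSERT INTO simpleRepeat VALUES (?, ?, ?, ?, ?)",
    [
        ("chr4", 1000, 1060, "repeatA", 20.0),
        ("chr4", 5000, 5020, "repeatB", 4.5),
        ("chr4", 900000, 900090, "repeatC", 30.0),
        ("chr7", 1000, 1050, "repeatD", 25.0),
    ],
)


def filtered_repeats(connection, chrom, region_start, region_end, min_copies):
    """Repeats overlapping the region whose copy number exceeds min_copies."""
    query = (
        "SELECT chrom, chromStart, chromEnd, name, copyNum FROM simpleRepeat "
        "WHERE chrom = ? AND chromStart < ? AND chromEnd > ? AND copyNum > ? "
        "ORDER BY chromStart"
    )
    params = (chrom, region_end, region_start, min_copies)
    return connection.execute(query, params).fetchall()


print(filtered_repeats(conn, "chr4", 0, 10_000, 10))
```

Keeping the filter as a parameterized query means the same search can be repeated on another region or assembly by changing only the arguments.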
Table Browser: Output Data
Table Browser: Output Formats
Text fields
Table Browser: FASTA Sequence Output
Table Browser: Database Format Outputs
Table Browser: Custom Track Output
Table Browser: Hyperlinks Output
Table Browser: Obtaining Output
• Adding a file name creates a file on your desktop; leaving it blank shows the output in the browser (exception: custom track)
• Data summary
Table Browser: Output Configuration
Choose the sequence format, then get sequence
Table Browser: Intersecting Data
Callouts: 2nd table; any overlap; intersect; submit
Task: find simple repeats (copy number > 10) within known genes and download the sequence.
Table Browser: Intersecting Data Narrows Search
• Summary: filtered simple repeats
• Summary: filtered simple repeats, intersected (overlapping) with known genes
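The "any overlap" intersection keeps a repeat if it touches at least one gene, which is why the second summary is smaller than the first. A minimal sketch of that rule, assuming 0-based starts with exclusive ends; the function names and all coordinates are made up for illustration.

```python
from typing import List, Tuple

Feature = Tuple[str, int, int]  # (chrom, start, end); start 0-based, end exclusive


def overlaps(a: Feature, b: Feature) -> bool:
    """True when both features share a chromosome and at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def intersect_any(features: List[Feature], others: List[Feature]) -> List[Feature]:
    """Keep each feature that overlaps at least one item in others."""
    return [f for f in features if any(overlaps(f, o) for o in others)]


# Made-up coordinates for illustration only.
toy_repeats = [("chr4", 100, 160), ("chr4", 5000, 5030), ("chr7", 100, 160)]
toy_genes = [("chr4", 50, 120), ("chr4", 9000, 9500)]
print(intersect_any(toy_repeats, toy_genes))
```

This pairwise check is quadratic; for genome-scale tables a sorted sweep or an interval index would be the usual choice.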
Table Browser: Downloading Sequence Data
Choose the sequence format, then get sequence
Table Browser: Correlating Data Tables
Callouts: correlate 2 datasets; get results
Custom Tracks: Table Browser Searches
Callouts: create track; get output
Custom Tracks: Name and Configure Track
• Name track: SRepeatKGenes
• Describe track: Intersection …
• Choose default view in browser
• Download track file to desktop, or open in Genome Browser
Custom Tracks: Open Track in Genome Browser
Callouts: open; details; compare
“…caused by an expanded, unstable trinucleotide repeat…”
Custom Tracks: Track in Table Browser
Custom tracks are also available for filtering and intersections in the Table Browser
Custom Tracks: User-generated Data in Track
Callouts: Custom Track how-to; Custom Tracks link
Custom Tracks: Four Steps to Create Track
Four steps to create a custom track:
1. Define track characteristics
2. Define browser characteristics
3. Format your data
4. Upload and view your track
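The first three steps come together in a small text file: a `browser` line, a `track` line, and the data rows. The sketch below writes one in BED-style columns; the helper name `custom_track_text`, the description, and the coordinates are placeholders, and only the `name`, `description`, and `visibility` track attributes are assumed here.

```python
from typing import Iterable, Tuple

BedRow = Tuple[str, int, int, str]  # (chrom, start, end, item name)


def custom_track_text(
    name: str,
    description: str,
    visibility: str,
    position: str,
    features: Iterable[BedRow],
) -> str:
    """Assemble the text of a simple custom track, ready to paste or upload."""
    lines = [
        f"browser position {position}",                                          # browser characteristics
        f'track name={name} description="{description}" visibility={visibility}',  # track characteristics
    ]
    for chrom, start, end, label in features:                                     # formatted data rows
        lines.append(f"{chrom}\t{start}\t{end}\t{label}")
    return "\n".join(lines) + "\n"


# Placeholder region and items, for illustration only.
text = custom_track_text(
    name="SRepeatKGenes",
    description="Example intersection track",
    visibility="pack",
    position="chr4:1000-2000",
    features=[("chr4", 1000, 1060, "repeat1"), ("chr4", 1500, 1530, "repeat2")],
)
print(text)
```

The fourth step, uploading and viewing, is done through the custom track submission page.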
Custom Tracks: Submit Track
• Submit a file
• Copy and paste small or simple tracks
• Format help: http://genome.ucsc.edu/FAQformat
Custom Tracks: Track Appears in Genome Browser
Custom Tracks: Track Characteristics
• Default view of custom track is “pack”
• Default view of other tracks set
Custom Tracks: Track Appears in Table Browser
Custom track also appears in the Table Browser
Custom Tracks from Outside Sources
Callouts: contributed track; Custom Tracks link
Bibliography: http://genome.ucsc.edu/goldenPath/pubs.html
• The UCSC Table Browser, 2004.
• Bejerano et al., Nature Methods, 2005.
• The UCSC Proteome Browser
• Phylogenomic Resources at the UCSC Genome Browser