Comparative Genomics Gene Regulatory Networks GRNs Anil Jegga

Comparative Genomics Gene Regulatory Networks (GRNs)
Anil Jegga, Biomedical Informatics
Session 2: February 24, 2012
Contact Information: Anil Jegga, Biomedical Informatics, Room # 232, S Building 10th Floor, CCHMC
Homepage: http://anil.cchmc.org
Tel: 513-636-0261
E-mail: anil.jegga@cchmc.org
Additional exercise available at: http://anil.cchmc.org/grn.html
2/24/2012 Jegga Biomedical Informatics

Session 1: Overview of GRNs (Feb 23)
a. Computational Approaches
b. Cis-Element Identification
c. Comparative Genomics
d. Regulatory region variations
e. p53 case study
Session 2: Database Session (Feb 24)
a. Genome Browsers
b. Promoter Analysis, TFBS Search
c. Co-regulated gene analysis

Session 2 (Databases/Servers) Feb 24, 2012
a. Genome Browsers
b. Promoter Analysis, TFBS Search
c. Co-regulated gene analysis

Genome Browser (http://genome.ucsc.edu)

Genome Browser Gateway choices:
1. Select Clade
2. Select genome/species: You can search only one species at a time
3. Assembly: the official backbone DNA sequence
4. Position: location in the genome to examine or search term (gene symbol, accession number, etc.)
5. Image width: how many pixels in display window; 5000 max
6. Configure: make fonts bigger + other options

Explore the tracks: Genome Browser (http://genome.ucsc.edu)

What if I want to download promoter sequences for several genes at a time?

Genome Browser (http://genome.ucsc.edu)
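One way to prepare a batch request for several genes is to compute the upstream interval of each gene first and then pass the intervals to whatever sequence retrieval tool you use. The sketch below is only an illustration: the gene names, coordinates, the 0-based half-open (BED-style) convention, and the function name `promoter_interval` are assumptions, not part of the Genome Browser.

```python
def promoter_interval(chrom, tss, strand, upstream=1000, downstream=0):
    """Return a BED-style (0-based, half-open) promoter interval.

    tss is the 0-based position of the first transcribed base.
    """
    if strand == "+":
        start, end = tss - upstream, tss + downstream
    elif strand == "-":
        # On the minus strand, "upstream" lies at higher coordinates.
        start, end = tss + 1 - downstream, tss + 1 + upstream
    else:
        raise ValueError("strand must be '+' or '-'")
    return chrom, max(0, start), end, strand


# Hypothetical genes and coordinates, for illustration only.
genes = {
    "geneA": ("chr1", 5000, "+"),
    "geneB": ("chr2", 8000, "-"),
}

bed_lines = []
for name, (chrom, tss, strand) in genes.items():
    c, start, end, s = promoter_interval(chrom, tss, strand, upstream=500)
    bed_lines.append(f"{c}\t{start}\t{end}\t{name}\t0\t{s}")

print("\n".join(bed_lines))
```

The strand check matters: for a minus-strand gene the promoter lies at higher genomic coordinates than the transcription start site, so the interval is mirrored rather than shifted.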

Other Genome Browsers: ENSEMBL http://www.ensembl.org

I have a promoter sequence. How do I scan it for known TFBSs?

JASPAR: http://jaspar.genereg.net
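Scanning a promoter for known TFBSs usually means sliding a position weight matrix along the sequence and reporting windows that score above a threshold. The sketch below illustrates that idea only. The count matrix is made up (it is not a JASPAR profile), the pseudocount, uniform background, and threshold are assumptions, and only the forward strand is scanned.

```python
import math

# Toy count matrix (one list per base, one entry per motif position).
# These numbers are invented for illustration; they are not a JASPAR profile.
TOY_COUNTS = {
    "A": [8, 0, 0, 1],
    "C": [0, 0, 9, 1],
    "G": [1, 10, 0, 0],
    "T": [1, 0, 1, 8],
}


def log_odds_matrix(counts, pseudocount=0.5, background=0.25):
    """Convert counts to log2 odds against a uniform background."""
    width = len(counts["A"])
    matrix = []
    for i in range(width):
        total = sum(counts[b][i] for b in "ACGT") + 4 * pseudocount
        matrix.append({
            b: math.log2((counts[b][i] + pseudocount) / total / background)
            for b in "ACGT"
        })
    return matrix


def scan(sequence, matrix, threshold):
    """Return (start, window, score) for forward-strand windows >= threshold."""
    sequence = sequence.upper()
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing N or other symbols
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, round(score, 2)))
    return hits


pwm = log_odds_matrix(TOY_COUNTS)
print(scan("TTAGCTTT", pwm, threshold=5.0))
```

A real analysis would also scan the reverse complement and choose the threshold per matrix, which the web tools handle for you.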

Gene-Regulation: http://www.gene-regulation.com
Need to have an account (free for academic use)

How can I identify putative regulatory regions for a gene or microRNA?

I have found a miRNA enriched in my gene list, or I am interested in a specific gene, and I want to identify putative regulatory regions for the miRNA/gene.
GenomeTrafac: http://genometrafac.cchmc.org

DCODE: http://www.dcode.org/

ECR Browser: http://ecrbrowser.dcode.org/
Multispecies (not limited to pairwise comparisons)

RESOURCES - URLs: Summary
Application/Resource    URL
Genome Browser          http://genome.ucsc.edu
JASPAR                  http://jaspar.genereg.net/
Gene Regulation         http://www.gene-regulation.com
GenomeTrafac            http://genometrafac.cchmc.org
DCODE                   http://www.dcode.org/

I have a list of co-expressed mRNAs (Transcriptome). I want to find the shared cis-elements: known and novel.
Known transcription factor binding sites (TFBS)
  Conserved
    • oPOSSUM
    • DiRE
  Non-conserved
    • Pscan
    • MatInspector (*Licensed)
Unknown TFBS or novel motifs
  Conserved
    • oPOSSUM
    • Weeder-H
  Non-conserved
    • MEME
    • Weeder
Notes:
1. Each of these applications supports different forms of input. Very few support probeset IDs.
2. Red font: input sequence required; these tools do not support gene symbols, gene IDs, or accession numbers. The advantage is that you can use them for scanning sequences from any species.
3. *Licensed software: We have access to the licensed version.

oPOSSUM (http://burgundy.cmmt.ubc.ca/oPOSSUM/)
Supports human and mouse

oPOSSUM (http://www.cisreg.ca/oPOSSUM)
Disadvantage: supports only human or mouse

oPOSSUM (http://www.cisreg.ca/oPOSSUM)
The JASPAR PHYLOFACTS database consists of 174 profiles that were extracted from phylogenetically conserved gene upstream elements. They are a mix of known and as-yet-undefined motifs.
When should it be used? These profiles are useful when one expects that other factors might determine promoter characteristics and/or tissue specificity.

oPOSSUM (http://www.cisreg.ca/oPOSSUM)
The Fisher statistic reflects the proportion of genes that contain the TFBS compared to background.
The Z-score statistic reflects the occurrence of the TFBS in the promoters of the co-expressed set compared to background.
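The two statistics can be illustrated with a small sketch. This is not oPOSSUM's own code and its exact formulas may differ: the Fisher part is a standard one-sided exact test on gene counts (target and background sets treated as disjoint), and the Z-score part is a plain normal approximation to the binomial on site counts. All function and parameter names are assumptions made for the example.

```python
from math import comb, sqrt


def fisher_one_sided(hits_target, n_target, hits_background, n_background):
    """P(at least hits_target genes with the TFBS in the target set) under
    the hypergeometric null. Target and background sets are disjoint."""
    total = n_target + n_background
    total_hits = hits_target + hits_background
    upper = min(n_target, total_hits)
    favourable = sum(
        comb(total_hits, k) * comb(total - total_hits, n_target - k)
        for k in range(hits_target, upper + 1)
    )
    return favourable / comb(total, n_target)


def z_score(target_sites, target_length, bg_sites, bg_length):
    """Compare the number of sites in the target promoters with the number
    expected from the background rate (sites per scanned position)."""
    rate = bg_sites / bg_length
    expected = rate * target_length
    sd = sqrt(target_length * rate * (1 - rate))
    return (target_sites - expected) / sd


# Hypothetical counts, for illustration only.
print(fisher_one_sided(hits_target=12, n_target=20,
                       hits_background=300, n_background=2000))
print(z_score(target_sites=30, target_length=10000,
              bg_sites=100, bg_length=100000))
```

The Fisher test asks whether unusually many genes carry the site at all, while the Z-score asks whether sites occur unusually often, so a motif repeated many times in a few promoters can score high on one and not the other.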

DiRE (http://dire.dcode.org/)
ECR-Browser (http://ecrbrowser.dcode.org/)

Pscan (http://159.149.109.9/pscan)

Pscan (http://159.149.109.9/pscan): comparing different input gene sets
1. In the detailed output for a given matrix, you can compare the results obtained with the matrix on the gene set just submitted against the results the same matrix produced on another gene set. The latter could be a "negative" gene set (or vice versa).
2. To perform the comparison, fill in the "Compare with..." box fields with the mean, standard deviation, and sample size values of the other analysis. For the current analysis, you can find them in the "Sample Data Statistics" box or in the overall text output that can be downloaded from the main output page.
3. Warning: Make sure that the values you input are correct, and especially that they were obtained using the same matrix. Once you click the "Go!" button, an output window pops up and reports whether either of the two means is significantly higher than the other, together with a p-value computed with a Welch t-test.
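The comparison needs only the three summary values per gene set, which is why the form asks for mean, standard deviation, and sample size. A minimal sketch of the Welch t statistic and its degrees of freedom from such summaries is shown below; the function name and the example numbers are assumptions, and the p-value step (looking up the statistic in a t distribution with the computed degrees of freedom) is left out.

```python
from math import sqrt


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom
    computed from summary statistics of two independent samples."""
    v1 = sd1 ** 2 / n1  # squared standard error of the first mean
    v2 = sd2 ** 2 / n2  # squared standard error of the second mean
    t = (mean1 - mean2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical summary values for the same matrix on two gene sets.
t, df = welch_t(mean1=0.82, sd1=0.05, n1=40, mean2=0.79, sd2=0.06, n2=55)
print(f"t = {t:.3f}, df = {df:.1f}")
```

Unlike the pooled-variance t-test, this form does not assume the two gene sets have equal variance, which suits gene sets of different size and composition.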

I have a list of co-expressed mRNAs (Transcriptome). I want to find the shared cis-elements: unknown TFBS or novel motifs.
Several of these applications require input sequence, so fetch the promoter/upstream sequences first.
Use the fetched promoter/upstream sequences for the following analyses.

WeederH (http://159.149.109.9/modtools/)
1. Supports a large number of species.
2. Does not support multiple-sequence (multi-FASTA) input. You have to enter each sequence separately.
3. Good for a small number of sequences where you expect a potential novel conserved motif (or one not included in the TFBS libraries).
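If your promoter sequences are stored in one multi-FASTA file, they have to be separated before they can be entered one at a time. The sketch below shows one simple way to split such text into individual records; the function name and the example sequences are made up for illustration.

```python
def split_fasta(text):
    """Split multi-FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical two-sequence input.
example = ">promoter_1\nACGTAC\nGT\n>promoter_2\nGGCC\n"
for name, seq in split_fasta(example):
    print(f">{name}\n{seq}\n")
```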

Weeder (http://159.149.109.9/modtools/)
Do not use GroupWise mail when submitting a large number of sequences: the results are sent in the body of the e-mail, not as an attachment, and GroupWise truncates very long messages. Use Gmail instead. Previously, a link to the results page was sent.

MEME (http://meme.sdsc.edu)
MEME takes as input a group of DNA or protein sequences and outputs as many motifs as requested. MEME uses statistical modeling techniques to automatically choose the best width, number of occurrences, and description for each motif.
Your MEME results consist of:
• your MEME results in HTML format
• your MEME results in XML format
• your MEME results in TEXT format
• the MAST results of searching your input sequences for the motifs found by MEME

MEME (http://meme.sdsc.edu)
TOMTOM can be used to find out whether an overrepresented motif in your sequences matches or is similar to a known TFBS.

Summary: Cis-Element Finding Matrix
                               CONSERVED            NON-CONSERVED
KNOWN TFBS                     oPOSSUM, DiRE        Pscan, MatInspector*
NOVEL/UNKNOWN TFBS OR MOTIFS   oPOSSUM, WEEDER-H    MEME, WEEDER

RESOURCES - URLs: Summary
Application/Resource    URL
oPOSSUM                 http://burgundy.cmmt.ubc.ca/oPOSSUM/
DiRE                    http://dire.dcode.org/
Weeder-H                http://159.149.109.9/modtools/
Weeder                  http://159.149.109.9/modtools/
Pscan                   http://159.149.109.9/pscan
MEME                    http://meme.sdsc.edu/
MatInspector            http://www.genomatix.de/
Genome Browser          http://genome.ucsc.edu
ECR Browser             http://ecrbrowser.dcode.org
Additional exercise available at: http://anil.cchmc.org/grn.html