BI 420 – Introduction to Bioinformatics: Gene Expression Analysis
Department of Biology, Boston College
Why study gene expression?
Which genes are active
• at different developmental stages?
• in cells of different tissues?
• at different time points in the same cell?
• in cells under different environmental conditions?
• in normal versus cancerous cells?
Calling Differential Expression
Challenges in Measuring Expression
Differential expression: is the difference in expression between the test and the control greater than the uncertainty in the measurement?
Gene expression naturally bounces around a lot
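The question of whether a difference exceeds the measurement uncertainty can be sketched numerically. The example below is a minimal illustration, not the method used in the course: it assumes Welch's t statistic as the measure (difference in means divided by its standard error), and all expression values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def welch_t(test, control):
    """Difference in means divided by its standard error (Welch's t statistic)."""
    standard_error = sqrt(
        stdev(test) ** 2 / len(test) + stdev(control) ** 2 / len(control)
    )
    return (mean(test) - mean(control)) / standard_error

# Hypothetical expression values for one gene, three replicates per condition.
control = [10.0, 12.0, 11.0]
noisy_test = [8.0, 20.0, 14.0]    # mean 14, but the replicates disagree
steady_test = [13.5, 14.0, 14.5]  # mean 14, and the replicates agree

t_noisy = welch_t(noisy_test, control)
t_steady = welch_t(steady_test, control)
print(f"noisy replicates:  t = {t_noisy:.2f}")
print(f"steady replicates: t = {t_steady:.2f}")
```

Both test conditions differ from the control by the same amount on average, but only the steady replicates give a difference that is large relative to the noise.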
Methods for Measuring Gene Expression (all genes in the cell)
• Microarrays (older, cheaper)
• Sequencing-based measurement: RNA-Seq (replacing microarrays)
Expression microarray movie
DNA microarray chip animation: http://www.bio.davidson.edu/Courses/genomics/chip.html
What are expression microarrays?
Expression microarrays – “physical appearance”
cDNA preparation
Expression assay
Chip readout – absolute expression and ratio
Chip readout – relative transcription
Chip readout – example
Time course experiments
Experiment: measuring gene expression as oxygen gets depleted in yeast grown in a closed container
Time course data
Data analysis – normalization
• balance fluorescent intensities of two dyes
• adjust for differences in experimental conditions
Normalization
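Balancing the two dye intensities can be sketched as follows. This is one simple approach (median centering of log ratios), chosen here as an assumption; it presumes that most genes do not change between the two samples. The intensities are hypothetical.

```python
from math import log2
from statistics import median

# Hypothetical raw intensities for five spots.
# The red channel reads about twice as bright as the green one overall.
red = [200.0, 400.0, 800.0, 1600.0, 6400.0]
green = [100.0, 200.0, 400.0, 800.0, 800.0]

# Log ratio of the two channels for each spot.
log_ratios = [log2(r / g) for r, g in zip(red, green)]

# Subtract the median so that the typical spot has a log ratio of 0.
shift = median(log_ratios)
normalized = [value - shift for value in log_ratios]

print("before:", log_ratios)
print("after: ", normalized)
```

After centering, the four spots that only reflected the dye imbalance sit at 0, and the last spot still stands out.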
Log2 transformation
A doubling and a halving of expression now have the same magnitude (+1 and -1)
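A short worked example of the symmetry that the log2 transformation provides. The ratios are illustrative test/control values, not measured data.

```python
from math import log2

# Illustrative test/control expression ratios.
ratios = {"doubled": 2.0, "unchanged": 1.0, "halved": 0.5}

# On the raw scale, doubling is 1.0 away from "unchanged" but halving only 0.5.
# On the log2 scale, both are exactly one unit away, in opposite directions.
log_ratios = {label: log2(ratio) for label, ratio in ratios.items()}

for label, value in log_ratios.items():
    print(f"{label:>9}: ratio = {ratios[label]:.1f}, log2 ratio = {value:+.1f}")
```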
Clustering – intro
• Why: if the expression pattern for gene B is similar to that of gene A, maybe they are involved in the same or a related pathway
• How: re-order expression vectors in the data set so that similar patterns are together
Clustering – numerical
Clustering – visual
Hierarchical clustering: pair-wise similarity
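Pair-wise similarity between expression vectors can be sketched with the Pearson correlation coefficient. The choice of Pearson correlation is an assumption made for illustration (other measures, such as Euclidean distance, are also used), and the gene profiles are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equally long expression vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / sqrt(spread_x * spread_y)

# Hypothetical log2 expression profiles over four time points.
profiles = {
    "gene A": [0.0, 1.0, 2.0, 3.0],
    "gene B": [0.1, 1.1, 1.9, 3.2],  # rises like gene A
    "gene C": [3.0, 2.0, 1.0, 0.0],  # falls while gene A rises
}

# Similarity for every unordered pair of genes.
names = list(profiles)
similarity = {
    (a, b): pearson(profiles[a], profiles[b])
    for i, a in enumerate(names)
    for b in names[i + 1:]
}

for pair, value in similarity.items():
    print(pair, f"{value:+.3f}")
```

Genes A and B come out strongly positively correlated, while gene C is anti-correlated with gene A.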
Hierarchical clustering: cluster construction
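Cluster construction can be sketched as repeated merging of the two closest clusters. The sketch below assumes Euclidean distance and average linkage, neither of which is specified on the slide, and uses hypothetical gene profiles.

```python
from itertools import combinations
from math import dist

def average_linkage(cluster_a, cluster_b, profiles):
    """Mean Euclidean distance over all gene pairs drawn from two clusters."""
    pairs = [(a, b) for a in cluster_a for b in cluster_b]
    return sum(dist(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)

def hierarchical_clustering(profiles):
    """Merge the two closest clusters until one remains; return the merge order."""
    clusters = [(name,) for name in profiles]  # start with one gene per cluster
    merges = []
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], profiles),
        )
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b))
    return merges

# Hypothetical log2 expression profiles over four time points.
profiles = {
    "gene A": [0.0, 1.0, 2.0, 3.0],
    "gene B": [0.1, 1.1, 1.9, 3.2],
    "gene C": [3.0, 2.0, 1.0, 0.0],
    "gene D": [2.9, 2.2, 0.8, 0.1],
}

merges = hierarchical_clustering(profiles)
for step, (a, b) in enumerate(merges, start=1):
    print(f"step {step}: merge {a} with {b}")
```

The order of merges is what a dendrogram draws: the earliest merges join the most similar genes.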
Clustering – large example
Application of microarrays: classification of cancers
RNA-Seq
Measuring Gene Expression With RNA-Seq
|------Annotated Gene------|
Genome: ACCCAATTTTCTGAAAATATCCGTGTCTTCCAG
• Align reads
• Count the number of reads that align uniquely within the regions of annotated genes (Shotgun, Cap-Trap, SAGE)
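The counting step can be sketched as below. This is a simplified illustration: it assumes the reads have already been aligned and filtered to unique alignments by an external aligner, and it counts a read only when it lies entirely within a gene. The gene coordinates and read positions are hypothetical.

```python
# Hypothetical annotated genes as (start, end) genome coordinates.
genes = {"gene A": (100, 200), "gene B": (300, 450)}

# Hypothetical uniquely aligned reads as (start, end) genome coordinates.
reads = [(110, 160), (150, 200), (190, 240), (320, 370), (500, 550)]

def count_reads(genes, reads):
    """Count the reads that fall entirely within each annotated gene."""
    counts = {name: 0 for name in genes}
    for read_start, read_end in reads:
        for name, (gene_start, gene_end) in genes.items():
            if gene_start <= read_start and read_end <= gene_end:
                counts[name] += 1
    return counts

counts = count_reads(genes, reads)
print(counts)
```

The read at (190, 240) overhangs the end of gene A and the read at (500, 550) lies outside any annotation, so neither is counted.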
You get something like this

        Gene A  Gene B  Gene C  Gene D  Gene E
Rep 1        5      16      10       3    1504
Rep 2        3      25      15      15    1005
Rep 3       12      35       3       8    1030

* Skewed distribution
* Genes bounce around in replicates
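Both observations can be checked directly on the counts above. The sketch below uses the coefficient of variation (standard deviation divided by mean) as a measure of how much a gene bounces around between replicates, and compares the mean and median of the per-gene means to show the skew; these particular summary statistics are choices made here for illustration.

```python
from statistics import mean, median, stdev

# Read counts from the table above: three replicates per gene.
counts = {
    "Gene A": [5, 3, 12],
    "Gene B": [16, 25, 35],
    "Gene C": [10, 15, 3],
    "Gene D": [3, 15, 8],
    "Gene E": [1504, 1005, 1030],
}

# Replicate variability: coefficient of variation per gene.
gene_means = {gene: mean(values) for gene, values in counts.items()}
cv = {gene: stdev(values) / gene_means[gene] for gene, values in counts.items()}

for gene in counts:
    print(f"{gene}: mean = {gene_means[gene]:8.1f}, CV = {cv[gene]:.2f}")

# Skew: one highly expressed gene pulls the mean far above the median.
mean_of_means = mean(gene_means.values())
median_of_means = median(gene_means.values())
print(f"mean of gene means = {mean_of_means:.1f}, median = {median_of_means:.1f}")
```

The low-count genes vary far more between replicates, relative to their mean, than the highly expressed Gene E.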
Gene Ontology Enrichment
• Ontology: an explicit formal specification of how to represent the objects, concepts and other entities
• Gene Ontology: structured vocabulary used to describe aspects of the cell
• Arranged in a hierarchy
• Someone curates an ontology
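Enrichment asks whether a Gene Ontology term appears among the differentially expressed genes more often than chance would predict. The slide does not name a test; the sketch below assumes a one-sided hypergeometric test, which is one common choice, and all gene counts are hypothetical.

```python
from math import comb

def enrichment_p_value(population, annotated, selected, hits):
    """Probability of at least `hits` annotated genes in a random selection.

    population: all genes measured
    annotated:  genes in the population carrying the GO term
    selected:   differentially expressed genes
    hits:       selected genes that carry the GO term
    """
    total = comb(population, selected)
    upper = min(annotated, selected)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, selected - i)
        for i in range(hits, upper + 1)
    )
    return favourable / total

# Hypothetical numbers: 1000 genes measured, 50 carry the term,
# 20 genes are differentially expressed, 5 of those carry the term.
p_value = enrichment_p_value(population=1000, annotated=50, selected=20, hits=5)
print(f"p = {p_value:.4f}")
```

By chance, only about one of the 20 selected genes would be expected to carry the term, so five hits gives a small p-value.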
Matlab example: Analyzing Gene Expression Profiles
Matlab example: Gene Ontology Enrichment in Microarray Data
Typical Steps in a Gene Expression Experiment
• Isolate RNA from several biological replicates (see slides 3-4)
• Microarray
  – Label
  – Hybridize to microarray
  – Image microarray
  – Normalize data
• RNA-Seq
  – Sequence RNA
  – Align to genome
  – Count aligned reads
  – Normalize data between samples
• Analysis (independent of measurement technique)
  – Call differential gene expression
  – Clustering
  – Gene Ontology enrichment
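The "normalize data between samples" step for RNA-Seq can be sketched as a rescaling by sequencing depth. Counts per million (CPM) is used here as an assumption; it is one simple scheme among several, and the slides do not say which one the course uses. The counts are those from the replicate table shown earlier.

```python
# Read counts per replicate, taken from the replicate table in these slides.
counts = {
    "Rep 1": {"Gene A": 5, "Gene B": 16, "Gene C": 10, "Gene D": 3, "Gene E": 1504},
    "Rep 2": {"Gene A": 3, "Gene B": 25, "Gene C": 15, "Gene D": 15, "Gene E": 1005},
    "Rep 3": {"Gene A": 12, "Gene B": 35, "Gene C": 3, "Gene D": 8, "Gene E": 1030},
}

def counts_per_million(sample):
    """Rescale one sample so that its counts sum to one million."""
    total = sum(sample.values())
    return {gene: count * 1_000_000 / total for gene, count in sample.items()}

cpm = {replicate: counts_per_million(sample) for replicate, sample in counts.items()}

for replicate, sample in cpm.items():
    print(replicate, {gene: round(value) for gene, value in sample.items()})
```

After rescaling, every replicate has the same total, so differences in sequencing depth no longer masquerade as differences in expression.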