RNA-seq

TopHat
  • Maps reads against the reference genome, taking splicing events into account
  • Build index
      bowtie2-build reference.fa <index_base>
  • Mapping
      tophat2 -p <number_of_threads> -G <gff3_file> -o <output_dir> <index_base> <fastq_file_1> <fastq_file_2>
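The index and mapping steps can be scripted per sample. Below is a minimal Python sketch, assuming `bowtie2-build` and `tophat2` are on the PATH. `bowtie2-build` takes the FASTA file plus an index base name, and TopHat is then given that same base name. The helper names (`build_index_cmd`, `tophat_cmd`) and all file names are hypothetical. The code only builds the argument lists and prints them, without running anything.

```python
import shlex


def build_index_cmd(reference_fa, index_base):
    """Argument list for building a Bowtie 2 index from a FASTA file."""
    return ["bowtie2-build", reference_fa, index_base]


def tophat_cmd(index_base, fastq_1, fastq_2, gff3_file, output_dir, threads=4):
    """Argument list for a paired-end TopHat2 run guided by an annotation file."""
    return [
        "tophat2",
        "-p", str(threads),   # number of threads
        "-G", gff3_file,      # known gene annotation
        "-o", output_dir,     # output directory (will hold accepted_hits.bam)
        index_base, fastq_1, fastq_2,
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    commands = [
        build_index_cmd("reference.fa", "reference"),
        tophat_cmd("reference", "s1_1.fastq", "s1_2.fastq", "genes.gff3", "tophat_s1"),
    ]
    for cmd in commands:
        print(shlex.join(cmd))
```

To actually run a command, pass the list to `subprocess.run(cmd, check=True)`. Keeping the arguments as a list avoids shell-quoting problems with unusual file names.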

Cufflinks
  • Make GTF (assembled transcripts are written to <cufflinks_out_dir>/transcripts.gtf)
      cufflinks -p <number_of_threads> -o <cufflinks_out_dir> -g <gff_file> <tophat_out_dir>/accepted_hits.bam

Checking DEG
  • Using the Cufflinks package
  • Common tools to identify DEG
      • edgeR
          • Replicates are needed
          • Identifies DEG with a more statistically rigorous method
      • GFOLD
          • Gives statistical results for experiments without replicates or with different numbers of replicates per condition

Cufflinks packages
  • Cuffmerge: merging GTF files
  • The locations of the transcripts.gtf files produced by Cufflinks should be listed in assembly.txt, one per line
      <cufflinks_out_dir_1>/transcripts.gtf
      <cufflinks_out_dir_2>/transcripts.gtf
  • cuffmerge -g <gff3_file> -s <reference.fa> -p <number_of_threads> <assembly.txt>
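The assembly list can be generated rather than typed by hand. This is a minimal Python sketch, assuming each Cufflinks output directory contains a `transcripts.gtf`. The function name `write_assembly_list` and the directory names are hypothetical.

```python
from pathlib import Path


def write_assembly_list(cufflinks_dirs, list_path="assembly.txt"):
    """Write one transcripts.gtf path per line, the layout cuffmerge reads."""
    gtf_paths = [str(Path(d) / "transcripts.gtf") for d in cufflinks_dirs]
    Path(list_path).write_text("\n".join(gtf_paths) + "\n")
    return gtf_paths


if __name__ == "__main__":
    # Hypothetical Cufflinks output directories, for illustration only.
    for path in write_assembly_list(["cufflinks_out_dir_1", "cufflinks_out_dir_2"]):
        print(path)
```

The resulting file is what gets passed to cuffmerge as its final argument.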

Cufflinks packages
  • Cuffdiff: identify DEG
      cuffdiff -o <cuffdiff_out_dir> -b <reference.fa> -p <number_of_threads> -L <label_of_bam_1,label_of_bam_2,...> -u <gtf_file (merged_asm/merged.gtf)> <tophat_out_dir_1>/accepted_hits.bam <tophat_out_dir_2>/accepted_hits.bam
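Cuffdiff writes its gene-level tests to `gene_exp.diff` in the output directory, a tab-separated table whose last column, `significant`, is `yes` or `no`. A minimal Python sketch for pulling out the significant genes follows. The function name `significant_genes` is hypothetical, and the two data rows in the demo are invented purely to show the layout.

```python
import csv
import io


def significant_genes(diff_handle):
    """Yield (gene, log2 fold change, q-value) for rows flagged significant."""
    reader = csv.DictReader(diff_handle, delimiter="\t")
    for row in reader:
        if row["significant"] == "yes":
            yield row["gene"], float(row["log2(fold_change)"]), float(row["q_value"])


if __name__ == "__main__":
    # Invented rows, for illustration only; real input is <cuffdiff_out_dir>/gene_exp.diff.
    header = ("test_id\tgene_id\tgene\tlocus\tsample_1\tsample_2\tstatus\t"
              "value_1\tvalue_2\tlog2(fold_change)\ttest_stat\tp_value\tq_value\tsignificant\n")
    rows = ("XLOC_000001\tXLOC_000001\tgeneA\tchr1:100-200\tM\tF\tOK\t10\t40\t2\t3.5\t0.0001\t0.002\tyes\n"
            "XLOC_000002\tXLOC_000002\tgeneB\tchr1:300-400\tM\tF\tOK\t10\t11\t0.1375\t0.2\t0.8\t0.9\tno\n")
    for gene, log2fc, q_value in significant_genes(io.StringIO(header + rows)):
        print(gene, log2fc, q_value)
```

For a real run, open the file with `open(path, newline="")` and pass the handle in place of the `StringIO` object.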

edgeR
  • htseq-count: for counting reads
      python /data2/htseq/python2/scripts/htseq-count -t <feature_type> <sam_file> <gtf_file>
  • edgeR (R package)
      library(edgeR)
      x <- read.delim("m1.m2.f1.f2.htseq.count", row.names="Symbol")
      group <- factor(c("M", "M", "F", "F"))  # one label per count column
      y <- DGEList(counts=x, group=group)
      y <- calcNormFactors(y)
      design <- model.matrix(~group)  # if you have two traits: ~trait1+trait2
      y <- estimateDisp(y, design)
      fit <- glmQLFit(y, design)
      qlf <- glmQLFTest(fit)
      topTags(qlf)
      write.table(x=qlf$table, file="m1.m2.f1.f2.htseq.count.edgeR")
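htseq-count prints one two-column table (feature, count) per sample to standard output, while the R code reads a single table with a `Symbol` column and one column per sample. The slides do not show how that combined file was made. One possible way is sketched below in Python, assuming each sample's htseq-count output was redirected to its own file. The function name `merge_htseq_counts` and the file names are hypothetical. Summary rows are assumed to start with `__` (for example `__no_feature`); older HTSeq versions may omit that prefix.

```python
import csv


def merge_htseq_counts(count_files, out_path, id_column="Symbol"):
    """Merge per-sample htseq-count files (dict: sample name -> path) into one table."""
    samples = list(count_files)
    table = {}
    for sample, path in count_files.items():
        with open(path, newline="") as handle:
            for row in csv.reader(handle, delimiter="\t"):
                # Skip blank or malformed lines and the "__" summary rows.
                if len(row) != 2 or row[0].startswith("__"):
                    continue
                table.setdefault(row[0], {})[sample] = int(row[1])
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out, delimiter="\t", lineterminator="\n")
        writer.writerow([id_column] + samples)
        for feature in sorted(table):
            # A feature missing from a sample's file is written as 0.
            writer.writerow([feature] + [table[feature].get(s, 0) for s in samples])
    return samples


if __name__ == "__main__":
    import os
    import tempfile

    # Tiny invented inputs, for illustration only.
    tmp = tempfile.mkdtemp()
    inputs = {}
    for name, body in {"m1": "geneA\t5\ngeneB\t0\n__no_feature\t9\n",
                       "f1": "geneA\t7\ngeneB\t3\n__no_feature\t4\n"}.items():
        inputs[name] = os.path.join(tmp, name + ".count")
        with open(inputs[name], "w") as fh:
            fh.write(body)
    merged = os.path.join(tmp, "m1.f1.htseq.count")
    merge_htseq_counts(inputs, merged)
    print(open(merged).read())
```

The column order follows the order of the dictionary, so the `group` factor in R must list its labels in that same order.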

GFOLD
  • Read count
      gfold count -ann <gtf_file> -tag <samfile> -o sample1.read_cnt
  • gfold diff -s1 sample1 -s2 sample2 -suf <suffix> -o <outfile.name>
  • Ex)
      gfold diff -s1 sample1,sample2,sample3 -s2 sample4,sample5,sample6 -suf .read_cnt -o 123VS456.diff
      gfold diff -s1 sample1,sample2 -s2 sample3 -suf .read_cnt -o 12VS3.diff
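The `.diff` file written by `gfold diff` can be screened with a short script. The Python sketch below rests on an assumption about the output layout that should be checked against the header of your own file: tab-separated columns in the order GeneSymbol, GeneName, GFOLD value, E-FDR, log2fdc, 1stRPKM, 2ndRPKM, with comment lines starting with `#`. It also assumes that a GFOLD value of 0 means no call of differential expression. The function name `read_gfold_diff` is hypothetical and the demo rows are invented.

```python
import io


def read_gfold_diff(handle):
    """Return (gene symbol, GFOLD value, log2 fold change) for rows with non-zero GFOLD."""
    hits = []
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip comment/header and blank lines
        fields = line.rstrip("\n").split("\t")
        symbol, gfold, log2fdc = fields[0], float(fields[2]), float(fields[4])
        if gfold != 0:
            hits.append((symbol, gfold, log2fdc))
    # Largest absolute GFOLD value first.
    return sorted(hits, key=lambda hit: abs(hit[1]), reverse=True)


if __name__ == "__main__":
    # Invented rows in the assumed column layout, for illustration only.
    demo = ("# GeneSymbol\tGeneName\tGFOLD(0.01)\tE-FDR\tlog2fdc\t1stRPKM\t2ndRPKM\n"
            "geneA\tNA\t1.5\t1\t2.1\t3.0\t12.0\n"
            "geneB\tNA\t0\t1\t0.2\t5.0\t5.5\n"
            "geneC\tNA\t-2.25\t1\t-3.0\t16.0\t2.0\n")
    for hit in read_gfold_diff(io.StringIO(demo)):
        print(*hit)
```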