Korean Genome Sequencing and Analysis
UNIST Jong Bhak 20161007 jongbhak@gmail.com
Acknowledgement
• All the genomics related people in Korea
– Dr. Kim Sungjin, Dr. Seo Jungsun, Dr. Yoo Hyangsook, Dr. Park HongSuk, Dr. Shin Hyungdoo, Dr. Lee Jongeon, Dr. Kim Hyungrae, Dr. Kim Sangsoo, …
• Biologists who register their data with us
• Government and public support
• Tax payers
• My great colleagues in KOBIC who are passionate, professional, and honest.
• Gacheon Med. School, LCDi (Lee Gilya Cancer Diabetes Inst.): Dr. Kim Sungjin, Dr. Ahn Sungmin
• DDBJ NIG researchers and JCK training course organizers
Jong Bhak, under openfree BioLicense
BioDisclaimer
• The content of these slides was produced by Jong after shamelessly stealing other people’s copyrighted ideas and knowledge.
• Everyone is welcome to take the slides in part or in whole without any permission whatsoever.
• Everything here is under (♡) BioLicense
YH genome China (Nov. 2008)
Five Personal Genomes by 2008
• Craig Venter
• James Watson
• YH Chinese
• Yoruban (Anonymous, Nigeria, Africa)
• Kim Sung Jin, Korean Genome
* Genome data fully available in public sources
Cost
• NCBI reference genome: 3,000 million USD
• Craig Venter: 100 million USD
• James Watson: 1 million USD
• YH Chinese: 0.5 million USD
• Nigerian African: 0.25 million USD (Illumina)
• Kim Sung Jin: 0.25 million USD
• Complete Genomics: 0.005 million USD
• 2010: 0.001 million USD
• 2012: $100 USD?
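The cost figures above are easier to compare as fold reductions against the reference genome. The following is a minimal sketch, not part of the original slides: the dictionary simply copies the slide's numbers (in millions of USD), and the helper name `fold_reduction` is a hypothetical name chosen for illustration.

```python
# Costs in millions of USD, copied from the slide's list.
costs_musd = {
    "NCBI reference genome": 3000.0,
    "Craig Venter": 100.0,
    "James Watson": 1.0,
    "YH Chinese": 0.5,
    "Nigerian African (Illumina)": 0.25,
    "Kim Sung Jin": 0.25,
    "Complete Genomics": 0.005,
}


def fold_reduction(baseline, cost):
    """How many times cheaper `cost` is than `baseline` (same units)."""
    return baseline / cost


baseline = costs_musd["NCBI reference genome"]
for name, cost in costs_musd.items():
    usd = cost * 1_000_000
    ratio = fold_reduction(baseline, cost)
    print(f"{name}: {usd:,.0f} USD, about {ratio:,.0f}x cheaper than the reference")
```

On these numbers, the Korean genome (0.25 million USD) works out to a 12,000-fold reduction relative to the reference genome.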
$5,000 genome: 2009, Complete Genomics Inc.
Ome versus Omics graph
[Figure: cost per person in $ (axis marks: $3,000,000, $50,000, 0) against year (axis marks: 2003, 2016), with the Ome and Omics balance point indicated.]
GenOme and GenOmics Balance point
• It is time to do fusion research in the lab, employing bioinformaticians
• It is time for companies to invest in informatics software
• It is time for IT companies to expand into the biotechnology business
• It is time for educational institutes to teach programming in specific biological fields
• For example:
– Genome vs Genomics: $5,000 sequencing at $5,000 analysis cost (software, PC)
– Proteome vs Proteomics: $50,000 mass spec at $50,000 analysis cost
Free Genomics & Public Genomics
• Government pays the cost of Omics
• The public pays the cost of Genomics
• The UN helps poor countries
The Korean Genome
Reference Genome
• Reference Data
• Reference Standard
– http://referencegenome.org
• Each ethnic group needs it.
• Korean Reference Genome Construction
– KOBIC: Dr. Kim SungJin
The first Korean Genome (KOREF)
• First analyzed by Gacheon medical school LCDI and KOBIC, KRIBB in 2008
• First annotated and made public on 4th Dec. 2008 (through web and ftp)
• To be used as the first National Reference Genome
• SNPs, CNVs, and indels were analysed
• An automated phenotypic association study was done
• Non-synonymous SNP analysis
• Phylogenetic study of mtDNA, Y chromosome, and autosomes showed the Korean relationship to Chinese and Japanese.
• First intra-Asian genome comparison (Chinese and Korean)
• Analyzed at 7.8, 17.3, 23.5 and 28x fold coverage
• By Jan., the 23.5-fold sequence had been analyzed
• Openfreely available from: http://koreagenome.org
Korean Full Genome Statistics
• Number of reads: 1,532,333,844
• Total bases: 67.15 gigabases (67,156,812,176)
• Number of mapped reads: 1,432,388,634
• Mapped bases: 59.8 giga base pairs (60,260,410,000)
• NCBI reference genomic coverage: 99.89%
• Sequence production coverage: 23.5X
• Agreement between genome sequencing and DNA chip: 99.44% (homozygous 99.81%, heterozygous 98.18%)
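The statistics above can be cross-checked with a little arithmetic. The sketch below is not from the original slides. It assumes that "sequence production coverage" means total sequenced bases divided by the reference length, which the slide does not state, and the function names are hypothetical.

```python
# Figures copied from the statistics slide.
total_reads = 1_532_333_844
total_bases = 67_156_812_176
mapped_reads = 1_432_388_634
reported_fold = 23.5  # "Sequence production coverage"


def mean_read_length(bases, reads):
    """Average number of bases per read."""
    return bases / reads


def mapped_fraction(mapped, total):
    """Share of reads that mapped to the reference."""
    return mapped / total


def implied_reference_length(bases, fold):
    """Reference length implied by fold = bases / length (an assumed definition)."""
    return bases / fold


print(f"Mean read length: {mean_read_length(total_bases, total_reads):.1f} bases")
print(f"Mapped reads: {mapped_fraction(mapped_reads, total_reads):.2%}")
print(f"Implied reference length: {implied_reference_length(total_bases, reported_fold):,.0f} bases")
```

With the slide's numbers, a little over 93% of reads map to the reference, and the implied reference length is just under 2.9 billion bases, which fits the assumed definition of fold coverage.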
Variation and Variomics
Classification and number of intragenic SNPs
Comparison of SNPs among KOREF, HuRef (Venter), and YH (China)
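A comparison of this kind comes down to set operations on each genome's SNP calls. The following is a minimal sketch rather than the method used for the slide: the function name `shared_and_private` is hypothetical, and the SNP identifiers are made-up placeholders, not real KOREF, HuRef, or YH data.

```python
def shared_and_private(snp_sets):
    """Return SNPs found in every genome, and SNPs unique to each genome."""
    names = list(snp_sets)
    shared = set.intersection(*(snp_sets[n] for n in names))
    private = {}
    for name in names:
        # Union of the SNPs seen in all the other genomes.
        others = set().union(*(snp_sets[o] for o in names if o != name))
        private[name] = snp_sets[name] - others
    return shared, private


# Hypothetical toy identifiers, for illustration only.
toy_calls = {
    "KOREF": {"snpA", "snpB", "snpC"},
    "HuRef": {"snpB", "snpC", "snpD"},
    "YH": {"snpC", "snpE"},
}

shared, private = shared_and_private(toy_calls)
print("Shared by all:", sorted(shared))
for genome, snps in private.items():
    print(f"Only in {genome}:", sorted(snps))
```

In practice each set would hold a key such as (chromosome, position, allele) taken from a variant file, so that identical variants compare equal across genomes.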
Ethnic Phylogeny
Pan Asia SNP
Homo- and heterozygous deletions in KOREF genome. (A) Homozygous 2.3 kb genomic deletion and (B) heterozygous 5 kb genomic deletion.
Personalized Medicine
• For people, medicine is personalized. That is the very purpose of medicine (stethoscope)
– Public health and Public genomics
• Precise Medicine: Diagnose/treat disease more precisely?
• Accuracy(te) Medicine???
Genome is a reference of omes
A Personal Reference Genome is necessary: Personal transcriptome, proteome, metabolome, epigenome, psychome, neurome, behaviorome, …