The Myth of Junk DNA Dr Raymond G

- Slides: 19
The Myth of Junk DNA. Dr. Raymond G. Bohlin. Fellow, Discovery Institute. Probe Ministries.
Non-Protein-Coding DNA. 2001 – 65,000 mRNAs, but only 4% from exons. 2002 – ENCODE found 11,655 non-protein-coding RNAs. 2005 – most of mammalian DNA is transcribed. 2008 – both strands are used in transcription, frequently from overlapping segments.
Evolutionary predictions. If a sequence is non-functional, then over time the sequence should degrade. If a sequence is functional, then the sequence should be conserved by natural selection.
Non-Protein-Coding DNA. 2005 – non-coding regions hundreds of nucleotides long are identical in humans and mice. Such ultraconserved regions (UCRs) regulate developmentally important functions. This is not expected by evolution!
Introns are not just inert spacers between exons. 2005 – intronic sequence is highly conserved between humans, mice, rats, dogs and chickens, so it is likely functional. The mammalian thyroid receptor gene produces two variant proteins with opposite effects; the splicing is regulated by an intron.
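The idea of a stretch that is "identical" between two species can be made concrete with a small Python sketch. It assumes two sequences that have already been aligned to the same length; the function name `identical_runs`, the length threshold, and the toy sequences are hypothetical illustrations, not data from the 2005 study.

```python
def identical_runs(seq_a, seq_b, min_len):
    """Return half-open (start, end) intervals where two aligned,
    equal-length sequences match exactly for at least min_len positions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    runs = []
    start = None  # start index of the current run of matches, if any
    for i, (a, b) in enumerate(zip(seq_a, seq_b)):
        if a == b and a != "-":  # alignment gaps never count as matches
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    # Close a run that extends to the end of the alignment.
    if start is not None and len(seq_a) - start >= min_len:
        runs.append((start, len(seq_a)))
    return runs


# Hypothetical toy alignment (real ultraconserved regions span hundreds of bases).
human = "ACGTACGTTTGACCA"
mouse = "ACGTACGTATGACCT"
print(identical_runs(human, mouse, 5))  # [(0, 8), (9, 14)]
```

Raising `min_len` keeps only the longer identical stretches, which is the sense in which a region hundreds of nucleotides long stands out.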
Co-expressed loci are clustered together in the nucleus, sometimes to “create” genes. [Figure labels: nuclear compartment with concentrated transcription factors; chromosome 5 loop; chromosome 21 loop; chromosome 2 loop.]
Pseudogenes. A pseudogene is a gene that closely resembles a functional gene but appears to be a useless leftover. Pseudogenes as defined above would be predicted by evolution but are difficult to explain under ID. The human genome may have as many as 2,000 pseudogenes.
Pseudogenes. Some pseudogenes appear to suppress expression of the functional gene: the pseudogene can be transcribed, and this transcript binds to the mRNA sequence of the functional gene, thus blocking translation (“RNA interference”). Other transcribed pseudogenes serve as “perfect decoys” for RNA-degrading enzymes, thus enhancing expression.
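The binding step described on this slide rests on base complementarity: an antisense transcript pairs with the stretch of mRNA whose reverse complement it is. A minimal Python sketch of that check follows. It assumes perfect Watson-Crick pairing only (real RNA duplexes tolerate mismatches and G-U wobble pairs), and the sequences and function names are hypothetical.

```python
# Watson-Crick pairing for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def antisense_binding_site(mrna, antisense):
    """Return the index in mrna where antisense would pair perfectly,
    or -1 if there is no perfectly complementary stretch."""
    return mrna.find(reverse_complement(antisense))


# Hypothetical toy sequences: the antisense transcript is built from
# positions 3..8 of the mRNA, so it should pair back at index 3.
mrna = "AUGGCUACGUUAGC"
antisense = reverse_complement(mrna[3:9])

print(antisense)                                # CGUAGC
print(antisense_binding_site(mrna, antisense))  # 3
```

A transcript with no complementary stretch returns -1, which corresponds to the case where no pairing, and so no blocking of translation, would be expected.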
Repetitive Sequences. About half of the mammalian genome consists of various types of repetitive sequences: Long Interspersed Nuclear Elements (LINEs), Short Interspersed Nuclear Elements (SINEs), and Endogenous Retroviruses (ERVs).
Overview of LINEs. LINEs and SINEs have different structural arrangements. The major LINE in the human genome is L1. This sequence: is found throughout Mammalia but is largely taxon-specific; is variously truncated at the 5′ end, ranging from 6-8 kb to a few hundred bp in length; has a biased chromosomal distribution (AT-rich chromosome bands and the X chromosome). [Diagram labels: ORF1; ORF2: reverse transcriptase and endonuclease; G-dense Pu:Py element (A-rich ‘tail’); species-specific regulatory region; 3′ UTR (A-rich ‘tail’).]
Chimp- vs. Human-Specific L1s*. L1Hs(Ta) elements: 0 chimp, 271 human. L1 non-Ta elements: 210 chimp, 252 human. L1Pa2 elements: 476 chimp, 490 human. Timescale: 5-6 million years ago. *Mills, R. E. et al. 2006. Recently mobilized transposons in the human and chimpanzee genomes. Am. J. Hum. Genet. 78: 671-679.
Remember the layout of a mammalian gene? Many human gene folders are bordered by species-specific repertoires of L1s. [Figure labels: RNA outputs; L1s; “Gene” 1 through “Gene” 5.]
Almost forty percent of human nuclear matrix attachment elements are L1 sequences.
Overview of SINEs. The major SINE in the human genome is Alu. Unlike LINE-1, Alu (and other SINEs) do not encode enzymes for their mobilization. This sequence: is primate-specific, with subfamilies distributed in a taxonomically hierarchical manner (same with LINE-1); is ~300 bp in length and consists largely of two monomers (with sequence differences); has a biased genomic distribution (GC-rich chromosome bands). [Diagram labels: Monomer A; central A-stretch; Monomer B; 31 bp insert; A-rich ‘tail’.]
Chimp- vs. Human-Specific SINEs*. Other Alu elements: 233 chimp, 1,167 human. AluS elements: 50 chimp, 263 human. AluYa5 elements: 10 chimp, 1,709 human. AluYb8 elements: 9 chimp, 1,290 human. AluY elements: 360 chimp, 484 human. AluYc1 elements: 979 chimp, 356 human. AluYg6 elements: 1 chimp, 261 human. SVA (SINE) elements: 396 chimp, 864 human. Timescale: 5-6 million years ago. *Mills, R. E. et al. 2006. Recently mobilized transposons in the human and chimpanzee genomes. Am. J. Hum. Genet. 78: 671-679.
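The per-family counts on this slide can be tallied per species. The Python sketch below assumes that each family's two counts read as chimp-specific first and human-specific second, which is our reading of the flattened slide layout rather than something the slide states explicitly; the variable names are ours.

```python
# (chimp-specific, human-specific) insertion counts as read from the slide.
# The column assignment is an assumption about the slide's layout.
counts = {
    "other Alu": (233, 1167),
    "AluS": (50, 263),
    "AluYa5": (10, 1709),
    "AluYb8": (9, 1290),
    "AluY": (360, 484),
    "AluYc1": (979, 356),
    "AluYg6": (1, 261),
    "SVA": (396, 864),
}

# Total the Alu families separately from SVA.
alu = {name: pair for name, pair in counts.items() if name != "SVA"}
chimp_alu = sum(chimp for chimp, _ in alu.values())
human_alu = sum(human for _, human in alu.values())

# Families with more chimp-specific than human-specific insertions.
chimp_biased = [name for name, (chimp, human) in counts.items() if chimp > human]

print(chimp_alu, human_alu)  # 1642 5530
print(chimp_biased)          # ['AluYc1']
```

Under this reading, the human lineage carries roughly three times as many lineage-specific Alu insertions as the chimp lineage, with AluYc1 the only family on the slide where the chimp count is larger.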
Any seemingly random aspect of chromosome sequence arrangement is not. A case in point involves endogenous retroviruses (ERVs): A. Human ERVs contribute 51,197 promoter elements that initiate transcription at various stages (Conley et al., Bioinformatics 24: 1563-1567, 2008). B. Mouse ERVs are highly expressed at the 2-cell embryo stage (and are the earliest to be transcribed in the zygote) and are essential for ontogenesis (Kigami et al., Biology of Reproduction 68: 651-654, 2003).
ERVs. In humans, ERVs help regulate blood cell production and fat metabolism. ERVs also regulate gene expression in the gastrointestinal tract, mammary glands, and testes. The ERV-derived protein syncytin is required for the fusion of fetal and maternal cells in the placenta.
Although less than 2% of genomic DNA in many vertebrates (e.g., mammals) can be placed in the traditional “gene” category, nearly all sequences are transcribed in a cell- and tissue-specific manner.
DNA as Computer. Information carried by DNA is bidirectional, multi-layered, and interleaved. Repetitive elements format and punctuate the information at different scales. Cells can write codes onto non-coding DNA, so phenotype is not always equal to genotype: “metaprogramming” – Cornell Conf.