Presence/Absence Variation (PAVs) in Maize: Phenotypic Variation, Heterosis, and Domestication (with some final comments about cassava). GCP21-II, Kampala, Uganda, 22 June 2012. Patrick S. Schnable: Iowa State University; China Agricultural University; Data2Bio, LLC
The $30M B73 Maize Genome Sequencing Project. Schnable, Ware et al., 2009. The Maize Genome Sequencing Project, Rick Wilson, PI
Genome Projects are Analogous to the Lewis & Clark Expedition • Expensive and require extensive planning/coordination • Generate lots of information that requires subsequent analysis • Exploration of the unknown; expect surprises
Outline • CNV and Presence-Absence Variation (PAVs) • The origin of “recurrent de novo CNV” • Revisit the domestication bottleneck in light of SV • Relevance to cassava?
Outline • CNV and Presence-Absence Variation (PAVs) • The origin of “recurrent de novo CNV” • Revisit the domestication bottleneck in light of SV • Relevance to cassava? Kai Ying; Yan Fu (傅延)
Structural Variation (CNV & PAV) • What is the overall level of (genic) SV? • Does SV contribute to phenotypic diversity?
Array-based Comparative Genome Hybridization (CGH) • NimbleGen’s HD2 array (~2.1 M probes) • Probes designed using a “frequency masked” 200 bp tile-path through the draft B73 genome sequence • Genotypes: B73, Mo17 (different heterotic groups). Springer et al., PLoS Genetics, 2009
Several hundred intact, expressed, phylogenetically conserved genes exhibit CNVs and PAVs. [Figure: segmentation results.] Springer et al., PLoS Genetics, 2009; Beló A et al., Theor Appl Genet, 2010 (Rafalski Lab)
2 Mb deletion on chromosome 6* *Includes ~2 dozen genes, including a resistance gene (Xu Mingliang, 徐明良); this large deletion was also identified by Antoni Rafalski (CGH) and Ed Buckler (re-sequencing)
CNV and PAV loss (blue) and CNV gain (red) intervals relative to B73. Outer to inner rings: teosinte vs. B73; Tx303 vs. B73; Hp301 vs. B73; Mo17 vs. B73. ~10,000 YBP
Re-sequencing Six Inbreds Identified PAVs • 5.4X coverage/inbred • ~150 genes present among six inbreds are missing from B73. Lai et al., 2010
Classical Models for Heterosis • Complementation: AA bb × aa BB → Aa Bb • Over-dominance (Zamir) • Complementation of PAVs in pairs of inbreds could contribute to heterosis; PAVs could also play a role in over-dominance
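The complementation idea can be illustrated with a small sketch. It assumes, for illustration only, that each inbred is described by the set of genes it carries; the gene names and the two helper functions are hypothetical and do not come from the talk.

```python
# Hypothetical gene-presence sets for two inbred parents (illustrative names only).
parent_1 = {"geneA", "geneC", "geneD"}
parent_2 = {"geneB", "geneC", "geneD"}

def hybrid_gene_complement(p1, p2):
    """An F1 hybrid receives one haplotype from each parent, so it carries
    every gene that is present in at least one parent (set union)."""
    return p1 | p2

def complemented_pavs(p1, p2):
    """Genes present in exactly one parent: the PAVs the hybrid complements."""
    return p1 ^ p2

f1 = hybrid_gene_complement(parent_1, parent_2)
pavs = complemented_pavs(parent_1, parent_2)

print(sorted(f1))    # gene complement of the hybrid
print(sorted(pavs))  # PAVs complemented in the hybrid
```

Under this toy model the hybrid carries more genes than either parent, which is the sense in which PAV complementation could contribute to heterosis.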
Deletions can be favorable • Removal of traits lost during domestication • Ion uptake machinery (heavy metal resistance in wheat) • Cyanide release (chemical defense) in white clover (Olsen et al., 2007; 2008) • Favorable rice QTL are in some cases PAVs: – qPE9-1, panicle erectness; Zhou et al. (2009) Genetics – semi-dwarf 1, height; Ashikari et al. (2005) Science – GW5, grain width; Weng et al. (2008) Cell Res
Outline • CNV and Presence-Absence Variation (PAVs) • The origin of “recurrent de novo CNV” • Revisit the domestication bottleneck in light of SV • Relevance to cassava? Sanzhen Liu (刘三震)
Detection of De Novo CNV in Human Trios: Mom (no CNV) × Dad (no CNV) → “Kid” (de novo CNV). Detection of Recurrent De Novo CNV in Human Trios: Mom (no CNV) × Dad (no CNV) → “Kid” (de novo CNV)
Novel CGH Patterns. Liu et al., Plant J, 2012
Reciprocal Gene Loss Model to explain speciation • Lynch and Force, 2000
Segregation of Non-Allelic Homologs (SNH) Generates “Recurrent De Novo CNV”. Model predicts: • Changes in “gene complement” among RILs (gains and losses) • Should affect multiple RILs • Affected genes should have non-allelic positions in B73 and Mo17
Segregation of Non-Allelic Homologs Generates “Recurrent De Novo CNV” and Novel Phenotypes. Consistent with model: • Losses and gains in gene content validated by SeqCapture experiments (~200 segments in 2 RILs) • Specific losses (as detected by PCR) observed in multiple RILs (12/14 genes lost in 25% or 12.5% of RILs) • Affected genes are in non-allelic positions in the B73 and Mo17 genomes • Inbreds can have different gene complements than parents • Strong statistical support for association between gene loss and yield component traits in IBM RILs
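The 25% and 12.5% loss frequencies are what simple Mendelian bookkeeping gives under the SNH model. The sketch below is an illustration under stated assumptions, not the analysis from the talk: loci are unlinked, every RIL is fully homozygous, and each locus is fixed for the B73 or Mo17 haplotype with probability 1/2. The function name and the three-locus arrangement are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def ril_copy_number_distribution(loci):
    """loci: one (copies_on_B73_haplotype, copies_on_Mo17_haplotype) pair per locus.
    Returns {copy_number: fraction of RILs}, assuming unlinked loci and fully
    homozygous RILs (each locus fixed for either parent with probability 1/2)."""
    dist = Counter()
    weight = Fraction(1, 2 ** len(loci))
    # Enumerate every combination of parental haplotypes: 0 = B73, 1 = Mo17.
    for choice in product((0, 1), repeat=len(loci)):
        total = sum(locus[parent] for locus, parent in zip(loci, choice))
        dist[total] += weight
    return dict(dist)

# Gene at position A in B73 only, and at non-allelic position B in Mo17 only.
two_locus = ril_copy_number_distribution([(1, 0), (0, 1)])

# Assumed arrangement: one position in B73, two non-allelic positions in Mo17.
three_locus = ril_copy_number_distribution([(1, 0), (0, 1), (0, 1)])

print(two_locus[0], two_locus[2])  # fraction of RILs with a loss, with a gain
print(three_locus[0])              # fraction of RILs with a loss
```

With two non-allelic positions, one quarter of RILs lose the gene and one quarter gain a copy. The three-position arrangement is one way (an assumption here) to obtain a loss in one eighth of RILs.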
Association of Gene Loss with Traits • Losses of 2/14 (14.3%) tested segments are significantly associated with phenotypic variation: – Reduced yield component traits (adjusted p-values = 0.03 and 0.01). – Increased tiller number (adjusted p-value = 0.01). • This rate (14.3%) is substantially higher than the 0.1% (N=670) of 515,620 control pairs of unlinked SNP markers that similarly exhibit associations with the same set of traits, identified via a two-dimensional genome-wide scan using a set of 1,016 SNP markers
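The figures on this slide can be checked with a few lines of arithmetic. The sketch uses only numbers quoted above; the variable names are illustrative. Note that 515,620 is exactly the number of unordered pairs that can be formed from 1,016 markers.

```python
from math import comb

n_snps = 1016
control_pairs = comb(n_snps, 2)   # unordered pairs among 1,016 SNP markers
associated_pairs = 670            # control pairs showing an association

tested_segments = 14
associated_segments = 2           # lost segments associated with a trait

loss_rate = associated_segments / tested_segments
control_rate = associated_pairs / control_pairs
fold_enrichment = loss_rate / control_rate

print(control_pairs)
print(f"{loss_rate:.1%} vs {control_rate:.2%}")
print(round(fold_enrichment))
```

The gene-loss rate is roughly two orders of magnitude above the control rate. This is a descriptive comparison only; it is not the statistical test used in the study.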
Outline • CNV and Presence-Absence Variation (PAVs) • The origin of “recurrent de novo CNV” • Revisit the domestication bottleneck in light of SV • Relevance to cassava? Kai Ying; Camile Rustenholz
Teosinte, the wild ancestor of maize. Teosinte (Zea mays ssp. parviglumis) → domestication ~10,000 years ago → maize (Zea mays ssp. mays)
Genes Selected During Domestication Have a Molecular Signature. Yamasaki, M., et al. Plant Cell 2005; 17: 2859-2872. Copyright © 2005 American Society of Plant Biologists
CNV and PAV loss (blue) and CNV gain (red) intervals relative to B73. Outer to inner rings: teosinte vs. B73; Tx303 vs. B73; Hp301 vs. B73; Mo17 vs. B73. ~10,000 YBP
Hypothesis: maize lacks some teosinte genes. [Diagram: maize inbreds (B73, Mo17, Tx303, CML277, Oh7B) and teosinte accessions (Teo. 1 to Teo. 5), separated by domestication and crop improvement. Gene categories: genes conserved in maize and teosinte; non-B73 genes; non-maize genes.]
1,000s of expressed genes in teosinte are missing from the B73 reference genome • Material: B73 × teosinte Ac3662 and B73 × teosinte Ac3660 F1s; teosinte Ac3662 • RNA-Seq: 190 M reads, 14 Gb • ABySS assembly (≥ 300 bp): 63,464 contigs • Alignment against B73 reference genome v2 (≥ 90% identity, ≥ 50% coverage) • 59,690 contigs aligned with B73 reference genome • 3,774 contigs NOT aligned with B73 reference genome
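The filtering step in this workflow can be sketched as follows. This is a minimal illustration, not the pipeline that was actually run: it assumes each contig has already been reduced to a best hit with a known identity and aligned length, and the `ContigHit` record, the function names, and the toy data are all hypothetical. Only the thresholds (≥ 300 bp, ≥ 90% identity, ≥ 50% coverage) and the contig counts come from the slide.

```python
from dataclasses import dataclass

@dataclass
class ContigHit:
    contig_id: str
    contig_length: int   # contig length in bp
    aligned_length: int  # bp of the contig covered by its best hit (0 if no hit)
    identity: float      # fraction of identical bases within the hit (0.0 if no hit)

def passes_alignment_filter(hit, min_identity=0.90, min_coverage=0.50):
    """True if the best hit meets both the identity and the coverage threshold."""
    coverage = hit.aligned_length / hit.contig_length
    return hit.identity >= min_identity and coverage >= min_coverage

def partition_contigs(hits, min_length=300):
    """Drop contigs shorter than min_length, then split the rest into
    aligned and not-aligned contig IDs."""
    kept = [h for h in hits if h.contig_length >= min_length]
    aligned = [h.contig_id for h in kept if passes_alignment_filter(h)]
    not_aligned = [h.contig_id for h in kept if not passes_alignment_filter(h)]
    return aligned, not_aligned

# Toy data (hypothetical), one case per outcome.
toy_hits = [
    ContigHit("c1", 1200, 1100, 0.98),  # passes both thresholds
    ContigHit("c2", 800, 300, 0.99),    # coverage 0.375, below 0.50
    ContigHit("c3", 500, 450, 0.85),    # identity below 0.90
    ContigHit("c4", 250, 250, 1.00),    # shorter than 300 bp, dropped
    ContigHit("c5", 900, 0, 0.0),       # no hit at all
]
aligned, not_aligned = partition_contigs(toy_hits)
print(aligned, not_aligned)

# Consistency check on the counts reported on the slide.
slide_total, slide_aligned, slide_not_aligned = 63_464, 59_690, 3_774
print(slide_aligned + slide_not_aligned == slide_total)
```

The contigs in the not-aligned bin are the candidates for teosinte genes missing from B73; the following slides describe how those candidates were validated.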
Extensive validation (sequence capture & CGH) identified 72 expressed teosinte genes that are absent from all tested (N=92) maize genomes. [Figure labels: 2,836 potential PAVs; CGH array; synthesis; hybridization; 92 diverse maize lines; genotyping; log ratio ≥ 1. Histogram: number of genes vs. number of maize lines where the gene is present.]
Map locations of T+M- PAVs (N = 60/72) • 26 isolated genes • 11 clusters of 2 genes or more
Presence rate of 72 T+M- PAVs in 91 teosinte accessions • Genotyping teosinte diversity panel via PCR. [Bar chart: % of teosinte accessions (0 to 100%) per PAV; categories: Present, Uncertain, Questionable, Absent; groups: low presence rate (≤ 50%) and high presence rate (≥ 75%).]
Random model – low frequency PAVs • 16/72 (22%) of T+M- PAVs are present in <50% of tested teosintes. • Low frequency T+M- PAVs could be lost via random processes, such as drift. [Diagram: teosinte → genetic bottleneck (drift or other random mechanisms) → maize.]
Direct Selection Against T+M- Genes • 49/72 (68%) T+M- PAVs are present in more than 75% of the tested teosintes. • Selection (direct or indirect) can explain the loss of high-frequency T+M- PAVs. • Direct: selection AGAINST a PAV; multiple haplotypes in maize possible (depending on LD in teosinte). [Diagram: teosinte → genetic bottleneck → maize.]
Indirect Selection Against T+M- Genes • 49 T+M- PAVs (68%) are present in more than 75% of the tested teosintes. • Selection (direct or indirect) can explain the loss of high-frequency T+M- PAVs. • Indirect: selection FOR a domestication allele in LD and coupling with the absence of a PAV indirectly selects against the PAV. • 27/49 high-frequency PAVs (55%) co-localize with low diversity regions. [Diagram: teosinte → selective sweep → maize.]
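The grouping used across the last three slides can be written as a small classifier. This is a sketch under assumptions: it uses the thresholds from the figure legend (low ≤ 50%, high ≥ 75%), whereas the text slides say "<50%" and "more than 75%". The function name and the example PAV counts are hypothetical; the tallies 16, 49, 72, and 27 are the ones quoted on the slides.

```python
def presence_class(n_present, n_tested, low=0.50, high=0.75):
    """Classify a PAV by the fraction of teosinte accessions that carry it."""
    rate = n_present / n_tested
    if rate <= low:
        return "low"
    if rate >= high:
        return "high"
    return "intermediate"

# Hypothetical PCR results: accessions (out of 91) in which each PAV is present.
examples = {"pav_x": 20, "pav_y": 60, "pav_z": 85}
classes = {name: presence_class(n, 91) for name, n in examples.items()}
print(classes)

# Tallies quoted on the slides.
total, low_count, high_count, high_in_low_diversity = 72, 16, 49, 27
intermediate_count = total - low_count - high_count
print(round(100 * low_count / total))                   # % low-frequency PAVs
print(round(100 * high_count / total))                  # % high-frequency PAVs
print(round(100 * high_in_low_diversity / high_count))  # % of high-frequency PAVs in low diversity regions
print(intermediate_count)                               # PAVs in neither group
```

The low-frequency group is the one the random model addresses, and the high-frequency group is the one that calls for direct or indirect selection.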
Summary (Part I) • Maize haplotypes exhibit extensive SV (CNV and PAVs) that affects several hundred genes (supported by CGH, PCR, and re-sequencing results: both WGS and exome capture). • SV provides a testable hypothesis for heterosis (potentially making heterosis more predictable). • SV may help explain the extraordinary level of phenotypic diversity in maize. CNVs and PAVs that are not in LD with SNPs could contribute to some of the “missing heritability” in GWAS experiments. • “Recurrent de novo CNVs” can arise via meiotic segregation (SNH model), yielding non-parental gene complements that have phenotypic consequences (transgressive segregation?).
Summary (Part II) • It is widely accepted that allelic diversity is reduced by domestication. We now know that not only alleles but entire genes can be lost during domestication. • ~2,000 expressed genes present in teosinte are missing from the B73 genome; 72 of these genes are missing from all other tested maize lines. • Teosinte genes failed to pass through the domestication bottleneck for a variety of reasons (selection for or against haplotypes, and random processes). • Teosinte genes that were lost inadvertently during domestication may have potential in crop improvement (e.g., biotic and abiotic stress)
Outline • CNV and Presence-Absence Variation (PAVs) • The origin of “recurrent de novo CNV” • Revisit the domestication bottleneck in light of SV • Relevance to cassava?
What About Cassava? • Interesting questions: – How common are PAVs among cassava cultivars? (Steve Rounsley) – Do wild relatives of cassava contain genes that are absent from breeding germplasm? (Wenquan Wang) – Do these missing genes confer agronomically relevant traits (e.g., resistance to biotic or abiotic stresses)? (implications for positional cloning experiments) – Does complementation of key PAVs contribute to heterosis in cassava? (Ismail Rabbi)
Collaborators: Srinivas Aluru, Dan Nettleton, Nathan Springer, Jeffrey Jeddeloh, Jinsheng Lai, Brad Barbazuk. The Maize Genome Sequencing Project. Mike Scanlon (PI, Cornell), Jianming Yu, M. Timmermans, G. Muehlbauer, D. Jannick-Buckner
Sanzhen Liu (刘三震), Kai Ying, Camile Rustenholz, Yan Fu (傅延), Wei Wu (吳薇). China Agricultural University