Prioritization of Avian GO Annotation

  • Slides: 10

Prioritization of Avian GO Annotation

Structural Annotation

Species     Genome build [2]   No. Entrez Genes   No. proteins (NRPD)   % predicted proteins   Proteins/gene
Human       36.3               36,437             415,830               4.91                   11.41
Mouse       37.1               64,018             228,696               9.28                   3.57
Rat [1]     3.4                49,516             108,069               29.99                  2.18
Chicken     2.1                19,979 [3]         31,819 [3]            46.62 [4]              1.59 [5]

NRPD: Non-redundant Protein Database

  1. The rat genome was published only 8 months before the chicken genome, yet rat has 2x as many genes in Entrez Gene and 3x as many proteins.
  2. After two genome builds, 5% of the chicken genomic sequence has still not been assigned to a chromosome, and the mini-chromosomes have not been sequenced.
  3. Chicken genes and proteins are under-represented in public databases.
  4. Of the chicken proteins available from NRPD, almost half are predicted from computational analysis.
  5. On average chicken has fewer than 2 proteins per gene (1.59), so very little is known about isoforms and alternative transcripts of chicken gene products.
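The proteins/gene column and the rat-versus-chicken comparison in note 1 can be checked directly from the counts. The following is a minimal sketch that only redoes that arithmetic; the dictionary layout and function names are illustrative assumptions, not part of the original slides.

```python
# Gene and protein counts copied from the Structural Annotation figures above.
counts = {
    "Human":   {"genes": 36_437, "proteins": 415_830},
    "Mouse":   {"genes": 64_018, "proteins": 228_696},
    "Rat":     {"genes": 49_516, "proteins": 108_069},
    "Chicken": {"genes": 19_979, "proteins": 31_819},
}


def proteins_per_gene(species):
    """Average number of NRPD proteins per Entrez Gene entry."""
    c = counts[species]
    return c["proteins"] / c["genes"]


def fold_difference(a, b, key):
    """How many times larger species a's count is than species b's."""
    return counts[a][key] / counts[b][key]


if __name__ == "__main__":
    for species in counts:
        print(f"{species}: {proteins_per_gene(species):.2f} proteins/gene")
    print(f"Rat vs chicken genes: {fold_difference('Rat', 'Chicken', 'genes'):.1f}x")
    print(f"Rat vs chicken proteins: {fold_difference('Rat', 'Chicken', 'proteins'):.1f}x")
```

The computed ratios reproduce the proteins/gene column, and the rat-to-chicken fold differences come out somewhat above the rounded "2x" and "3x" quoted in note 1.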

Phase 1: “Breadth”

  • 7,478 chicken entries in UniProtKB
      • GOA provides IEA mapping for UniProtKB entries
  • Initial strategy for AgBase biocurators was to add GO to chicken gene products that had none.
  • Since 46% of the chicken proteins in NRPD were predicted, they would have no GO
      • IEA, ISS, ISO….
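The Phase 1 selection step (find gene products that have no GO at all) amounts to a set difference between the known products and the products that appear in any annotation. The sketch below illustrates that idea only; the accessions are hypothetical placeholders, the tuple layout is an assumption, and it is not the AgBase pipeline.

```python
def unannotated(products, annotations):
    """Return the products that have no GO annotation of any kind.

    products:    iterable of gene product accessions
    annotations: iterable of (accession, go_id, evidence_code) tuples
    """
    annotated = {accession for accession, _go_id, _evidence in annotations}
    return sorted(set(products) - annotated)


if __name__ == "__main__":
    # Hypothetical accessions, for illustration only.
    products = ["PROT_A", "PROT_B", "PROT_C"]
    annotations = [
        ("PROT_A", "GO:0005634", "IEA"),  # nucleus, electronic annotation
        ("PROT_A", "GO:0005515", "ISS"),  # protein binding, sequence similarity
    ]
    print(unannotated(products, annotations))
```

Here PROT_B and PROT_C would be the candidates for a first round of "breadth" annotation, since neither appears in any annotation record.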

Functional Annotation

[Figure: bar chart of the % of gene products annotated (y-axis, 0 to 100) for Human, Mouse, Rat, and Chicken. Legend: no GO, computational GO, manual GO; the chart also carries an AgBase label.]

The proportion of chicken gene products with GO is overstated because chicken gene products are under-represented in public databases.

Phase 2: “Depth”

What are the community needs?

GO Annotation of Arrays

  • DelMar 14K, FHCRC, Tgu array
  • 44K Agilent oligo array
  • AIIM array, Affymetrix
  • Should we be focusing on arrays?
  • What arrays should we do?

GO Annotation Priorities?

  • Provide “breadth” of coverage
  • Annotate products represented on arrays
  • Reference Genome targets
  • Subject areas (immunity, nutrition/metabolism, development)
  • Ad hoc as requested