Community Annotation and BioWikis
Dan Bolser (dmb@bioinformatics.org)
Bioinformatics to Systems Biology, October 2010

Presentation overview
• Community annotation
• BioWikis
• Why is it necessary?
• The Wiki Web!
• When does it work?
• Game mechanics?


Community Annotation
Has been driven by two key factors:
• The vast increase in biological data
• The clear success of Wikipedia

BioMoore's Law
Over time:
• Cost per unit of information can be decreased by orders of magnitude.
• Throughput is increased by orders of magnitude.
Fan et al. 2006. Nat Rev Genet.
• Comprehensive disease studies that might require ~1 bn genotypes would now cost only a few million dollars.
• Revolution in human genetics.
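
To make the scale of that claim concrete, here is a rough back-of-the-envelope sketch. The slide only says "a few million dollars", so the figure of 3 million used below is an assumption chosen purely for illustration, not a number from Fan et al.

```python
def cost_per_genotype(total_cost_usd: float, n_genotypes: int) -> float:
    """Return the average cost of a single genotype in US dollars."""
    return total_cost_usd / n_genotypes


# Assumption: "a few million dollars" is taken to be 3 million for illustration.
ASSUMED_STUDY_COST_USD = 3_000_000
# From the slide: a comprehensive study might require ~1 bn genotypes.
GENOTYPES_REQUIRED = 1_000_000_000

unit_cost = cost_per_genotype(ASSUMED_STUDY_COST_USD, GENOTYPES_REQUIRED)
print(f"Roughly {unit_cost * 100:.2f} cents per genotype")
```

Under that assumption, each genotype costs a fraction of a cent, which is the kind of unit cost that makes studies of this size feasible.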

Community Annotation
• Centralised databases can't cope with annotating the influx of data.
• Less investment in more specialised data.
• Fewer people with a stake.
• Specialists more disparate.
• Communities are smaller and more focused.
• Do wikis hold the answer?
• Wikipedia as a model…

The Success of Wikipedia
• Wikipedia is consistently among the top 10 websites in the world (http://www.alexa.com).
  – Google > Facebook > YouTube > Yahoo! > Windows Live > Baidu > Wikipedia > ...
• 200 k edits per day.
• 100 k active users per month.
• WikiProject Molecular and Cellular Biology


Why Wikipedia isn’t always the answer
• Wikipedia is an educational resource.
  – All articles are encyclopaedic in style.
  – Explicitly forbids data from ‘original research’:
    • http://wikipedia.org/wiki/Wikipedia:No_original_research
  – Wikipedia does not publish original research.
  – No tools for analysis, presentation, or collection of ‘biological’ data.
• BioWikis!

BioWikis
Wikis with a biological subject matter, customized for analysis, presentation and collection of specific biological data and biological data types:

Some examples
• WikiPathways
  – Adds specific pathway creation and editing tools to the wiki.
  – Data is exported in standard formats via APIs.
  – Pico AR, Kelder T, van Iersel MP, Hanspers K, Conklin BR, Evelo C. (2008) WikiPathways: Pathway Editing for the People. PLoS Biol 6(7): doi:10.1371/journal.pbio.0060184
  – http://www.wikipathways.org

Some examples
• WikiOpener MediaWiki extension
  – Adds tools like BLAST to the wiki.
  – One of many ‘data extraction extensions’:
    • http://www.mediawiki.org/wiki/Category:Data_extraction_extensions
  – Brohée S, Barriot R, Moreau Y. (2010) Biological Wikis: combining wikis with databases. Bioinformatics. 26(17): 2210
  – http://www.mediawiki.org/wiki/Extension:WikiOpener

Some examples
• PDBWiki
  – Allows the protein structures in the PDB to be tagged with specific annotations.
  – Functions as a bug tracker for users of the PDB.
  – Stehr H, Duarte JM, Lappe M, Bhak J, Bolser DM. (2010) PDBWiki: added value through community annotation of the Protein Data Bank. Database. baq009
  – http://pdbwiki.org

Semantic MediaWiki
• Very powerful and generic MediaWiki extension.
  – Users can contribute structured data via forms using auto-completion.
  – Contributed data can be visualized in a variety of ways.
  – Data can be queried and reports produced.
• All done within the wiki.
• Data is ‘linked’…
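
As an illustration of how data in a Semantic MediaWiki installation can be queried from outside the wiki, the sketch below builds a request URL for the `ask` API module that the extension adds to MediaWiki's `api.php`. The wiki address, the category "Protein structure" and the property "Has resolution" are hypothetical names chosen for the example; a real wiki defines its own. The code only constructs the URL and performs no network request.

```python
from urllib.parse import urlencode


def build_ask_url(api_url, conditions, printouts, limit=10):
    """Build a URL for the Semantic MediaWiki 'ask' API module.

    conditions: selection criteria, e.g. ["Category:Protein structure"]
    printouts:  properties to return for each matching page
    limit:      maximum number of results requested
    """
    # Each condition is wrapped in [[...]], each printout is prefixed
    # with |? as in the inline #ask query syntax.
    query = "".join(f"[[{condition}]]" for condition in conditions)
    query += "".join(f"|?{prop}" for prop in printouts)
    query += f"|limit={limit}"
    params = {"action": "ask", "query": query, "format": "json"}
    return f"{api_url}?{urlencode(params)}"


# Hypothetical wiki endpoint, category and property, for illustration only.
url = build_ask_url(
    "https://wiki.example.org/w/api.php",
    conditions=["Category:Protein structure"],
    printouts=["Has resolution"],
    limit=5,
)
print(url)
```

Fetching that URL from a wiki with the extension installed should return the matching pages and their property values as JSON, which is one way a report could be produced outside the wiki as well as within it.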


When does it work?


When does it work?
• The barrier to annotation is low.
• The annotation provides direct benefit to the user:
  – Functionality
  – Self-promotion
  – Recognition
  – Infrastructure
  – Ease of use
  – Provenance
• These factors often depend on COMMUNITY.

Building a community...
• Activation energy!
• You have to build up a resource before users will contribute!
• Kittur et al. (2007) Power of the few vs. wisdom of the crowd.
  – http://www.parc.com/publication/1749/power-of-the-few-vswisdom-of-the-crowd.html

Game mechanics? (Fun)
• Crowd sourcing
  – Using ‘the crowd’ to do useful work
• Game mechanics
  – Applying Game Mechanics to Functional Software
  – http://www.youtube.com/watch?v=ihUt-163gZI
• Ease of use, robust infrastructure, and recognition of user contributions are encapsulated by the simple idea of making the site ‘fun’.

Recognition
• People work for recognition.
  – In science, this typically comes from publication of peer-reviewed papers.
  – Why contribute to a wiki?
    • Perhaps this will get you a publication?
• Peer review is not just about papers.
  – Contributors to Wikipedia are recognised among their peers!

Recognition
• Alternative models of recognition.
  – Wiki edits are unlikely to impress anyone on a CV, however…
  – Community mailing lists are a great way to network.
    • http://biodatabase.org/index.php/List_of_mailing_lists_for_biologists
  – Recognition can come from contribution to community projects!
    • http://bioinformatics.org/wiki !!!


Conclusions
• The wiki concept is a simple improvement on the original concept of the web.
• Sharing data.
• BioWikis must be fun and attractive for users.
• Structured wikis promise to change our idea of a ‘web database’.
• Read-only databases will be hard to imagine.

Get involved!
• BBB mailing list
• IRC:
  – irc://irc.freenode.net/#bioinformatics
  – irc://irc.freenode.net/#semantic-mediawiki
• Wikis!
  – Wikipedia
  – Bioinformatics.Org
• Email me!

Acknowledgements
• Bifx.Org Directors: Prash, Jeff, ...
• All the contributors to http://bifx.org/wiki
• irc://irc.freenode.net/#bioinformatics: Jeff, Cody, D. Hamel, Prash, Sonny, Chris, B Fristensky, Nagpal, Mariap3636, Pingou, ...
• Linus Torvalds for Linux, and all scientists who pursue their work with honesty and integrity.
• Henning Stehr and Jose Duarte for PDBWiki

References
• Wikinomics: http://www.ncbi.nlm.nih.gov/pubmed/18769412
• EcoliWiki / Gene Wiki / OpenWetWare / PDBWiki / Proteopedia / WikiGenes / WikiPathways / …
  – http://biodatabase.org/index.php/BioWiki
• Bioinformatics.Org wiki: http://bifx.org/wiki
• The SEQanswers wiki: http://SEQwiki.org
• MCB: http://wikipedia.org/wiki/Wikipedia:Project_MCB
• BiO Sites: http://BiO.CC

References
• See references within:
  – http://www.ncbi.nlm.nih.gov/pubmed/20624717
  – http://www.ncbi.nlm.nih.gov/pubmed/20193066
  – http://www.ncbi.nlm.nih.gov/pubmed/18613750
• Semantic MediaWiki:
  – http://semantic-mediawiki.org
  – irc://irc.freenode.net/#semantic-mediawiki