Digital Libraries: 1991-2006 and beyond, with Electronic Theses and Dissertations
University of Sao Paulo, Brazil, 23 August 2006
Edward A. Fox, fox@vt.edu, http://fox.cs.vt.edu
Professor, Department of Computer Science
Director, Digital Library Research Laboratory
Virginia Tech, Blacksburg, VA 24061 USA
USP, Brazil, August 2006
Acknowledgements
• Students
• Faculty, Staff
• Collaborators
• Support
• Mentors
Acknowledgements: Students
• Pavel Calado, Yuxin Chen, Fernando Das Neves, Shahrooz Feizabadi, Robert France, Marcos Gonçalves, Doug Gorton, Nithiwat Kampanya, Rohit Kelapure, S. H. Kim, Neil Kipp, Aaron Krowne, Bing Liu, Ming Luo, Paul Mather, Uma Murthy, Unni Ravindranathan, Ryan Richardson, Rao Shen, Ohm Sornil, Hussein Suleman, Ricardo Torres, Srinivas Vemuri, Wensi Xi, Seungwon Yang, Xiaoyan Yu, Baoping Zhang, Qinwei Zhu, …
Acknowledgements: Faculty, Staff
• Lillian Cassel, Debra Dudley, Roger Ehrich, Joanne Eustis, Weiguo Fan, James Flanagan, C. Lee Giles, Eberhard Hilf, John Impagliazzo, Filip Jagodzinski, Douglas Knight, Deborah Knox, Alberto Laender, Gail McMillan, Claudia Medeiros, Manuel Perez-Quinones, Naren Ramakrishnan, Layne Watson, …
Other Collaborators (Selected)
• Brazil: FUA, UFMG, UNICAMP, USP
• Case Western Reserve University
• Emory, Notre Dame, Oregon State
• Germany: Univ. Oldenburg
• Mexico: UDLA (Puebla), Monterrey
• College of NJ, Hofstra, Penn State, Villanova
• University of Arizona
• University of Florida, Univ. of Illinois
• University of Virginia
• VTLS (slides, services for NDLTD)
Acknowledgements: Support
• Course: UNESCO, CETREDE, IFLA-LAC, AUGM, CLEI, UFC
• Sponsors: ACM, Adobe, AOL, CAPES, CNI, CONACyT, DFG, IBM, Microsoft, NASA, NDLTD, NLM, NSF (IIS-9986089, 0086227, 0080748, 0325579; ITR-0325579; DUE-0121679, 0136690, 0121741, 0333601), OCLC, SOLINET, SUN, SURA, UNESCO, US Dept. Ed. (FIPSE), VTLS
Acknowledgements - Mentors
• JCR Licklider – undergrad advisor (1969-71)
– Author in 1965 of “Libraries of the Future”
– Before, at ARPA, funded start of Internet
• Michael Kessler – BS thesis advisor
– Project TIP (technical information project)
– Defined bibliographic coupling
• Gerard Salton – graduate advisor (1978-83)
– “Father of Information Retrieval”
Overview
• Digital Libraries: Sources, Chatham, Rome, …
• Curriculum Development Project: 5S
• NDLTD (Networked Digital Library of Theses and Dissertations)
• Conclusions
• Challenges
Libraries of the Future
JCR Licklider, 1965, MIT Press
[Figure: levels shown are World, Nation, State, City, Community]
Locating Digital Libraries in Computing and Communications Technology Space
[Figure: axes are Communications (bandwidth, connectivity) and Computing (flops); digital content ranges from less to more; the Digital Libraries technology trajectory is labeled "intellectual access to globally distributed information"]
Note: we should consider 4 dimensions: computing, communications, content, and community (people)
Information Life Cycle
[Figure: cycle labeled Authoring, Modifying, Using, Creating, Retention / Mining, Organizing, Indexing, Accessing, Filtering, Storing, Retrieving, Distributing, Networking]
Sources For More Information
• Magazine: www.dlib.org
• Books: http://fox.cs.vt.edu/DLSB.html (1994, covering since 1991)
– MIT Press: Arms, plus Borgman, Licklider (1965)
– Morgan Kaufmann: Witten... (several), Lesk (2nd edition)
• Conferences
– ECDL: www.ecdl2005.org
– ICADL: www.icadl.org
– JCDL: www.jcdl2006.org
• Associations
– ASIS&T DL SIG
– IEEE TCDL: www.ieee-tcdl.org (student awards, doctoral consortia)
• NSF: www.dli2.nsf.gov
• Labs: VT: www.dlib.vt.edu
DL Terminology
• Digital / electronic / virtual library
• Born digital, hybrid (digital/physical)
• Universal access (all people/places/times)
– Accommodate disabilities (color, visual, auditory)
– Mobile (office, home, laptop, PDA, mobile)
• Archiving, self-archiving
• Open (source, standards, archives)
Digital Libraries Shorten the Chain from
[Figure: chain of Editor, Reviewer, Publisher, A&I, Consolidator, Library]
DLs Shorten the Chain to
[Figure: Author, Teacher, Reader, Editor, Reviewer, Learner, and Librarian connected through a Digital Library]
Reagan Moore, Ed Fox: June 2002 for NSF
Application Domain | Related Institutions | Examples | Technical Challenges | Benefit / Impact
Publishing | Publishers, Eprint archives | OAI | Quality control, openness | Aggregation, organization
Education | Schools, colleges, universities | NSDL, NCSTRL | Knowledge management, reusability | Access to data
Art, Culture | Museum | AMICO, PRDLA | Digitization, describing, cataloging | Global understanding
Science | Government, Academia, Commerce | NVO, PDG, SwissProt, UK eScience, European Union Commission | Data models | Reproducibility, faster reuse, faster advance
(e) Government | Agencies (all levels) | Census | Intellectual property rights, privacy, multi-national | Accountability, homeland security
(e) Commerce, (e) Industry | Legal institutions | Court cases, patents | Developing standards | Standardization, economic development
History, Heritage | Foundations | American Memory | Content, context, interpretation | Long term view, perspective, documentation, recording, facilitating, interpretation, understanding
Cross-cutting | Library, Archive | Web, personal collections | Multi-language, preservation, scalability, interoperability, dynamic behavior, workflow, sustainability, ontologies, distributed data, infrastructure | Reduced cost, increased access, preservation, democratization, leveling, peace, competitiveness
Motivation for Theory, Curriculum
• Digital Libraries (DLs): what are they?
– No definitional consensus
– Conflicting views
– Makes interoperability a hard problem
• DLs are not benefiting from formal theories as are other CS fields: DB, IR, PL, etc.
• DL construction: difficult, ad-hoc, lack of support for tailoring/customization
• Conceptual modeling, requirements analysis, and methodological approaches are rarely supported in DL development.
– Lack of specific DL models, formalisms, languages
DL Definitions - 1
• “A digital library is an organized and focused collection of digital objects, including text, images, video, and audio, along with methods of access and retrieval, and for selection, creation, organization, maintenance, and sharing of the collection.”
• Witten & Bainbridge – “How to Build a Digital Library” – Morgan Kaufmann 2003
DL Definitions - 2
• “Digital libraries are organizations that provide the resources, including the specialized staff, to select, structure, offer intellectual access to, interpret, distribute, preserve the integrity of, and ensure the persistence over time of collections of digital works so that they are readily and economically available for use by a defined community or set of communities”
• Waters, D. J. CLIR Issues, July/August 1998
• www.clir.org/pubs/issues04.html
DL Definitions - 3
• Issues and Spectra
– Collection vs. Institution
– Content vs. System
– Access vs. Preservation
– “Free” vs. Quality
– Managed vs. Comprehensive
– Centralized vs. Distributed
DL Definitions - 4
• NOT a “digitized library”
• NOT a “deconstruction” of existing systems and institutions, moving them to an electronic box in a Library
• IS a new way to deal with knowledge
– Authoring, Self-archiving, Collecting,
– Organizing, Preserving,
– Accessing, Propagating, Re-using
People
• Digital librarians
• DL system developers
• DL system administrators
• DL managers
• DL collection development staff
• DL evaluators
• DL users
DL Manifesto - 1
• DL Reference Model
• In support of the future European Digital Library
• Developed by team connected with DELOS (Candela, Castelli, Ioannidis, Koutrika, Meghini, Pagano, Ross, Schek, Schuldt)
• Draft 2.2 presented in Frascati, near Rome, June 2006 – 79 pages
• Could be integrated with work of DLF, JISC, etc.
DL Manifesto – 2: 3 Tiers
DL Manifesto – 3: Main Concepts
DL Manifesto – 4: Actor Roles
Curriculum Development Project
• Collaborative research launched by:
– Department of Computer Science, Virginia Tech
– School of Information and Library Science, University of North Carolina, Chapel Hill
• Three-year (2006-2008) funded project
Project Teams/NSF Grant
• Project Team at VT (IIS-0535057):
– PI: Dr. Edward A. Fox (fox@vt.edu)
– GRA: Seungwon Yang (seungwon@vt.edu)
• Project Team at UNC-CH (IIS-0535060):
– Co-PI: Dr. Barbara Wildemuth (wildem@ils.unc.edu)
– Co-PI: Dr. Jeffrey Pomerantz (pomerantz@unc.edu)
– GRA: Sanghee Oh (shoh@email.unc.edu)
Project Links
• Homepage: http://curric.dlib.vt.edu/DLcurric.html
- Overview, proposal, progress diary, news & interviews, contact information
• Wiki: http://curric.dlib.vt.edu/wiki
- Resources will be added here
- Coming soon
What We Do:
• Identify, develop and test educational DL modules, guided by
- Experts and international collaborators
- Computing Curriculum 2001
- 5S framework
- Analysis of DL course syllabi
- Development of module template
Taxonomy of DL Educational Resources
Module Template (Draft)
1. Module name
2. Learning objectives
3. Level of effort required (in hours, for students, teachers)
4. Prerequisite knowledge required
5. Remedial materials
6. 5S characterization
7. Relationships with other modules and module topics
8. Resources (books, …)
9. Body of knowledge
• Topical outline
• with resources in context
• Theory and practice
• Learning activities
• Presentation materials
10. Concept maps
11. Exercises/learning activities
12. Evaluation of learning outcomes
13. Glossary
14. Useful links
CC2001 Information Management Areas
IM1. Information models and systems*
IM2. Database systems*
IM3. Data modeling*
IM4. Relational DBs
IM5. Database query languages
IM6. Relational DB design
IM7. Transaction processing
IM8. Distributed DBs
IM9. Physical DB design
IM10. Data mining
IM11. Information storage and retrieval
IM12. Hypertext and hypermedia
IM13. Multimedia information & systems
IM14. Digital libraries
How to organize a DL course?
• Various frameworks
– What, Why, How
– History, Current status, Future (research)
– Economics: open source, sustainability
– Social: users/patrons, management
– Technical: HCI, HT, IR, LIS, Web
• So, we should see what is discussed
• And, we should generalize, so we have a stable framework that is intuitive and formally based
5S Framework
• Developed at Digital Library Research Laboratory (DLRL, Virginia Tech)
• Strong foundation for DL module development
– Intuitive as well as formal definitions
• Base ideas named with five S’s - streams, structures, spaces, scenarios, societies
• Key aspects of DLs precisely defined using one or more of the Ss
• Set of metamodels for classes of DLs: minimal, archeological (ETANA), practical, European DL, …
Informal 5S & DL Definitions
DLs are complex systems that
• help satisfy info needs of users (societies)
• provide info services (scenarios)
• organize info in usable ways (structures)
• present info in usable ways (spaces)
• communicate info with users (streams)
5S Examples
5S Hypotheses
• A formal theory for DLs can be built based on 5S.
• The formalization can serve as a basis for modeling and building high-quality DLs.
5S and DL formal definitions and compositions (April 2004 TOIS)
A Minimal DL in the 5S Framework
[Figure: concept map leading from the 5Ss (Streams, Structures, Spaces, Scenarios, Societies) to a Minimal DL; nodes include Structured Stream, Structural Metadata Specification, Descriptive Metadata Specification, services (indexing, browsing, searching), hypertext, Digital Object, Collection, Metadata Catalog, Repository]
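The minimal DL on this slide can be pictured as a small data structure. The sketch below is only an illustration under simplifying assumptions, not the formal 5S definition: the class and method names (DigitalObject, MinimalDL, add, search, browse) are hypothetical, streams are reduced to plain text, and only the repository, metadata catalog, and the indexing, searching, and browsing services are modeled.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DigitalObject:
    handle: str   # unique identifier (hypothetical naming)
    stream: str   # the content itself, reduced here to plain text


@dataclass
class MinimalDL:
    # Repository: stores the digital objects of the collection by handle.
    repository: dict = field(default_factory=dict)
    # Metadata catalog: one descriptive metadata record per handle.
    catalog: dict = field(default_factory=dict)
    # Inverted index built by the indexing service: term -> set of handles.
    index: dict = field(default_factory=dict)

    def add(self, obj, metadata):
        """Store an object, catalog its metadata, and index its text."""
        self.repository[obj.handle] = obj
        self.catalog[obj.handle] = metadata
        for term in obj.stream.lower().split():
            self.index.setdefault(term, set()).add(obj.handle)

    def search(self, query):
        """Searching service: handles whose text contains every query term."""
        terms = query.lower().split()
        if not terms:
            return []
        hits = set.intersection(*(self.index.get(t, set()) for t in terms))
        return sorted(hits)

    def browse(self, key):
        """Browsing service: handles ordered by one metadata field."""
        return sorted(self.catalog, key=lambda h: self.catalog[h].get(key, ""))


dl = MinimalDL()
dl.add(DigitalObject("etd-1", "digital library theory"), {"title": "5S theory"})
dl.add(DigitalObject("etd-2", "library services"), {"title": "A services study"})
print(dl.search("digital library"))  # only etd-1 contains both terms
```

Societies and spaces are left out of this sketch; in the slide's terms they would correspond to the users calling the services and to the interface or index space in which results are presented.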
Services Taxonomy
[Figure: DL development with the 5S tools. Phases: Requirements (1), Analysis (2), Design (3), Implementation (4). Elements: 5S Meta Model, DL Expert, DL Designer, 5SGraph, 5SL DL Model, component pool (ODLSearch, ODLBrowse, ODLRate, ODLReview, ...), 5SLGen, Tailored DL Services. Roles: Practitioner, Teacher, Researcher. 5SSuite: 5SGraph, 5SGen, Mapping Tool]
DL Topics in 19 Modules (original)
Module Revision 3/27/06
STREAM
1. Collection Development
– Digitization
– Document and E-publishing Markup
– Harvesting
2. Digital objects/Composites/Packages
– Text Resources
– Multimedia streams/structures, Captures/representation, Compression/coding
• Content-based analysis, Multimedia indexing
• Multimedia presentation rendering
STRUCTURE
3. Metadata, Cataloging, Author submission
– Thesauri, Ontologies, Classification, Categorization
– Bibliographic information, Bibliometrics, Citations
4. Architecture (agents, buses, wrappers/mediators), Interoperability
Module Revision 3/27/06
SPACE
5. Spaces (conceptual, geographic, 2/3D, VR)
– Storage
– Repositories, Archives
SCENARIOS
6. Services (searching, linking, browsing, etc.)
– Info needs, Relevance, Evaluation, Effectiveness
– Search & search strategy, Info seeking behavior, User modeling, Feedback
– Routing, Filtering, Community filtering
– Sharing, Networking, Interchange
– Info summarization, Visualization
SOCIETIES
7. Intellectual property rights management, Privacy, Protection (watermarking) (ILS)
8. Social issues / Future DLs
9. Archiving and preservation integrity (ILS)
Ascertaining Priority Topics
• We’ve manually classified and analyzed publications using the 9 modules (revised):
Source | Count
Proceedings: JCDL ’01 – ’05 | 354
Proceedings: ACM DL ’96 – ’00 | 189
Magazine articles: D-Lib ’95 – ’06 | 521
Session titles: JCDL, ACM DL, ECDL | 264
Distribution of Conference Papers across Module Topics
Distribution of D-Lib Magazine Articles across Module Topics
Distribution of Session Titles across Module Topics
Textbook on DLs
• PI Fox, along with co-author Gonçalves, is preparing a textbook on DLs based on 5S
• This work will rely on the 5S framework to ensure that it provides integrated coverage of the many concepts related to DLs
• Fox and Gonçalves are focused on a book for teaching as well as reference
Textbook Outline
• Ch. 1: Introduction (Motivation, Synopsis)
• Part 1 – The “Ss”
– Ch. 2: Streams
– Ch. 3: Structures
– Ch. 4: Spaces
– Ch. 5: Scenarios
– Ch. 6: Societies
Chapter 2 Overview
• Multiple media types and representation
– See ch. 4 for IR (except some here for non-text)
– Standards for each, and for some combinations
• Text
– Character strings, encoding (Unicode)
– Morphology -> Stemming
– Syntax, semantics -> stop words
• Images, Audio, Video, Graphics, Animation
– Capture, digitization, representation
– CBIR for each
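The two text steps named above, stemming and stop-word removal, can be illustrated with a toy example. This is a sketch under stated assumptions: the stop-word list and the three suffix rules are invented for illustration and are far cruder than a real stemmer such as Porter's; the names stem and index_terms are hypothetical.

```python
# Toy stop-word list and suffix rules, chosen only for illustration.
STOP_WORDS = {"a", "an", "and", "of", "the", "in", "for", "to"}
SUFFIXES = ("ing", "es", "s")  # tried in this order


def stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def index_terms(text):
    """Lowercase, split on whitespace, drop stop words, stem the rest."""
    tokens = text.lower().split()
    return [stem(t) for t in tokens if t not in STOP_WORDS]


print(index_terms("Searching the digital libraries"))
# ['search', 'digital', 'librari']
```

The stem "librari" is not a word, which is typical: a stem only needs to be the same string for related word forms so that they share one index entry.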
Ch 3: Structures: Degrees of Structure
[Figure: spectrum from Chaotic (Web) through Organized (DLs) to Structured (DBs)]
Digital Objects (DOs)
• Born digital
• Digitized version of “real” object
– Is the DO version the same, better, or worse?
– Decision for ETDs: structured + rendered
• Surrogate for “real” object
– Not covered explicitly in metamodel for a minimal DL
– Crucial in metamodel for archaeology DL
Metadata Objects (MDOs)
• MARC
• Dublin Core
• RDF
• IMS
• OAI (Open Archives Initiative)
• Crosswalks, mappings
• Ontologies
• Topic maps, concept maps
Complex to Simple
[Figure: MARC ($50) + thesis, contrasted with Dublin Core (DC)]
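Going from the complex format to the simple one is what a crosswalk does. The sketch below shows the idea with a handful of common MARC 21 field/subfield pairs mapped to unqualified Dublin Core elements. It is a partial, illustrative mapping, not a complete or authoritative crosswalk; the function name marc_to_dc, the triple-based record layout, and the sample record are assumptions made for the example.

```python
# Partial mapping: (MARC field tag, subfield code) -> Dublin Core element.
MARC_TO_DC = {
    ("100", "a"): "creator",
    ("245", "a"): "title",
    ("260", "b"): "publisher",
    ("260", "c"): "date",
    ("520", "a"): "description",
    ("650", "a"): "subject",
}


def marc_to_dc(marc_fields):
    """Convert (tag, subfield, value) triples into a dict of DC value lists."""
    dc = {}
    for tag, code, value in marc_fields:
        element = MARC_TO_DC.get((tag, code))
        if element is not None:  # MARC detail outside the map is dropped
            dc.setdefault(element, []).append(value)
    return dc


# Hypothetical record, invented for illustration.
record = [
    ("100", "a", "Doe, Jane"),
    ("245", "a", "A hypothetical thesis title"),
    ("260", "c", "2006"),
    ("300", "a", "120 p."),  # physical description: not in this partial map
    ("650", "a", "Digital libraries"),
]
dc_record = marc_to_dc(record)
print(dc_record["title"])
```

The dropped 300 field shows the cost of simplification: a crosswalk toward the simpler schema is lossy, so the reverse mapping cannot restore the original record.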
Also: Epub, SGML, XML
• 5S perspective: streams, structures, scenarios
• Authoring
• Rendering, presenting
• Tagging, Markup, DOM
• Semi-structured information
• Dual-publishing, eBooks
• Styles (XSL, XSLT)
• Structured queries
Textbook Outline
• Part 2 – Higher DL Constructs
– Ch. 7: Collections
– Ch. 8: Catalogs
– Ch. 9: Repositories and Archives
– Ch. 10: Services
– Ch. 11: Systems
– Ch. 12: Case Studies
What is Fedora™?
Flexible Extensible Digital Object Repository Architecture
• Slides courtesy Vinod Chachra of VTLS
Fedora™ Digital Object Architecture
• Persistent ID (PID): globally unique persistent id
• Disseminators: public view; access methods for obtaining “disseminations” of digital object content
• System Metadata: internal view; metadata necessary to manage the object
• Datastreams: protected view; content that makes up the “basis” of the object
– EAD, TEI, DC, MARC, VRA Core, MIX, etc.
– Images, E-books, E-journals, Music, Video, etc.
The Mellon Fedora Project. Adapted from slide by V. Chachra, VTLS
Fedora™ Repository Web Service Exposure Layer. Adapted from slide by V. Chachra, VTLS
VITAL / Fedora Relationship
OAI - Open Archives Initiative
• Advocacy for interoperability
• Standard for transferring metadata among digital libraries
– Protocol for Metadata Harvesting (PMH)
• Simplicity
• Generality
• Extensibility
• Support for PMH => Open Archive (OA)
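The simplicity of PMH shows in how little a harvester needs: an HTTP request with a verb and a metadata prefix, and an XML parser for the reply. The sketch below builds a ListRecords request and pulls Dublin Core titles out of a response. The base URL, the set name, and the abbreviated response are hypothetical and no network call is made; the helper names list_records_url and titles are likewise invented for the example.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec  # optional selective harvesting
    return base_url + "?" + urlencode(params)


def titles(response_xml):
    """Collect every dc:title found in a ListRecords response."""
    root = ET.fromstring(response_xml)
    return [t.text for t in root.iter("{%s}title" % DC_NS)]


# Hand-written, abbreviated response used in place of a real data provider.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:etd-1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A hypothetical thesis</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

url = list_records_url("http://example.org/oai")
print(url)
print(titles(SAMPLE_RESPONSE))
```

A real harvester would also follow the resumptionToken that a data provider returns when a result list is split across several responses; that step is omitted here.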
OAI = Technical Umbrella for Practical Interoperability…
[Figure: communities under the umbrella: Reference Libraries, Museums, Publishers, E-Print Archives]
…that can be exploited by different communities
OAI – Black Box Perspective
[Figure: open archives OA 1 through OA 7 shown as black boxes]
The World According to OAI
[Figure: Service Providers (Discovery, Current Awareness, Preservation) connected to Data Providers by metadata harvesting]
Textbook Outline
• Part 3 – Advanced Topics
– Ch. 13: Quality
– Ch. 14: Integration
– Ch. 15: How to build a digital library
– Ch. 16: Research Challenges, Future Perspectives
• Appendix
– A: Mathematical preliminaries
– B: Formal Definitions: Ss
– C: Formal Definitions: DL terms, Minimal DL
– D: Formal Definitions: Archeological DL
– E: Glossary of terms, mappings
Quality and the Information Life Cycle
Ellis & Kuhlthau’s Models Mapped to Info Life Cycle
NDLTD
• Networked Digital Library of Theses & Dissertations
• Members
– ~50 full members, ~200 associated members
– International (Australia, Canada, China, Germany, India, Jamaica, Korea, South Africa, Sudan, Taiwan, Turkey, U.K., U.S.A., and many more)
• Over 250K metadata records in Union Catalog
• URL: http://www.ndltd.org
NDLTD Goals
• For Students:
– Gain knowledge and skills for the Information Age, especially about Digital Libraries
– Richer communication (digital information, multimedia, …)
• For Universities:
– Easy way to enter the digital library field and benefit thereby
• For the World:
– Global digital library – large, useful, many services
NDLTD Members - 1
Ball State University; Government of Canada; Brigham Young University; Griffith University; California Institute of Technology; Johns Hopkins University; Consorci de Biblioteques Universitàries de Catalunya; Duke University; Kauno Technologijos Universitetas; Louisiana State University; L'Université du Québec à Rimouski; Georg August Universität Göttingen; McGill University; George Washington University; Georgetown University; Georgia Institute of Technology; New Jersey Institute of Technology; Georgia Southern University; North Carolina Central University; Georgia State University; North Carolina State; Ohio University
NDLTD Members - 2
Oregon State University Library; University of Missouri; Penn State; University of North Carolina - Chapel Hill; Pontifícia Universidade Católica do Rio de Janeiro; University of Pittsburgh; Portugal National Library; University of Pretoria; Rita Chu (individual); University of South Florida; Simon Fraser; University of Tennessee; State of Kansas; University of Waterloo; Texas Tech University; Virginia Tech; Universidad de las Américas, Puebla; West Virginia University Libraries; Universität St. Gallen; Worcester Polytechnic Institute; University of Glasgow; Yale; University of Maine
A Digital Library Case Study
• Domain: graduate education, research
• Genre: ETDs = electronic theses & dissertations
• Submission: http://etd.vt.edu
• Collection: http://www.theses.org
Project: Networked Digital Library of Theses & Dissertations (NDLTD), http://www.ndltd.org
How can a university get involved?
• Select planning/implementation team
– Graduate School
– Library
– Computing / Information Technology
– Institutional Research / Educ. Tech.
• Join online, give us contact names
– www.ndltd.org/join
• Adapt Virginia Tech or other proven approach
– Build interest and consensus
– Start trial / allow optional submission
Student Gets Committee Signatures and Submits ETD
[Figure labels: Signed, Grad School]
Library Catalogs ETD, Access is Opened to the New Research
[Figure labels: WWW, NDLTD]
http://scholar.lib.vt.edu/theses/available/etd-2227102539751141/
ETD Union Collection (OAI)
Union catalog: OCLC
• OCLC will expand OAI data provider on TDs.
• Is getting data from WorldCat (so, from many sites!).
• Will harvest from all others who contact them.
• Need DC and either ETD-MS or MARC.
• Has a set for ETDs.
OCLC SRU Interface
ETD Union Search Mirror Site in China (CALIS) (http://ndltd.calis.edu.cn – popular site!)
VTLS Union Catalog Content Languages
• The VTLS NDLTD Union Catalog has data in 6 different languages. These are:
– English
– German
– Greek
– Korean
– Portuguese
– Spanish
• Examples follow
Language = German; hits = 137
Full record display
ETDs: Library Goals
• Improve library services
– Better turn-around time
– Always available
• Reduce work
– catalog from e-text
– eliminate handling: mailing to ProQuest, bindery prep, check-out, check-in, reshelving, etc.
• Save space
What are we doing?
• Aiding universities to enhance graduate education, publishing and IPR efforts
• Helping improve the availability and content of theses and dissertations
• Educating ALL future scholars so they can publish electronically and effectively use digital libraries (i.e., are Information Literate and can be more expressive)
NDLTD Incorporation
• Networked Digital Library of Theses and Dissertations incorporated May 20, 2003 in Virginia, USA
• Charitable and educational purposes (501(c)(3))
• Officers
– Executive Director (Ed Fox)
– Secretary (Gail McMillan)
– Treasurer (Scott Eldredge)
Board of Directors (2006)
• Suzie Allard (ETD 2004, U. Kentucky)
• Denise A. D. Bedford (World Bank)
• Julia C. Blixrud (ARL, SPARC)
• José Luis Borbinha (Natl Lib Portugal)
• Alex Byrne (ETD 2005, ADT: Australia)
• Tony Cargnelutti (ETD 2005, Australia)
• Vinod Chachra (VTLS)
• William Clark (Ohio State U.)
• Susan Copeland (RGU, UK)
• Jude Edminster (Bowling Green St. U.)
• Scott Eldredge (Treasurer, ETD 2002, BYU)
• Edward A. Fox (Exec Director, Virginia Tech)
• John H. Hagen (West Virginia U.)
• Thomas B. Hickey (OCLC)
• Christine Jewell (U. Waterloo, Canada)
• Joan K. Lippincott (CNI)
• Mike Looney (Adobe)
• Austin McLean (ProQuest)
• Gail McMillan (Secretary, Virginia Tech)
• Joseph Moxley (ETD 2000, USF)
• Eva Müller (U. Uppsala, Sweden)
• Ana Pavani (PUC Rio, Brazil)
• Sharon Reeves (Natl Library Canada)
• Peter Schirmbacher (ETD 2003, Humboldt)
• Hussein Suleman (U. Cape Town, S. Africa)
• Shalini R. Urs (U. Mysore, India)
• Eric F. Van de Velde (ETD 2001, Caltech)
NDLTD Committees (Chairs)
• Awards (John Hagen)
• Conferences (Sharon Reeves)
• Development (Peter Schirmbacher)
• Executive (Edward Fox)
• Finance (Scott Eldredge)
• Implementation (Ana Pavani)
• Membership (Tony Cargnelutti)
• Nominating (Joan Lippincott)
• Standards (Thomas B. Hickey)
• Union Catalog (Vinod Chachra)
Selected Projects / Sponsors
• Australia (ADT)
• Brazil (BDT, IBICT)
• Canada
• Catalunya
• Chile (Cybertesis)
• China (CALIS)
• Germany
• India (Vidyanidhi)
• Korea
• OhioLINK: 79 colleges/univs
• Portugal (National Library)
• South Africa
• UK (British Library, JISC, Edinburgh, …)
• UNESCO (especially Latin America, Eastern Europe, Africa)
• …
Some Countries
Australia, Belgium, Brazil, Canada, Chile, China, Hong Kong, Colombia, Finland, France, Germany, Greece, India, Italy, Jamaica, Korea, Lithuania, Malaysia, Mexico, Namibia, Netherlands, Norway, Poland, Russia, Singapore, S. Africa, S. Korea, Spain, Sudan, Sweden, Switzerland, Taiwan, Thailand, Turkey, UK, USA, Venezuela, Yugoslavia
UNESCO and ETDs (by Axel Plathe at ETD 2003)
• Promoting the use of the Internet as a tool for disseminating scientific knowledge
• Facilitating the transfer of ETD expertise from developed to developing countries
• 1998: Member of the NDLTD Steering Committee
• 1999: First UNESCO ETD meeting on ETD internationalisation
• 2002: “UNESCO Guide to Electronic Theses and Dissertations”
• 2003: Model training programmes and training courses
• 2003: Sponsor pilot projects
• 2003: Pilot projects (Africa, Europe, Latin-America)
Why ETD? Short Answer
• For Students:
– Gain knowledge and skills for the Information Age
– Richer communication (digital information, multimedia, …)
• For Universities:
– Easy way to enter the digital library field and benefit thereby
• For the World:
– Global digital library – large, useful, many services
• General:
– Save time and money
– Increased visibility for all associated with research results
Patrons, Queries
• User Profile Data (Oct. 2005 – May 2006)
• Online User Survey as part of User Modeling study
• 1100 user data records in total, which include
– User survey: majors, specialties, years of experience, and demographic information
– Tracking data: queries and detailed research interests obtained by a User Tracking System embedded in the Search User Interface [4]
Categorization of Academic Subjects
• Created our own classification categories
• Based on colleges/faculties in five universities in VA
- Virginia Tech, University of Virginia, George Mason University, VCU and Virginia State University
• Identified
- 7 categories and 77 subcategories
- Word patterns for each subcategory
Categorization of Academic Subjects
• 7 categories and selected 77 subcategories
Category | Selected sub-categories
1 Architecture and Design | ArchitectureConstruction, LandscapeArchitecture, …
2 Law |
3 Medicine, Nursing and Veterinary Medicine | Dentistry, Medicine, Pharmacy, Nursing, …
4 Arts and Science | Agriculture, AnimalPoultry, Biology, ...
5 Engineering and Applied Science | ComputerScience, Material, Electronics, …
6 Business and Commerce | Business, Economics, Management, …
7 Education | Education
8 Others (unclassifiable) |
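Assigning a query or an ETD record to a category by word patterns can be sketched as follows. The study's actual patterns are not shown on the slides, so every regular expression here is an invented stand-in, only a few of the categories are covered, and the function name categorize is hypothetical.

```python
import re

# Hypothetical word patterns; the real ones are not given on the slides.
PATTERNS = {
    "Architecture and Design": r"\b(architect\w*|landscape)\b",
    "Law": r"\b(law|legal)\b",
    "Medicine, Nursing and Veterinary Medicine": r"\b(medicine|nursing|pharmacy|dentistry)\b",
    "Engineering and Applied Science": r"\b(engineering|electronics|software)\b",
    "Education": r"\b(education|teaching)\b",
}
COMPILED = {c: re.compile(p, re.IGNORECASE) for c, p in PATTERNS.items()}


def categorize(text):
    """Return the first category whose pattern matches, else the catch-all."""
    for category, pattern in COMPILED.items():
        if pattern.search(text):
            return category
    return "Others (unclassifiable)"


print(categorize("Landscape architecture in Virginia"))
print(categorize("quantum chromodynamics"))
```

In this sketch the first matching category wins, so "nursing education" lands under Medicine rather than Education; a fuller version might count matches per category or allow multiple labels.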
Supply-Demand Comparison
[Figure: chart with legend: 1 Architecture and Design; 2 Law; 3 Medicine, Nursing and Veterinary Medicine; 4 Arts and Science; 5 Engineering and Applied Science; 6 Business and Commerce; 7 Education; 8 Others (unclassifiable)]
Measuring Supply – Demand
• ETD Supply:
- Number of resources provided
- 242,688 ETDs classified into 7 categories and counted
• Patron’s Demand:
- Number of queries entered
- 4519 queries (in 1100 user data) classified into 7 categories
- “Sum of all queries” in each category calculated as [formula shown on slide]
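One simple way to put supply and demand on a common scale is to compare each category's share of all ETDs with its share of all queries. The sketch below does only that; it is an assumed stand-in and not the formula from the slide, which did not survive in the text. The counts are toy numbers invented for the example and are not the study's data.

```python
def shares(counts):
    """Convert raw counts per category into fractions of the total."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}


# Toy counts, invented for illustration only.
toy_supply = {"Engineering": 60, "Education": 30, "Law": 10}  # ETDs
toy_demand = {"Engineering": 20, "Education": 25, "Law": 5}   # queries

supply_share = shares(toy_supply)
demand_share = shares(toy_demand)

# Positive gap: the category is queried more than its share of ETDs suggests.
gap = {k: demand_share[k] - supply_share[k] for k in toy_supply}
for category, value in gap.items():
    print(f"{category}: {value:+.2f}")
```

With these toy numbers Education shows a positive gap and Engineering a negative one, which is the kind of mismatch a supply-demand comparison is meant to surface.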
Resource Distribution
[Figure: distribution across the 8 categories listed under Supply-Demand Comparison]
User Distribution
[Figure: distribution across the same 8 categories]
Query Distribution
[Figure: distribution across the same 8 categories]
Supply-Demand of 77 Subcategories (1/2)
Supply-Demand of 77 Subcategories (2/2)
User Expertise Years
Expertise Years and Demand
Date Stamp of ETD
Future Work
• Use of a widely used classification system
- e.g., Dewey Decimal Classification 22 ($375)
• More detailed classification of ETDs
- Include title, abstract and other subject field data
- Approx. 7000 ETDs in oai_etdms as well as oai_dc
- Utilize “discipline” in oai_etdms format records
• Use of user activity data
- e.g., clicking of query results in NDLTD
• Visualization of NDLTD use and its community
Challenges
• Preservation - so people will trust DLs
• Supporting infrastructure - networks, ...
• Scalability, sustainability, interoperability
• DL industry - critical mass by covering libraries, archives, museums, corporate info, govt info, personal info
- “quality WWW” integrating IR, HT, MM, ...
– Need tools & methods to make them easier to build