LIS 571 Readings Reaction Assignment Topic 3: Representation and Metadata by Amanda Alessi, David Salley, and Andrew Kloc
Thought Question: Has the research and development of metadata schemes designed for specific users and/or collections moved the library and information profession forward in our task to provide the best access possible for our users?
Readings: Duval, E. et al. 2002. Metadata Principles and Practicalities. D-Lib Magazine; Schottlaender, B. 2003. Why metadata? Why me? Why now? Cataloging and Classification Quarterly, v. 36 (3/4), pg. 19-29; Taylor, A. 2004. Metadata. In The Organization of Information. 2nd ed. Westport, CT. [Ch. 6, pg. 139-158]; Schatz, Bruce R. 1997. Information Retrieval in Digital Libraries: Bringing Search to the Net. Science, v. 275 #5298, pg. 327-334
Article Metadata Principles and Practicalities by Erik Duval, Wayne Hodgins, Stuart Sutton, and Stuart L. Weibel
Introduction: Rapid development of the World Wide Web created information chaos. Metadata helps organize information. It is broadly defined as data about data (Duval et al. 2002).
Principles: Concepts common to all metadata schemas. Schema: an attribute/value or element set. 1. Modularity: supports machine interoperability; standards allow flexibility. Lego metaphor: elements from different schemas can be snapped together like building blocks.
Namespaces work with modularity. A namespace is a formal collection of terms managed according to a policy or algorithm (Duval et al. 2002). Example: HTTP and LCSH. Any metadata element set is a namespace bounded by rules determined by its managers (Duval et al. 2002). Namespaces allow metadata schema designers to keep each term uniquely defined. Example: Dublin Core element names are conventionally prefixed with dc:
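The namespace idea above can be sketched in a few lines of Python. This is only an illustration, not something from the readings: the record layout and the `ex` namespace (with its URI) are hypothetical, while the Dublin Core namespace URI is the one published for the Dublin Core element set. It shows how two elements that share the local name "title" stay distinct because each belongs to a different namespace.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
EX_NS = "http://example.org/terms/"         # hypothetical local namespace (assumption)

# Bind the short prefixes used when the record is serialized.
ET.register_namespace("dc", DC_NS)
ET.register_namespace("ex", EX_NS)

# Build a record with two different "title" elements.
record = ET.Element("record")
ET.SubElement(record, f"{{{DC_NS}}}title").text = "Metadata Principles and Practicalities"
ET.SubElement(record, f"{{{EX_NS}}}title").text = "Dr."  # an honorific, not a work title

# Internally each tag carries its full namespace URI, so the two never collide.
names = [child.tag for child in record]

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

In the serialized record the elements appear as `dc:title` and `ex:title`, so a machine reading the record can tell which community's definition of "title" applies.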
2. Extensibility: Some metadata elements are common to most metadata schemas, while others are specific to a few. Example: Creator vs. temperature. The goal is a base schema with room for additional elements that each community can add to meet its own needs.
3. Refinement: The level of detail needed differs by purpose. A refinement makes an element more specific. Example: Composer vs. Creator. Refinement also involves specifying how values such as dates and times are expressed. Example: 04/07/03 could mean April 7, 2003 or July 4, 2003.
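The date example above can be made concrete with a short Python sketch. It is an illustration under assumptions, not part of the readings: the two format strings stand for a US-style and a European-style reading of the same value, and ISO 8601 output is used here as one possible unambiguous encoding.

```python
from datetime import datetime

raw = "04/07/03"  # the ambiguous value from the slide

# The same string parsed under two different conventions.
month_first = datetime.strptime(raw, "%m/%d/%y").date()  # US-style reading
day_first = datetime.strptime(raw, "%d/%m/%y").date()    # European-style reading

# Writing the result as year-month-day removes the ambiguity.
print(month_first.isoformat())  # 2003-04-07
print(day_first.isoformat())    # 2003-07-04
```

Unless the metadata states which encoding scheme a date value follows, a machine has no way to choose between the two readings.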
4. Multilingualism: Metadata must work across different languages and cultures. Example: <LI>. Translate metadata standards into multiple languages. Metadata can describe the original resource’s language and culture, and can mention other available versions of the resource and contact information for the translator. Other ways cultures communicate differently?
Practicalities: “Rules of thumb, constraints, and infrastructure issues that emerge from bringing theory into practice in the form of useful and sustainable systems” (Duval et al. 2002). The principles above lead to these practicalities.
1. Application Profiles: An assemblage of metadata elements selected from one or more metadata schemas and combined in a compound schema (Duval et al. 2002). A means of expressing the principles of modularity and extensibility. A profile can set rules such as: required data element = language, optional one = color.
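The rule on this slide (language required, color optional) can be sketched as a tiny validator. This is a minimal sketch under assumptions: `ProfileElement`, `PROFILE`, and `validate` are hypothetical names, `ex:color` is an invented element from a made-up local schema, and real application profiles are usually expressed in a schema language rather than in Python.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfileElement:
    name: str       # qualified element name, e.g. "dc:language"
    required: bool  # True if every record must supply it


# A compound schema mixing one Dublin Core element with one local element.
PROFILE = [
    ProfileElement("dc:language", required=True),
    ProfileElement("ex:color", required=False),  # hypothetical local element
]


def validate(record, profile=PROFILE):
    """Return a list of problems found when checking record against profile."""
    allowed = {element.name for element in profile}
    required = {element.name for element in profile if element.required}
    present = set(record)
    problems = [f"missing required element: {name}" for name in sorted(required - present)]
    problems += [f"element not in profile: {name}" for name in sorted(present - allowed)]
    return problems


print(validate({"dc:language": "en"}))  # []
print(validate({"ex:color": "red"}))    # ['missing required element: dc:language']
```

A record that supplies only the optional color fails, while one that supplies only the required language passes.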
2. Syntax and Semantics: Semantics is meaning; syntax is form. HTML (HyperText Markup Language) is simple. This is good and bad. XML is the markup language of choice.
3. Association Models: ways to associate metadata with resources. Embedded metadata: stored inside the resource, typically created by its author. Associated metadata: kept in separate files; change the metadata, not the resource. Third-party metadata: filed by an organization that may or may not have control over the resource.
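The difference between embedded and associated metadata can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the sample page, the `SIDECAR` dictionary standing in for a separate metadata file, and the helper names are hypothetical. The `DC.creator` naming follows the common convention for placing Dublin Core in HTML meta tags.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collects name/content pairs from <meta> tags in an HTML page."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attributes = dict(attrs)
            if "name" in attributes and "content" in attributes:
                self.metadata[attributes["name"]] = attributes["content"]


def embedded_metadata(html_text):
    """Return the metadata embedded in the resource itself."""
    parser = MetaExtractor()
    parser.feed(html_text)
    return parser.metadata


# Embedded: the metadata travels inside the resource.
PAGE = (
    '<html><head><meta name="DC.creator" content="Duval, Erik">'
    "<title>Sample page</title></head><body></body></html>"
)

# Associated: the metadata lives in a separate record (hypothetical sidecar),
# so it can be corrected or extended without touching the page.
SIDECAR = {"DC.creator": "Duval, Erik", "DC.language": "en"}

print(embedded_metadata(PAGE))
print(SIDECAR)
```

Adding the language to the sidecar required no change to the page, which is the practical appeal of associated metadata.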
4. Naming Metadata Elements Each element set must have a globally addressable name or URI (Uniform Resource Identifier). Makes machine processing of metadata possible despite different languages or cultures
5. Metadata Registries: “Important topic of digital library research at this time”. Registries contain or link to controlled vocabularies from which the values of metadata fields are selected (Duval et al. 2002); an “electronic dictionary” (Duval et al. 2002). Who will use a registry? Application designers, to identify schemas; creators of metadata, to get definitions for elements; end users, to better understand the context of metadata in hopes of improving their searches.
6. Completeness of Description: Not every available element should be used for every resource type. Example: No scent field for a map. Detailed description: improves searching precision; requires higher investment in creation of metadata (Duval et al. 2002); makes it more difficult to promote consistency (Duval et al. 2002).
Simple description: easier and cheaper to make; may result in more false hits or more effort to pick the most relevant results (Duval et al. 2002); improves chances of interoperability.
7. Subjective and Objective: Some metadata can be completely unbiased: author, date of publication, edition or version. Fields become subjective when they come to mean different things to different cultures, and semantics is compromised. Example: keywords, summaries, reviews.
8. Automated Generation of Metadata: Before the Web, librarians did the cataloging. Cataloging metadata “remains the most successful standard for resource discovery of books and periodicals” (Duval et al. 2002), but it is costly and impractical for Internet materials/resources.
Web Search Engines: Index much of the Internet. Low-cost, advertiser-supported model. A type of metadata. Draw on advances in natural language processing, profile and pattern recognition, and data mining. Electronic paper formats like PDF allow author-supplied attributes that simplify making metadata.
Conclusion: Information is useful if organized; metadata plays the organizer role. Those who create metadata will have different motives, goals, and techniques, just as authors of books have different ways of writing. Communities must agree on rules and common procedures in order to understand and share information across cultures.
Schottlaender, B. Why metadata? Why me? Why now? Reviewed by: David Salley
Metadata Definitions: “a cloud of collateral information around a data object”; “structured, encoded data that describes characteristics of information-bearing entities to aid in the identification, discovery, assessment and arrangement of the described entities”
Metadata Schema: “A set of rules for encoding information that supports specific communities of users” -- Association for Library Collections and Technical Services, Committee on Cataloging: Description and Access, Task Force
Metadata Schema – 5 Years Ago: Only 4 types: descriptive, administrative, technical, rights
Metadata Schema – Today: Many more types, including: security, personal information, commercial management, content rating, preservation metadata, etc.
Cataloging and Metadata: “Metadata is about access, cataloging is about access” – Schottlaender. “The invisible process of order making” – Kevin Butterfield. Four steps to each: find, select, identify, obtain.
Relevant Quotations: “I see an increasing confluence between the cataloging and metadata communities, so much so, that the two communities are becoming harder to distinguish, which is exactly as it should be.” “There is a growing recognition in the metadata community of the relevance of the work that we in the library cataloging community have been doing.” “The Dublin Core ‘qualifiers’ … are … an attempt being made now to enrich the Dublin Core Element Set by referencing a variety of content standards: subject thesauri; authority control systems; and classification systems.”
Metadata and Representation: Taylor, A. 2004. Metadata. In The Organization of Information. 2nd ed. Westport, CT: Libraries Unlimited Inc. [Ch. 6: 139-158].
Definitions: “Data about data”: simple definition, causes confusion. “Structured information that describes the attributes of information packages for the purposes of identification, discovery, and…management.”
Why Metadata? “Hot topic” in LIS in recent years; stems from proliferation of electronic resources. Major significance: “Concern that some kind of standardized representation is needed for Internet resources”, in order to locate the most “useful and reliable” information available.
Types and Levels of Metadata: 3 levels of complexity: simple format (found within resource itself); structured format (ex: Dublin Core); rich format (MARC, AACR2). 3 broad types: administrative; structural; descriptive.
Metadata Schemas: Def: sets of metadata standards designed to meet the needs of particular communities. 3 characteristics: Structure - refers to data model and how metadata statements are expressed; Syntax - encoding scheme; Semantics - refers to meaning of data elements. Schemas also use content standards and controlled vocabulary.
Special Characteristics of an Electronic Environment: Interoperability - ability of different systems to interact with each other; Flexibility - ability to enter as much or as little information as needed; Extensibility - ability to use additional elements or qualifiers (more specific elements) to meet specific needs.
Objectives for Implementing an Information System: Find, Identify, Select, Obtain
Schatz, Bruce R. Information Retrieval in Digital Libraries: Bringing Search to the Net Reviewed by: David Salley
“Organized collections of scientific materials are traditionally called ‘libraries’ and the searchable online versions of these are called ‘digital libraries’. The primary purpose of digital libraries is to enable searching of electronic collections distributed across networks, rather than merely creating electronic repositories from digitized materials.”
“The fundamental technology for searching large collections is finally changing, so that information retrieval in the next century [originally published in 1997] will be far more semantic than syntactic, searching concepts rather than words.” “The ‘document’ has changed from a citation with descriptive headers to the abstract to the complete multimedia contents, including text, figures, tables, equations, and data.”
Technology Timeline: 1960s: citations only [title, author, journal, keywords], generating bibliographies; 1970s: inverted indexes, full-text storage, proximity mapping; 1980s: distributed personal workstations; 1990s: the Internet, full multimedia
Vocabulary Switching: Example: a journal article that only mentions ‘Unix’ would be tagged as being about ‘operating systems’. Currently done by human indexers. Hewlett-Packard et al. are currently working on computer systems to do this, working with extensive ‘authority files’.
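The Unix example above can be sketched as a lookup against a small authority file. This is a toy illustration under assumptions, not the systems described in the article: `AUTHORITY_FILE` and `switch_vocabulary` are hypothetical names, the entries other than Unix are invented, and a real system would need far larger authority files and more than exact word matching.

```python
import re

# A tiny stand-in for an authority file: specific term -> broader concept.
AUTHORITY_FILE = {
    "unix": "operating systems",
    "linux": "operating systems",  # invented entry for illustration
    "marc": "cataloging formats",  # invented entry for illustration
}


def switch_vocabulary(text, authority=AUTHORITY_FILE):
    """Return the sorted concepts implied by terms that appear in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return sorted({authority[word] for word in words if word in authority})


print(switch_vocabulary("Porting the tool to Unix took a week."))
# ['operating systems']
```

A search for "operating systems" could then retrieve the article even though that phrase never appears in it.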