Web-Harvesting
Web-Harvesting
- Concept
- Issues
- Prospects
Concept
[Diagram: Web resources]
Reference to Web Resource
Harnad, Stevan (2004). "The Self-Archiving Initiative." http://www.ecs.soton.ac.uk/~harnad/Tp/Nature4.htm. Last accessed 15 September 2004.
The Web
The Web Harvester
3 Major Activities
Imaging
- Digitization
- Migration
Web Archiving
- Storage
- Migration
- Retrieval
Storage
Migration
Retrieval
Developments in Web Archiving
- Internet Archive
- NEDLIB
- Nordic Web Archive
- Amiga Realm
- WebArchivist.org
- September 11 Web Archive
- Eprints.org
Web-Harvesting Concept
- WWW: a publishing venue
- Web resources: non-permanent
- Web harvester: stores, migrates, and retrieves web resources
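The three harvester activities (store, migrate, retrieve) can be sketched roughly as follows. This is only an illustrative in-memory model, not any real harvester's design: the names `Snapshot`, `WebArchive`, and the `converter` callback are hypothetical, and a real system would fetch pages over HTTP and write to durable storage.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Snapshot:
    """One dated capture of a web resource (hypothetical record layout)."""
    url: str
    captured_at: datetime
    content: bytes
    media_type: str


class WebArchive:
    """In-memory stand-in for a harvester's storage (an assumption for illustration)."""

    def __init__(self):
        self._snapshots = {}  # url -> list of Snapshot, in insertion order

    def store(self, url, content, media_type="text/html", captured_at=None):
        """Storage: keep a dated copy, since the live page may change or vanish."""
        when = captured_at or datetime.now(timezone.utc)
        snapshot = Snapshot(url, when, content, media_type)
        self._snapshots.setdefault(url, []).append(snapshot)
        return snapshot

    def retrieve(self, url, as_of=None):
        """Retrieval: return the newest copy captured at or before `as_of`."""
        candidates = [
            s for s in self._snapshots.get(url, [])
            if as_of is None or s.captured_at <= as_of
        ]
        if not candidates:
            raise KeyError(f"no snapshot of {url} for the requested date")
        # sorted() is stable, so a migrated copy added later wins a date tie.
        return sorted(candidates, key=lambda s: s.captured_at)[-1]

    def migrate(self, url, converter, new_media_type):
        """Migration: add a converted copy of the newest snapshot, keeping the original."""
        latest = self.retrieve(url)
        return self.store(url, converter(latest.content), new_media_type,
                          latest.captured_at)
```

Retrieving with an `as_of` date is what lets a citation point at the page as it looked when it was cited, rather than at whatever the URL serves today.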
Issues – Storage
- Legal justification
- Non-permanency of materials
- Daily changes
- Checksum
- No consistency in citations
- Refinement of criteria
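One way the checksum item could address daily changes is to compare a digest of each new capture with the digest of the previous one. The sketch below assumes SHA-256 from Python's standard `hashlib`; the slides do not name an algorithm, and the function names `checksum` and `has_changed` are made up for illustration.

```python
import hashlib


def checksum(content: bytes) -> str:
    """Return a hex digest of the page bytes (SHA-256 is an assumed choice)."""
    return hashlib.sha256(content).hexdigest()


def has_changed(previous_checksum: str, content: bytes) -> bool:
    """True when the freshly fetched bytes differ from the stored capture."""
    return previous_checksum != checksum(content)
```

A harvester could store only the digest between visits and re-archive a page when `has_changed` returns true. Note that a byte-level digest also flags trivial changes such as a rotating banner or timestamp, which is one reason selection criteria may need refining.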
Issues – Storage
- Several systems providing information
- Continued development
- Inaccessibility of data in databases
- Several information formats
- Overload of information
- Sufficient storage space
Issues – Migration
- Developments in information formats
- Developments in hardware, operating systems, and software
Issues – Retrieval
- Need for registries
- Completeness of metadata
- Commercial vendor or not?
- Legal or illegal?
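Completeness of metadata can be checked mechanically before a record enters a registry. The following is a minimal sketch: the required field names are an assumption (loosely modelled on common bibliographic fields), not a list given in the slides, and `missing_metadata` is a hypothetical helper.

```python
# Assumed set of required fields; the slides do not specify one.
REQUIRED_FIELDS = ("title", "creator", "date", "identifier", "format")


def missing_metadata(record: dict) -> list:
    """Return the required fields that are absent or blank in `record`."""
    return [
        name for name in REQUIRED_FIELDS
        if not str(record.get(name) or "").strip()
    ]
```

For example, a record holding only the title "The Self-Archiving Initiative", the creator "Harnad, Stevan", and the date "2004" would be reported as missing `identifier` and `format`.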
Prospects
- Future harvesters will be more powerful
- Overflow, duplication
- Current options:
  - Self-archiving by universities
  - Self-archiving by authors
  - Burning of cited web pages
  - Printing of cited web pages
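The duplication concern noted under Prospects is often handled by storing each distinct piece of content once and letting every capture point to it. The sketch below illustrates that idea under stated assumptions: `DedupStore` is a hypothetical in-memory class, and SHA-256 is an assumed choice of digest.

```python
import hashlib


class DedupStore:
    """Keeps one copy of identical content, however often it is harvested."""

    def __init__(self):
        self._blobs = {}    # digest -> content bytes, stored once
        self._captures = [] # (url, digest) log of every harvest

    def add(self, url: str, content: bytes) -> bool:
        """Record a capture; return True only if the content was new."""
        digest = hashlib.sha256(content).hexdigest()
        is_new = digest not in self._blobs
        if is_new:
            self._blobs[digest] = content
        self._captures.append((url, digest))
        return is_new

    def stored_bytes(self) -> int:
        """Total bytes actually kept, after de-duplication."""
        return sum(len(blob) for blob in self._blobs.values())
```

With this layout, a page harvested daily but unchanged costs one log entry per visit rather than one full copy per visit.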
Have a nice day!