What is to be done? Thomas Krichel http://openlib.org/home/krichel
Background • Trained economist, 1984 to 2000 • Worked on electronic dissemination of academic papers in Economics since 1993 • Activist of free online scholarship & academic self-documentation • Pioneered business model of Open Archives Initiative • Professor for Library and Information Science
The basic idea • Scholars are not paid for writing scholarly papers. – Simply a historical fact – We assume that this will not change as we move into a more online/digital future • Publishers appropriate copyright to sell one academic the output that is freely given up by another. • Socially inefficient
Topic • Author self-archiving • Free online scholarship • Academic self-documentation • Academic disintermediation
Harnad Steady State Analysis • Toll-gated academic publishing in the Gutenberg world with positive marginal costs. • Post-Gutenberg world leads to abolition of toll-gates. • All that scholars have to do is self-archive. • Refutes all arguments against self-archiving. • Analysis limited to free access to academic papers, not related data
The dynamics do matter • Toll-gated layer exists – publishers – editors – scholarly societies • A free layer is slowly starting – how to create a “prepublication” tradition – how the system is to be funded – which organizational model: discipline-based or institution-based
Anti-Harnad analysis • Academic world has always needed nonacademic intermediaries. • Important documents are revealed in the marketing process • Money makes the world go around.
Interim phase • A time for visionaries • Important first-mover advantage • Time of bad business models and failed plans – NCSTRL – CogPrints • Fight between the intermediaries of scholarly communication (publishers and libraries)
Disintermediation • Publishers wish to reduce libraries to bursars of funds and access control. • Libraries want to get into the publishing business. • Academics will have to decide which (if any) vision will work.
Institution-based initiatives • Idea: libraries of universities should make papers from all disciplines available on institutional servers • Problem: low incentives for academics to collaborate – prime solidarity of scholar with discipline – no preprint tradition
Putting it up on the web • Prepublication by individuals over the web is an important step • Problems are: – stability of document existence and location, – thus impossible to use as a building block for a review of any kind – information retrieval difficulties – no certification of finding
Arms’ theory • Free research library out of documents that are made available – by academics directly – by intermediaries through author pressure • Bibliographic layer costly to maintain and will be left to commercial entities.
Discipline-based systems • For the time being, they only work in the preprint disciplines (mathematics, physics), leading to centralized systems • and in the working paper disciplines (computer science, economics), leading to decentralized systems
Polymorph scenario • Development per discipline • Different stages or scenarios for each group • Institution based archives help, but do not complete the picture • What is the complete picture?
Recall Bellman… • To find an optimal path over time, find the optimum steady state • Then calculate the optimum path leading to that steady state… • (in practice, a problem of time-consistency arises but we will ignore this problem)
Optimal steady state? • All papers freely accessible online • Anybody can enter the academic process and make a paper available • Extensive linking – Papers to authors – Authors to institutions – Document to document (review, references) – Document to group (peer-review)
What material to be deposited • The papers written by reasonable authors, recognized as genuine scientific writers • In practice, it is those authors who are affiliated with an academic or otherwise recognized institution who produce such output.
Top to bottom approach • Register institutions first • Register authors second • Make papers available third • Contrast that with the approach of librarians…
Quality control • Elementary quality control through the affiliation of authors. • No other basic means available in contested disciplines. • No other elements of quality control • but of course potential for extensive quality appraisal.
Global/Local approach • Registry of all institutions is a difficult process • Local registry per discipline possible – EDIRC – World List of LIS departments
Incentives • Free access improves exposure of research (see work by Steve Lawrence) • But the general promise is not enough for – Deposit on web – Formal deposit in a free system • Formal deposit system has to demonstrate that it does well • Lies, damn lies, and statistics
Impact • Impact is key to academic work in almost all disciplines. • Scientists have small change, but big egos. • When impact can be quantified, academics start to listen, and they forget about Churchill. • Our dean can’t read, but our dean can count.
Peer review to impact review • Collection quality control through peer review is part of the Gutenberg universe • Pure impact review – Access logs / Download logs / Citation counts – Promotes open access • Global impact review – Uses data of grouping of papers
Impact review formula requirements • Based on the production of authors • Only indirectly modifiable by authors – Deposit more papers – Deposit better papers • Three elements of peer evaluation – Citations – Collection inclusion – Collection review
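The slides state requirements for an impact review formula but do not give one. As a purely illustrative sketch of those requirements, the following sums weighted download and citation counts over each author's deposited papers. The weights, the function name `author_impact`, and the sample numbers are all assumptions made up for this example, not part of the talk.

```python
from collections import defaultdict

# Hypothetical weights: the talk does not say how logs and citations combine.
DOWNLOAD_WEIGHT = 1.0
CITATION_WEIGHT = 10.0


def author_impact(papers):
    """Sum a weighted score over each author's papers.

    `papers` is a list of dicts with keys 'authors', 'downloads', 'citations'.
    Authors can change their score only indirectly: by depositing more
    papers, or papers that attract more downloads and citations.
    """
    scores = defaultdict(float)
    for paper in papers:
        score = (DOWNLOAD_WEIGHT * paper["downloads"]
                 + CITATION_WEIGHT * paper["citations"])
        for author in paper["authors"]:
            scores[author] += score
    return dict(scores)


# Made-up sample data, for illustration only.
sample = [
    {"authors": ["A", "B"], "downloads": 120, "citations": 3},
    {"authors": ["A"], "downloads": 40, "citations": 0},
]
impact = author_impact(sample)
```

Crediting every co-author with the full paper score is one possible choice; splitting the score among co-authors would be an equally plausible variant.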
Citations • Often criticized, but the only means that we have to assess impact between papers • Publicly accessible citation indexes will go a long way to promote open scholarship. • Such indexes can be constructed by computer • Since citation styles differ widely, an approach per discipline is required.
Collection inclusion • This is classic peer-review. • Post-Gutenberg age should allow inclusion in several collections. • But copyright surrender prevents this. • Combat this in principle, but don’t expect results any time soon. • Promote licensing of publishers.
Collection evaluation • Classic ISI citation impact reviews • This data can be used in initial version of an impact review formula. • Later that data should be endogenized.
Contents is king • Nothing can be done without an initial stock of papers. • All disciplines have some form of informal publication channel. • Collect and archive elements in these channels. • Get a coalition of collectors together.
Genesis 3:19 • Without volunteer efforts things will not get done. • In particular, the people on the upper echelons have to be volunteers. • People cannot expect to be paid for the collection work. • Infrastructure work can be supported through funding.
GNU Thinking • When Richard Stallman launched the GNU project, many people thought he needed to get into the shade. • I guess some of you think that about me! • When computer geeks can make a complete operating system available over the Internet at no cost. • Decentralization is the word of the day.
Tools and tasks • Tools – Open Archives Initiative Protocol for Metadata Harvesting – Academic Metadata Format • Tasks – Deposit – Describe – Identify – Relate
OAI and task model • Free online scholarship through open archives doing the first two tasks. • “Aggregators” will be needed to perform the two other tasks. Can also use the OAI protocols.
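To make the harvesting step concrete, here is a minimal sketch of how an aggregator might build an OAI-PMH `ListRecords` request. It assumes the request syntax of OAI-PMH version 2.0 (`verb`, `metadataPrefix`, and the optional `from` argument). The endpoint `http://archive.example.org/oai` and the helper name `list_records_url` are hypothetical.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords request URL (no network access here)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        # Selective harvesting: only records changed since this datestamp.
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


# Hypothetical archive endpoint, for illustration only.
url = list_records_url("http://archive.example.org/oai", from_date="2001-07-01")
```

An aggregator would fetch this URL, parse the XML response, and keep the datestamp of its last harvest so that later requests only pull new or changed records.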
AMF and task model • AMF appears as a basic framework for aggregators to communicate with basic data providers and export data. • Aggregator will need to set database structure from pile of AMF data.
Example from RePEc http://netec.mcc.ac.uk/WoPEc/data/Papers/crecrefwp99.html
Thank you very much!
arXiv • Too well-known to talk about here • So I will talk more about RePEc. • One important development: arXiv will start to identify authors.
RePEc • Comprehensive academic self-documentation system • in fact, the very essence of an academic self-documentation system – run in a decentralized way by academic volunteers – comprehensive picture of academic output activity • originates with WoPEc project founded by Thomas Krichel in 1993 • And so on…
RePEc principle • Many archives – archives offer metadata about digital objects (mainly working papers) • One database – The data from all archives forms one single logical database despite the fact that it is held on different servers. • Many services – users can access the data through many interfaces. – providers of archives offer their data to all interfaces at the same time. This provides for an optimal distribution.
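The "many archives, one database, many services" principle can be sketched as follows. This is a simplified illustration, not RePEc's actual software: records are plain dicts keyed by handle, the archive named `example` and its record are made up, and the two "services" are hypothetical stand-ins for user interfaces.

```python
def merge_archives(archives):
    """Combine per-archive record lists into one logical database keyed by handle."""
    database = {}
    for archive_name, records in archives.items():
        for record in records:
            handle = record["Handle"]
            if handle in database:
                raise ValueError(f"duplicate handle {handle!r} in {archive_name}")
            database[handle] = record
    return database


# Many archives: each holds its own records on its own server.
archives = {
    "surrec": [{"Handle": "RePEc:surrec:9601",
                "Title": "Dynamic Aspect of Growth and Fiscal Policy"}],
    # Hypothetical second archive, for illustration only.
    "example": [{"Handle": "RePEc:example:0001", "Title": "A made-up paper"}],
}

# One database: the union of all archives.
database = merge_archives(archives)


# Many services: each interface reads the same merged data.
def titles_service(db):
    return sorted(record["Title"] for record in db.values())


def handles_service(db):
    return sorted(db)
```

Because every service reads the same merged data, an archive that publishes one record reaches all interfaces at once.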
RePEc is based on 190+ archives • WoPEc • EconWPA • DEGREE • S-WoPEc • NBER • CEPR • US Fed in Print • IMF • OECD • MIT • University of Surrey • CO PAH
…to form one dataset... • over 140,000 items in over 1,000 series, containing working papers, published papers, software, and personal and institutional data • largest distributed free source about online scientific publications, with over 45,000 electronic papers • data is encoded using the purpose-built ReDIF format • all archives follow a convention called the Guildford protocol on how to store ReDIF files and other data on their servers. Therefore the archives can be mirrored.
… describes documents
Template-Type: ReDIF-Paper 1.0
Title: Dynamic Aspect of Growth and Fiscal Policy
Author-Name: Thomas Krichel
Author-Person: RePEc:per:1965-06-05:thomas_krichel
Author-Email: T.Krichel@surrey.ac.uk
Author-Name: Paul Levine
Author-Email: P.Levine@surrey.ac.uk
Author-WorkPlace-Name: University of Surrey
Classification-JEL: C61; E23; E62; O41
File-URL: ftp://www.econ.surrey.ac.uk/pub/RePEc/surrec/surrec9601.pdf
File-Format: application/pdf
Creation-Date: 199603
Revision-Date: 199711
Handle: RePEc:surrec:9601
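A template like the one above is simple enough to read with a few lines of code. The following is a minimal sketch, assuming one `Field: value` pair per line and no continuation lines. It is not the official ReDIF reader, and the function name `parse_redif` is made up for this example.

```python
def parse_redif(text):
    """Parse one ReDIF template into an ordered list of (field, value) pairs.

    Order is preserved because a repeated field such as Author-Name starts
    a new cluster, and the fields that follow it belong to that author.
    """
    fields = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first colon only, so handles and URLs stay intact.
        name, sep, value = line.partition(":")
        if not sep:
            continue  # assumption: lines without a colon are ignored
        fields.append((name.strip(), value.strip()))
    return fields


template = """
Template-Type: ReDIF-Paper 1.0
Title: Dynamic Aspect of Growth and Fiscal Policy
Author-Name: Thomas Krichel
Author-Name: Paul Levine
Handle: RePEc:surrec:9601
"""

fields = parse_redif(template)
authors = [value for name, value in fields if name == "Author-Name"]
handle = dict(fields)["Handle"]
```

Splitting on the first colon only matters here: handles such as `RePEc:surrec:9601` contain colons of their own.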
… describes persons (HoPEc)
Template-Type: ReDIF-Person 1.0
Name-Full: KRICHEL, THOMAS
Name-First: THOMAS
Name-Last: KRICHEL
Postal: 1 Martyr Court, 10 Martyr Road, Guildford GU1 4LF, England
Email: t.krichel@surrey.ac.uk
Homepage: http://gretel.econ.surrey.ac.uk
Workplace-Institution: RePEc:edi:desuruk
Author-Paper: RePEc:surrec:9801
Author-Paper: RePEc:surrec:9601
Author-Paper: RePEc:rpc:rdfdoc:concepts
Author-Paper: RePEc:rpc:rdfdoc:ReDIF
Handle: RePEc:per:1965-06-05:THOMAS_KRICHEL
… describes institutions (EDIRC)
Template-Type: ReDIF-Institution 1.0
Primary-Name: University of Surrey
Primary-Location: Guildford
Secondary-Name: Department of Economics
Secondary-Phone: (01483) 259380
Secondary-Email: economics@surrey.ac.uk
Secondary-Fax: (01483) 259548
Secondary-Postal: Guildford, Surrey GU2 5XH
Secondary-Homepage: http://www.econ.surrey.ac.uk/
Handle: RePEc:edi:desuruk
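The three template types link to each other through handles: a paper points to a person, and a person points to an institution. The sketch below follows that chain with plain dicts. Comparing handles case-insensitively is an assumption made here because the slides show the same person handle in both lower and upper case. The helper names are hypothetical.

```python
def norm(handle):
    # Assumption: handles are compared case-insensitively.
    return handle.lower()


# Records keyed by normalized handle; field values are taken from the slides.
persons = {
    norm("RePEc:per:1965-06-05:THOMAS_KRICHEL"): {
        "Name-Full": "KRICHEL, THOMAS",
        "Workplace-Institution": "RePEc:edi:desuruk",
    },
}
institutions = {
    norm("RePEc:edi:desuruk"): {
        "Primary-Name": "University of Surrey",
        "Secondary-Name": "Department of Economics",
    },
}
paper = {
    "Handle": "RePEc:surrec:9601",
    "Author-Person": "RePEc:per:1965-06-05:thomas_krichel",
}


def institution_of_author(paper, persons, institutions):
    """Follow paper -> person -> institution through the handle links."""
    person = persons[norm(paper["Author-Person"])]
    return institutions[norm(person["Workplace-Institution"])]


inst = institution_of_author(paper, persons, institutions)
```

This is the "relational database for the academic process" idea in miniature: each record is stored once and referred to by handle wherever it is needed.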
Weaknesses of RePEc • No funding • Difficult to grasp innovative concepts – relational database for the academic process – plethora of user and contributor services • Setting-up costs are large, constant attention required • Little support from the top of the academic food chain
Academic Metadata Format • Data and metadata for action. Librarians have only documented the world; what matters is to change it. • Tool for academic self-documentation – simple to compose – drop-in functionality with OAI • intuition that comes from natural language
Open Archives Initiative • Most important for Free Online Scholarship is the implicit shift in business model towards institution-based archiving.
AMF View of the world • Author self-archiving will work if it is part of the advertisement of academics • Creator has to be the descriptive focus, not the creation
A model of AMF instances • Persons • Institutions • Collections • Resources – Text • This is what is really important about AMF
Natural language • Nouns – person, organization, collection, text • Adjectives like – name, title, status, etc. • Verbs like – isauthorof, hassponsor, ispartof, etc.
Example 1
<person>
  <name>Simeon M. Warner</name>
  <isauthorof>
    <text>
      <title>AMF Design in brief</title>
      <file><url>http://openlib.org/home/krichel/southampton_200107-13_1.ppt</url></file>
    </text>
  </isauthorof>
</person>
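Because the creator is the descriptive focus, reading such a record starts from the person and walks down to the texts. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`. It assumes a well-formed record without XML namespaces, as in the slide. The function name `authored_titles` is made up for this example.

```python
import xml.etree.ElementTree as ET

AMF_RECORD = """
<person>
  <name>Simeon M. Warner</name>
  <isauthorof>
    <text>
      <title>AMF Design in brief</title>
    </text>
  </isauthorof>
</person>
"""


def authored_titles(xml_text):
    """Return the person's name and the titles of the texts they authored."""
    person = ET.fromstring(xml_text.strip())
    name = person.findtext("name").strip()
    # Noun (person) -> verb (isauthorof) -> noun (text) -> adjective (title).
    titles = [title.text.strip()
              for title in person.findall("isauthorof/text/title")]
    return name, titles


name, titles = authored_titles(AMF_RECORD)
```

The path `isauthorof/text/title` mirrors the noun, verb, and adjective vocabulary directly.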
id and ref • For propeller head use. • Records (instances of nouns) that are authoritative can have an id. • Non-authoritative records can refer to authoritative ones, using a ref.
Example 2
<text ref="oai:arxiv:...">
  <references>
    <text ref="oai:arxiv:..."/>
  </references>
</text>
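An aggregator that receives records like this has to match each `ref` against an authoritative `id`. A minimal sketch of that lookup follows. The wrapper element `<records>`, the identifiers `oai:example:1` and `oai:example:2`, and the function name `resolve_references` are hypothetical, chosen only to make the example self-contained.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: one authoritative record (with id) and one record
# that refers to it (with ref).
DOC = """
<records>
  <text id="oai:example:1"><title>An authoritative record</title></text>
  <text ref="oai:example:2">
    <references><text ref="oai:example:1"/></references>
  </text>
</records>
"""


def resolve_references(xml_text):
    """Return (citing, cited, cited_title) triples, resolving ref against id."""
    root = ET.fromstring(xml_text.strip())
    # Index every authoritative record by its id.
    by_id = {el.get("id"): el for el in root.iter("text") if el.get("id")}
    resolved = []
    for citing in root.findall("text"):
        for cited in citing.findall("references/text"):
            target = by_id.get(cited.get("ref"))
            title = target.findtext("title") if target is not None else None
            resolved.append((citing.get("ref") or citing.get("id"),
                             cited.get("ref"), title))
    return resolved


links = resolve_references(DOC)
```

A `ref` with no matching `id` yields `None` for the title, so unresolved references stay visible instead of being dropped.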