THE FRONT MATTERS: Capturing Journal Front Matter Content with JATS
- Slides: 53
Front Matter vs. Journal Matter (disambiguation)
For the purposes of this presentation, “front matter” = “journal matter”. In the current publishing environment, where more and more journals are published online, there are many examples of journals without a traditional “front”.
Obvious
This… not as much
Team Introduction
- Rachael Carter is a journal manager at PMC at the National Center for Biotechnology Information at the US National Library of Medicine. Rachael graduated in 2010 from the University of Maryland with a Master of Library Science.
- Kathryn Funk is a technical editor for NIHMS and PubMed Health at the National Center for Biotechnology Information at the US National Library of Medicine. Kathryn graduated from The Catholic University of America with a Master of Library and Information Science.
- Rebecca Mooney, formerly a journal manager at PMC at the National Center for Biotechnology Information at the US National Library of Medicine, recently moved to a new position as a Project Analyst in the IT Department of the American Association for the Advancement of Science (AAAS). Rebecca graduated in 2008 from the University of Maryland with a Master of Library Science.
The Big Picture
“Decisions must be made about what will actually be saved for future use… Will the content consist only of articles in a journal, or will it also include front matter (such as the names of the members of the journal’s editorial board)?” (Marcum, 2001)
NLM Initiative
PMC as an archive has a responsibility to answer:
- What should we preserve?
- How should we preserve it?
- Why preserve it?
PMC Submission Method A
PMC structure
- Currently, PMC strives to archive data at the article level, but sees the potential benefit in finding a way to preserve information about the journal in which the articles were published. For example: who was Editor in Chief at the time of publication? What was the journal’s philosophy at that time?
- TOCs: PMC creates its own table of contents, organized by article type. This is still very article based, not at the issue level.
Front Matter “capturing” in PMC as it currently exists – through banner journal-links only
Scope of Front Matter within project
What PMC Front Matter IS:
- Editorial board
- Journal philosophy
- Submission guidelines
- Subscription information
- Covers
- Journal contact information
- Publisher information
What PMC Front Matter is NOT:
- Tables of contents
- Advertisements
- Forewords
- Prefaces
Front matter DTD development timeline
- 2001: NLM DTD developed
- issueadmin.dtd was made available
- 2011: Atypon Issue XML presented at JATS-Con
- 2012: pmcjournalmatter.dtd developed
Value of capturing front matter as XML
Limitations of PDF:
- Assumes there is an issue to scan
- Difficult to update content
- Limited to certain platforms and technologies
XML to the rescue:
- The content is queryable and reusable
- Updating just requires editing a file
- Allows for data manipulation over various platforms/formats
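As a rough illustration of the "queryable and reusable" point, the sketch below parses a small, hypothetical editorial-board document with Python's standard library and lists people by role. The element names <journalmatter>, <body>, <person-list>, <person>, <name>, and <role> come from this presentation. The sample content, the `people_with_role` helper, and the JATS-style <surname>/<given-names> children of <name> are assumptions made for the example, and the sample has not been validated against the actual DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; not validated against pmcjournalmatter.dtd.
SAMPLE = """
<journalmatter journalmatter-type="standing" content-type="edboard">
  <body>
    <person-list>
      <person>
        <name><surname>Doe</surname><given-names>Jane</given-names></name>
        <role>Editor in Chief</role>
      </person>
      <person>
        <name><surname>Roe</surname><given-names>Richard</given-names></name>
        <role>Associate Editor</role>
      </person>
    </person-list>
  </body>
</journalmatter>
"""


def people_with_role(xml_text, role):
    """Return 'Given Surname' for every <person> that has a matching <role>."""
    root = ET.fromstring(xml_text)
    found = []
    for person in root.iter("person"):
        roles = [r.text.strip() for r in person.findall("role") if r.text]
        if role in roles:
            given = person.findtext("name/given-names", default="")
            surname = person.findtext("name/surname", default="")
            found.append(f"{given} {surname}".strip())
    return found


if __name__ == "__main__":
    print(people_with_role(SAMPLE, "Editor in Chief"))
```

The same question asked of a scanned PDF would require reading the page by eye, which is the contrast the slide draws.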
Why we chose to create an extension to JATS
- Mostly because we already use JATS
- It’s flexible
- Already had a meaningful framework to capture journal article content
- Works well within the structure of PMC (consistency)
Limitations of JATS
Why JATS isn’t enough to capture front matter:
- No meaningful way to capture front matter elements such as editorial boards
- No way to tag journal metadata at a level higher than article-meta
Goals
- To capture front matter in the environment in which it was published
- To work as much as possible with the existing JATS framework
- To create a DTD that would allow for flexibility in both use and rendering
Development process
- Looking at samples: defined content types
- Created new elements: completed first iteration of the pmcjournalmatter.dtd
- Testing: tagged samples of front matter using our DTD and made adjustments
- User testing (PMC journal managers): adjustments made to final DTD based on user feedback
Highlighted physical example of a journal’s front matter
Initial Classification
Anything in RED is required.
<journal-meta> contains, in order:
- <journal-id>*
- <journal-title-group>
- <issn>*
- <isbn>*
- <publisher>?
<issue-meta> contains, in order:
- <pub-date>*
- <volume>?
- <issue-title>*
- <issue-sponsor>*
- <first-page> <last-page>? <page-range>? OR <elocation-id>?
<document-meta> contains, in order:
- <pub-date>*
- <document-title>
- <self-uri>*
<body> contains, in order:
- <person-list>, which requires one or more <person>
<person> contains, in order:
- <name> OR <string-name> OR <collab>
- <degrees>*
- <address>*
- <aff>*
- <role>*
- <ext-link>*
- <xref>*
Created new elements: <person-list>, <issue-meta>, <document-meta>
Tagged samples of front matter using our DTD and made adjustments
User testing: PMC journal managers
DTD technical details
- pmcjournalmatter.dtd
- .mod and .ent modules
- pmcjournalmattercustom.ent (customizations)
Root element: journalmatter
<journalmatter journalmatter-type="issue" content-type="edboard">
Challenges
- How do we generate a foundation for organizing and labeling the front matter content?
- Can we tag all of this content in one document?
Root element attribute: @journalmatter-type
Issue vs. Standing: The Benefits
- Prevents a hybrid of issue and non-issue content in the same document
- Changes in content can be more easily updated
- Allows a single journal to have issue and standing documents
Example: Standing & Issue
- issue: Cover
- standing: Information for Authors
Root element: @content-type
- Separate documents
- Flexibility in tagging and rendering
- Update as need be
- Example: journal philosophy vs. editorial board
Individual documents for each @content-type: cover, edboard, general-info, publisher, info-for-authors, other
@content-type values
- Cover ("cover"): can include cover image, caption, and cover image copyright information.
- Editorial Board ("edboard"): can include executive editors, associate editors, etc., as well as general editorial board members.
- General Journal Information ("general-info"): can include but is not limited to journal mission statement, scope, journal contact information, subscription information, copyright, and other journal-specific content.
- Publisher Information ("publisher"): can include publisher philosophy, other journals published, contact information, etc.
- Information for Authors ("info-for-authors"): can include article submission and formatting instructions.
- Other ("other"): if the document is not one of the listed types, or the type of document cannot be determined, the "other" attribute value may be used.
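A minimal sketch of how a tagging workflow might check the two root attributes before a document is accepted. The attribute names and values are the ones given in this presentation ("issue"/"standing" for @journalmatter-type and the six @content-type values). Treating both as closed sets, and the `check_root` helper itself, are assumptions for illustration; the real DTD may declare these attributes differently.

```python
import xml.etree.ElementTree as ET

# Values transcribed from the slides; treating them as closed sets is an assumption.
JOURNALMATTER_TYPES = {"issue", "standing"}
CONTENT_TYPES = {
    "cover", "edboard", "general-info",
    "publisher", "info-for-authors", "other",
}


def check_root(xml_text):
    """Return a list of problems found on the root element (empty if none)."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != "journalmatter":
        problems.append(f"unexpected root element <{root.tag}>")
    jm_type = root.get("journalmatter-type")
    if jm_type not in JOURNALMATTER_TYPES:
        problems.append(f"journalmatter-type={jm_type!r} is not a listed value")
    content_type = root.get("content-type")
    if content_type not in CONTENT_TYPES:
        problems.append(f"content-type={content_type!r} is not a listed value")
    return problems


if __name__ == "__main__":
    doc = '<journalmatter journalmatter-type="issue" content-type="cover"/>'
    print(check_root(doc) or "root attributes look fine")
```

A document whose type cannot be determined would pass with content-type="other", which matches the fallback described on the slide.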
The 4 main elements of a document
<journalmatter> contains <journal-meta>, <issue-meta>, <document-meta>, and <body>.
<journal-meta>
<!ENTITY % journal-meta-model "(journal-id*, journal-title-group*, issn*, isbn*, publisher*)">
<issue-meta>
<!ENTITY % issue-meta-model "(pub-date*, volume?, issueid*, issue-title*, issue-sponsor*)">
<document-meta>
<!ENTITY % document-meta-model "((document-title, document-subtitle?)?, contrib-group?, pub-date*, (((fpage, lpage?, page-range?) | elocation-id)?), self-uri*, permissions?)">
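The flat content models on the <journal-meta> and <issue-meta> slides are ordered sequences with occurrence indicators, so one way to sanity-check the child order of a tagged sample, without a validating parser, is to turn each model into a regular expression. The sketch below does that for those two models only; the nested <document-meta> model is left out. Element names are transcribed from the slides as they appear, and the `MODELS` table and both helper functions are hypothetical names introduced for this example, not part of the DTD or of any tool.

```python
import re

# Flat sequence models transcribed from the slides as they appear.
# Occurrence indicator: "" = exactly one, "?" = optional, "*" = zero or more.
MODELS = {
    "journal-meta": [
        ("journal-id", "*"), ("journal-title-group", "*"),
        ("issn", "*"), ("isbn", "*"), ("publisher", "*"),
    ],
    "issue-meta": [
        ("pub-date", "*"), ("volume", "?"), ("issueid", "*"),
        ("issue-title", "*"), ("issue-sponsor", "*"),
    ],
}


def model_to_regex(model):
    """Build a regex that matches space-terminated element names in model order."""
    parts = [f"(?:{re.escape(name)} ){occurrence}" for name, occurrence in model]
    return re.compile("".join(parts))


def children_match(parent, child_names):
    """True if child_names is a valid child sequence for the parent's model."""
    pattern = model_to_regex(MODELS[parent])
    sequence = "".join(name + " " for name in child_names)
    return pattern.fullmatch(sequence) is not None


if __name__ == "__main__":
    print(children_match("journal-meta", ["journal-id", "issn", "publisher"]))
    print(children_match("journal-meta", ["issn", "journal-id"]))
```

A validating XML parser run against the DTD itself remains the authoritative check; this is only a quick approximation for flat sequences.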
<body>
Borrowed directly from JATS (with a few additions)
Addition: <person-list>
<!ELEMENT person-list (title?, person+)>
Person-list vs. Person-group
@person-list-type
- advisory-board: A board appointed to advise the editorial board
- editor: Content editors
- editorial-board: A group of editors on a publication
- guest-editor: Content editors that have been invited to edit all or part of a work
- reviewer: Content reviewer
- transed: Editors of a translated version of a work
@sec-type
- Not required; a suggested list only
- Not a controlled attribute
- Suggested list only used when content-type="general-info"
- Intent was to give meaning for searching and grouping purposes
- Used similarly to JATS’ @sec-type
List of @sec-types
@sec-type is not a required or controlled attribute. However, when "general-info" is the @content-type of the document, the following is a suggested list of types:
- association*
- copyright
- journal-contact
- journal-philosophy
- subscription-info
*This refers to associations which may be affiliated with a journal but do not necessarily publish the journal.
DTD Documentation
http://dtd.nlm.nih.gov/ncbi/pmc/journalmatter/
So how’s it all going to look?
Limitations
- Still relatively untested
- No rendering
- No actual use
- Lack of an existing model
- Based on perceived needs of PMC as an archive; unanticipated uses beyond
- Different naming conventions and structures of published journal front matter
Looking Forward
- Trying to start a conversation
- Looking for ways to best capture front matter to suit needs both inside PMC and in the broader JATS community
- Determining whether the content types will be applicable for future applications
- Initiating the usage of the DTD and seeing what happens
Acknowledgements
- Breena Krick
- Jeff Beck
- Audrey Hamelers
- Christopher Maloney
- PMC Journal Managers
References
- Andrew N. The Oxford Journals Online Archives: The Purpose and Practicalities of a Major Digitization Program. Serials Review. (2006, June). 32(12), 78-80.
- Holdsworth David. Preservation Strategies for Digital Libraries. Glasgow, UK: HATII, University of Glasgow; DCC Digital Curation Manual. (2007, November). Retrieved from: http://www.dcc.ac.uk/resource/curation-manual/chapters/preservationstrategies-digital-libraries.
- Marcum D. Scholars as Partners in Digital Preservation. CLIR Issues. (2001, March/April). 20. Retrieved from: http://www.clir.org/pubs/issues20.html.
- Markantonatos N. Article vs Issue XML: Capturing the Table of Contents under the NLM DTD. Bethesda, MD: National Center for Biotechnology Information; Journal Article Tag Suite Conference (JATS-Con) Proceedings 2011. (2011). Retrieved from: http://www.ncbi.nlm.nih.gov/books/NBK57236/.
- Wheeler B. Journal Identity in the Digital Age. Journal of Scholarly Publishing. (2010). 42(1), 45-88.
- NLM Journal Archiving and Interchange Tag Suite. Retrieved from: http://dtd.nlm.nih.gov/.
- PMC Journal Matter DTD Documentation. Retrieved from: http://dtd.nlm.nih.gov/ncbi/pmc/journalmatter/.
- BMC Cancer. Retrieved from: http://www.biomedcentral.com/bmccancer/.
- Frontiers in Cancer Genetics. Retrieved from: http://www.frontiersin.org/cancer_genetics.
Contact us: pmc@ncbi.nlm.nih.gov
Questions?
Options for dividing front matter into documents
- 1 XML document: "standing" OR "issue"
- 2 documents: 1 "standing", 1 "issue"
- Multiple documents, dependent on the information being captured:
- Cover ("cover")
- Editorial Board ("edboard")
- General Journal Information ("general-info")
- Publisher Information ("publisher")
- Information for Authors ("info-for-authors")