PMC Tagging Guidelines A case study in normalization

PMC Tagging Guidelines: A case study in normalization
Abigail Elbow, Breena Krick, Laura Kelly
NIH/NLM/NCBI/PMC
JATS-Con | 9.27.2011

But first… PMC Overview What do those people do with data, anyway?

The PMC process:
• 35 schemas
• Validate against declared DTD
• Transform into JATS XML (Green Archiving DTD)
• Check validity
• Run Style Checker
• Load to PMC database

What’s that look like?

Q: Why have a style checker? A: PMC is simply a user of the JATS / NLM DTDs

Can you be more specific?
• More than one way to tag a structure
• Need for normalization
• Start with the basic and most inconsistently tagged:
  ▫ Article metadata
  ▫ Figures
  ▫ Tables
• RELAX NG schema used first
• Replaced with XSL stylesheets
  ▫ Allow flexibility, reporting, and varying file output
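To make "more than one way to tag a structure" concrete, here is a minimal Python sketch. It shows two figure fragments that carry the same information in different shapes, plus a check that flags one of them. The rule itself (every `<fig>` needs a direct `<label>` child) and the function name `figs_missing_label` are assumptions made up for illustration; they are not quoted from the PMC Tagging Guidelines.

```python
import xml.etree.ElementTree as ET

# Two fragments that carry the same information. A permissive DTD can
# accept both shapes, which is why a separate style rule is useful.
LABEL_AS_ELEMENT = """
<fig id="f1">
  <label>Figure 1</label>
  <caption><p>Growth over time.</p></caption>
</fig>
"""

LABEL_IN_CAPTION = """
<fig id="f2">
  <caption><p><bold>Figure 1</bold> Growth over time.</p></caption>
</fig>
"""


def figs_missing_label(xml_text):
    """Return the ids of <fig> elements that have no direct <label> child.

    Hypothetical rule, for illustration only; not taken from the
    PMC Tagging Guidelines.
    """
    root = ET.fromstring(xml_text)
    # iter() also yields the root, so a bare <fig> fragment works too.
    return [
        fig.get("id", "(no id)")
        for fig in root.iter("fig")
        if fig.find("label") is None  # find() looks at direct children only
    ]


if __name__ == "__main__":
    doc = "<body>" + LABEL_AS_ELEMENT + LABEL_IN_CAPTION + "</body>"
    print(figs_missing_label(doc))  # ['f2']
```

A check like this is what a normalization rule amounts to: both inputs are valid XML, but only one matches the agreed style.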

Q: So how does anyone know how to tag in PMC style? A: The PMC Tagging Guidelines…

The Tagging Guidelines
• HTML prose form of the style rules
• General Tagging Practice, Document Objects, Elements
• Introduction and Update History
• XML backbone
• Covers PMC, NIHMS, and Bookshelf
• Covers both 2.3 and 3.0

Tagging Guideline XML: @version

Tagging Guideline HTML

Tagging Guidelines: an element

Q: How do I know if my file is compliant? A: The PMC Style Checker

Five common style errors
• MathML
• <related-article> tagging
• <xref> and @ref-type
• DOIs
• Empty elements
Demo time: http://www.pubmedcentral.nih.gov/utils/style_checker/stylechecker.cgi
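Three of the error classes above (`<xref>` and `@ref-type`, DOIs, empty elements) can be approximated with a few lines of Python. The sketch below is not the PMC Style Checker and does not reproduce its rules. The checks, the function name `style_problems`, the message wording, and the loose DOI pattern are all assumptions chosen for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Loose heuristic for the shape of a DOI ("10.<registrant>/<suffix>").
# An assumption for this sketch, not the Style Checker's actual test.
DOI_SHAPE = re.compile(r"^10\.\d{4,9}/\S+$")

SAMPLE = """
<article>
  <front>
    <article-id pub-id-type="doi">doi:10.1000/example</article-id>
    <title></title>
  </front>
  <body>
    <p>See <xref rid="f1">Figure 1</xref>.</p>
  </body>
</article>
"""


def style_problems(xml_text):
    """Return a list of human-readable problems, in document order."""
    root = ET.fromstring(xml_text)
    problems = []
    for el in root.iter():
        # Empty element: no attributes, no children, no visible text.
        if not el.attrib and len(el) == 0 and not (el.text or "").strip():
            problems.append(f"empty element <{el.tag}>")
        # Hypothetical rule: every <xref> must declare a ref-type.
        if el.tag == "xref" and "ref-type" not in el.attrib:
            problems.append("<xref> without @ref-type")
        # Hypothetical rule: a DOI is the bare identifier, no "doi:" prefix.
        if el.tag == "article-id" and el.get("pub-id-type") == "doi":
            if not DOI_SHAPE.match((el.text or "").strip()):
                problems.append(f"suspicious DOI: {el.text!r}")
    return problems


if __name__ == "__main__":
    for problem in style_problems(SAMPLE):
        print(problem)
```

Run against `SAMPLE`, this reports the prefixed DOI, the empty `<title>`, and the `<xref>` that lacks `@ref-type`. The online Style Checker linked above remains the authority on what PMC actually accepts.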

Q: What if I have lots of files? A: NLM Style Checker stylesheets (v4.3.4)

The Style Checker Stylesheets
• Main file: nlm-stylechecker.xsl
• It xsl:include(s):
  ▫ stylecheck-match-templates.xsl
  ▫ stylecheck-named-tests.xsl
  ▫ stylecheck-helper-templates.xsl
• Reports: style-reporter.xsl
  ▫ Generates an HTML Error/Warning report
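For a directory full of files, the two stylesheets can be driven from a small script. The Python sketch below rests on three assumptions: that the `xsltproc` command-line tool is installed, that the stylesheets run under an XSLT 1.0 processor, and that `style-reporter.xsl` is applied to the output of `nlm-stylechecker.xsl`. The function names, directory layout, and output file names are hypothetical; the stylesheet README listed under Resources is the place to confirm actual usage.

```python
import subprocess
from pathlib import Path


def build_commands(xml_file, out_dir, xsl_dir="."):
    """Return the two xsltproc command lines for one input file.

    Pass 1 applies nlm-stylechecker.xsl to the article.
    Pass 2 applies style-reporter.xsl to the pass-1 output (an assumed
    workflow) to get an HTML report.
    """
    xml_file, out_dir, xsl_dir = Path(xml_file), Path(out_dir), Path(xsl_dir)
    checked = out_dir / (xml_file.stem + ".checked.xml")  # hypothetical name
    report = out_dir / (xml_file.stem + ".report.html")   # hypothetical name
    return [
        ["xsltproc", "-o", str(checked),
         str(xsl_dir / "nlm-stylechecker.xsl"), str(xml_file)],
        ["xsltproc", "-o", str(report),
         str(xsl_dir / "style-reporter.xsl"), str(checked)],
    ]


def check_all(src_dir, out_dir, xsl_dir="."):
    """Run both passes over every *.xml file in src_dir."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for xml_file in sorted(Path(src_dir).glob("*.xml")):
        for cmd in build_commands(xml_file, out_dir, xsl_dir):
            # check=True stops the batch at the first failing transform.
            subprocess.run(cmd, check=True)
```

A call such as `check_all("articles", "reports", "stylechecker")` would then leave one checked XML file and one HTML report per article in `reports/`. Keeping command construction separate from execution makes the command lines easy to inspect before anything runs.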

badstyle.XML

Another report view: (PMC Production)

Special thanks • Laura Kelly • Breena Krick • Jeff Beck

Resources:
• PMC Tagging Guidelines:
  ▫ http://www.ncbi.nlm.nih.gov/pmcdoc/taggingguidelines/article/style.html
• PMC Online Style Checker:
  ▫ http://www.pubmedcentral.nih.gov/utils/style_checker/stylechecker.cgi
• Downloadable Style Checker stylesheets and instructions:
  ▫ http://www.ncbi.nlm.nih.gov/pmcdoc/taggingguidelines/stylechecker/stylecheck-README.html
• PMC Utilities:
  ▫ http://www.ncbi.nlm.nih.gov/pmc/pub/validation/
• Tagging Guidelines email list:
  ▫ http://www.ncbi.nlm.nih.gov/mailman/listinfo/pmc-tagging-guidelines