What Agencies Should Know About PDF/A
September 20, 2005
Susan J. Sullivan, CRM
susan.sullivan@nara.gov
Introduction
• Agenda
– Why long-term preservation of PDF is an issue
– Discussion of the PDF/A Standard and NARA’s Transfer Guidance for Permanent PDF Records
– Roles of both PDF/A and NARA’s PDF Transfer Guidance in Federal recordkeeping
– Overview of PDF/A and the ISO process
– Conclusion and questions
Wide Use of PDF
• PDF is a ubiquitous open format for electronic documents
– Proprietary, but with a publicly available specification
• The feature-rich nature of PDF can complicate preservation efforts
• Not all PDFs are created equal
• Much important information is maintained in PDF
• In some cases, this includes permanent archival records
PDF Not a Suitable Archival Format
• PDF itself is not suitable as an archival format
– Some features are not compatible with current archival requirements
– Not necessarily self-contained
– Not all PDFs are created equal
• Long-term solution needed
– Permanent archival records, in some cases
– Administrative Office of the U.S. Courts initiated the idea for an ISO Standard based on PDF (PDF/A)
How NARA is Addressing PDF
• Issued PDF Transfer Guidance in March of 2003
– Allowing agencies to transfer permanent records to NARA in PDF
• Participating in PDF/A ISO Standard development
– To influence the process
– To gain knowledge
Transfer Format versus File Format
NARA’s transfer guidance and PDF/A have a similar goal: to ensure that valuable electronic information in PDF is not lost.
But they serve different purposes:
• Transfer Format - NARA’s PDF Transfer Guidance
– Specifies NARA transfer requirements
– Applies to existing and future records in PDF
• File Format - The PDF/A ISO Standard (PDF/A)
– Specifies a subset of the PDF file format
– More format reliability, fewer “bells & whistles”
– Allows PDF records to be maintained longer as PDF (e.g., within agencies)
Scope and Usage
NARA’s PDF Transfer Guidance
• Usage: Transfer existing permanent PDF records to NARA
• Scope
– Applies to permanent records
– PDF versions 1.0 through 1.4
– Covers quality criteria, laws and regulations, transfer documentation, and NARA contact information
PDF/A ISO Standard
• Usage: Programming specification
• Scope
– Addresses one aspect of long-term preservation (i.e., file format)
– Should be used as one piece of the archival puzzle
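The version range in the transfer guidance scope can be screened mechanically. The following is a minimal sketch, assuming the version is read only from the `%PDF-x.y` header comment at the start of the file; the function names are hypothetical, and NARA's guidance does not prescribe any tooling. Note that from PDF 1.4 onward a document catalog may also carry a `Version` entry that overrides the header, which this sketch does not inspect.

```python
import re

# Matches the header comment, e.g. b"%PDF-1.4". Hypothetical helper names.
HEADER_RE = re.compile(rb"%PDF-(\d)\.(\d)")


def header_version(data: bytes):
    """Return (major, minor) from the %PDF-x.y header, or None if absent."""
    # The header is expected near the start of the file, so only the
    # first 1024 bytes are searched.
    match = HEADER_RE.search(data[:1024])
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def in_transfer_scope(data: bytes) -> bool:
    """True when the header declares a version from PDF 1.0 through 1.4."""
    version = header_version(data)
    return version is not None and (1, 0) <= version <= (1, 4)
```

A caller would typically read the file in binary mode and pass the bytes in, for example `in_transfer_scope(open(path, "rb").read())`, where `path` is whatever file is being screened.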
Requirements - PDF/A and NARA’s PDF Transfer Guidance
Embedded fonts
• PDF/A and NARA’s PDF Transfer Guidance both require that fonts be embedded
– NARA Guidance phases in requirements for workstation-resident fonts
Encryption
• PDF/A and NARA’s PDF Transfer Guidance both prohibit encryption
– NARA Guidance phases in the requirement, as long as NARA can open, view, and print the records
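The encryption prohibition lends itself to a quick screen before transfer. An encrypted PDF carries an `/Encrypt` entry in its file trailer, so a byte-level search for that key is a rough first check. The sketch below is a heuristic only, not a validator: the name `looks_encrypted` is hypothetical, and a match inside unrelated content could produce a false positive.

```python
import re

# The name token /Encrypt followed by whitespace or "<", so that longer
# names such as /EncryptMetadata do not match.
ENCRYPT_KEY_RE = re.compile(rb"/Encrypt(?=[\s<])")


def looks_encrypted(data: bytes) -> bool:
    """Heuristic: True when the raw bytes contain an /Encrypt dictionary key."""
    return ENCRYPT_KEY_RE.search(data) is not None
```

A file flagged by this check would still need to be opened in a PDF reader to confirm whether it can be viewed and printed, which is the practical test the guidance describes.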
Requirements - PDF/A and NARA’s PDF Transfer Guidance
Special Features
• PDF/A restricts special features
– Embedded files, external links, JavaScript
– PDF/A promotes Tagged PDF as a higher level of conformance
• NARA evaluates special features on a case-by-case basis at the time of scheduling
Metadata/Documentation
• PDF/A requires that embedded metadata be in Adobe XMP
• NARA requires transfer documentation (e.g., SF-258) and would evaluate embedded metadata at the time of scheduling
Requirements - PDF/A and NARA’s PDF Transfer Guidance
Quality Requirements
• PDF/A as a file format does not address quality/creation requirements such as exact replication of source material
– Informative Annex B identifies recommended creation guidelines
– Agencies must implement these guidelines to comply with NARA’s PDF Transfer Guidance
• NARA’s PDF Transfer Guidance includes quality requirements regarding
– scanning quality
– lossy compression
– substitution of characters with OCR’d text
NARA’s Expectations for PDF/A
– PDF/A should address some of the PDF archival issues and enable PDF records to be maintained longer as PDF
– Standard maintained by ISO, not just vendors
– Agencies should implement PDF/A along with records management policies and procedures
• Such as:
– NARA’s PDF Transfer Guidance
– AOUSC’s document management program
The PDF/A Standard
• Multi-part ISO International Standard
– ISO 19005-1:2005, Document management - Electronic document file format for long-term preservation - Part 1: Use of PDF 1.4 (PDF/A-1)
– Part 2 (19005-2) intended to bring PDF/A into conformance with PDF 1.6
– Additional future parts, as necessary
Time Line for Part 1
• Submitted to ISO Central Secretariat for publication as an International Standard
– Should be publicly available in September 2005
• Throughout the process, PDF/A has been reviewed by technical experts from 15 national standards bodies
PDF/A - Approach
• PDF/A specifies:
– The subset of PDF components, from the PDF 1.4 Reference, that are either required, restricted, or prohibited, and
– How these components may be used by software
[Diagram: PDF 1.4 Reference and PDF/A; PDF/A specifies required features, restricted features, and prohibited features]
PDF/A - Requirements
• Disallows or limits features that could complicate long-term preservation, and
• Maximizes:
– Device independence
• Can be reliably and consistently rendered without regard to the hardware/software platform
– Self-containment
• Contains all resources necessary for rendering
– Self-documentation
• Contains its own description
– Transparency
• Amenable to direct analysis with basic tools
PDF/A - Table of Contents
• 1 Scope
• 2 Normative References
• 3 Terms and Definitions
• 4 Notation
• 5 Conformance Levels
• 6 Technical Requirements
– 6.1 File Structure
– 6.2 Graphics
– 6.3 Fonts
– 6.4 Transparency
– 6.5 Annotations
– 6.6 Actions
– 6.7 Metadata
– 6.8 Logical Structure
– 6.9 Interactive Forms
• Informative annexes
– Annex A - PDF/A-1 Conformance Summary
– Annex B - Best Practices for PDF/A
• Bibliography
Annexes of the Draft PDF/A Standard - Informative Annexes
• Informative annexes provide supplemental information, including:
– Summary of the PDF structures and components disallowed, required, or limited
– Best Practices for PDF/A
• Guidelines for capturing or converting electronic documents to PDF/A
– To replicate the exact quality and content of source documents
– Required for compliance with NARA’s PDF Transfer Guidance
PDF/A - Overview of Requirements
• Two levels of conformance
– Level A (e.g., Tagged PDF, Unicode mapping)
– Level B (e.g., no Tagged PDF)
• Uniform file format (header, trailer, no encryption)
• Device-independent rendering of graphics
• Embedded fonts, character encoding
• Annotations restricted; content should be displayed by readers
• External actions restricted; no dependence on external content
• Readers not required to act on hyperlinks, but may
• XMP metadata (“Adobe XML Metadata Framework”)
• Forms based on appearance, not data
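The conformance levels and the XMP requirement meet in the file's embedded metadata, where a PDF/A file declares which part and level it claims. The sketch below reads that declaration. It rests on several assumptions: the property names `pdfaid:part` and `pdfaid:conformance` are taken from the published ISO 19005-1 identification schema rather than from these slides, the XMP packet is assumed to be stored uncompressed as UTF-8, and `declared_pdfa_level` is a hypothetical name. A declaration is only a claim; it does not show that the file actually conforms.

```python
import re

# XMP may serialize a property either as an element
# (<pdfaid:part>1</pdfaid:part>) or as an attribute (pdfaid:part="1").
PART_RE = re.compile(rb"pdfaid:part(?:>|\s*=\s*[\"'])\s*(\d)")
CONF_RE = re.compile(rb"pdfaid:conformance(?:>|\s*=\s*[\"'])\s*([A-Za-z])")


def declared_pdfa_level(data: bytes):
    """Return (part, level) as declared in the XMP packet, or None.

    Only the declaration is read; no conformance checking is performed.
    """
    part = PART_RE.search(data)
    conf = CONF_RE.search(data)
    if part is None or conf is None:
        return None
    return int(part.group(1)), conf.group(1).decode("ascii").upper()
```

Under these assumptions, a file claiming Level B of Part 1 would yield `(1, "B")`, and a file with no declaration would yield `None`. Confirming the claim requires a full validator that checks fonts, graphics, annotations, and the other requirements listed above.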
Take Away
• For permanent records in PDF, agencies need to understand that:
– PDF/A is one option for long-term preservation of electronic documents
– PDF/A, by itself, does not guarantee exact replication of source material
– Agencies must implement PDF/A in conjunction with additional requirements (i.e., NARA’s PDF Transfer Guidance) to meet NARA standards for transferring permanent records
More Information is Available
• More information on NARA’s PDF Transfer Guidance is on NARA’s Web site
– http://www.archives.gov/records-mgmt/initiatives/pdf-records.html
• More information on PDF/A is on the AIIM Web site
– http://www.aiim.org/standards.asp?ID=25013
• Contact Susan Sullivan at susan.sullivan@nara.gov
Questions/Discussion