Advisory Committee on the Electronic Records Archives
April 29-30, 2009
Program Director’s Update

Topics
• Development and deployment of the ERA instance for the G. W. Bush presidential records
• Plans for further development

Where is ERA?

Rocket Center, WV

Erma Ora Byrd Conference & Learning Center

The Search & Access ERA Instance for G. W. Bush Electronic Presidential Records

What Does the Base ERA Do?
• Focus:
    - Federal Records
    - Nationwide records management program
    - National Archives
• Functions:
    - Creation, review and approval of records schedules
    - Manage transfer of physical and legal custody of all types of records
    - Systematically collect, create, and manage lifecycle data about records
    - Actual transfer, inspection, and archival storage of electronic records

What Does the Search & Access ERA Do?
• Focus:
    - Presidential Electronic Records
    - George W. Bush Presidential Library
• Functions:
    - Rapid ingest of very large volumes of electronic records
    - Automatic indexing on ingest
    - Immediate searchability, based on index
    - Creation of different versions to support structured search of priority records
    - Basic case management for review and redaction of sensitive content.

Search and Access Instance Development
• Achieved Initial Operating Capability December 8, 2008
• LMC proposed and received NARA and EOP agreement on an expedited method for transfer of electronic records.
• NARA has enjoyed excellent collaboration from the EOP.
• NARA implemented a contingency plan for access to high priority e-records, the finding aid for WH paper records and the database of digital photography, pending completion of processing into ERA.

EOP Transfer & Ingest Overview
[Diagram: EOP data sets loaded onto SAN A and SAN B storage arrays and a Snap Server, shipped for SASS Operations (Ingest), and the arrays returned; ingest is shown by data type against software drops 6.0, 7.0, 7.1 and 7.2.]
Data volumes labeled in the diagram:
• ARMS (PRA) = 1.9 TB
• ARMS (FRA) = 5.1 TB
• Merlin One = 36 TB
• Merlin One 2 = 36 TB
• WARDS = 0.018 TB
• WARDS (delta) = 0.001 TB
• RMS = 1.0 TB
• Exchange = 57 TB
• PDS = 0.0005 TB
• PDS (delta) = 0.0005 TB
• Non-Pri Types = 20 TB
• Non-Pri Types = 0.2 TB
Dates shown on the timeline: 12/5 (IOC), 12/8, 12/12, 12/15, 1/20, 1/26, 1/30, 2/11

G. W. Bush Presidential Electronic Records

Record type | Number of objects | Gigabytes of data | Shipped to ERA Data Center | Status

Priority Records
Email (2000-2003) | 44,815,184 | 1,688 | 12/8/2008 | >99% available for search in ERA. There are technical problems with the remaining messages.
MS Exchange email (2003-2008) | 150,000 estimated | 16,500 estimated | Expected mid May | In temporary storage. Conversion to standard format, separation from federal records, and identification of responsible EOP component largely complete.
Presidential Diary | 682,193 | 1 | 12/8/2008 and 1/26/2009 | 100% available for search in ERA
Digital photography | 11,220,044 | 31,000 | 1/26/2009 | Problems require shipment of a second set, expected in mid May
Index to White House paper records | 313,850 | 583 | 1/26/2009 | 100% available for search in ERA, but about 6% of the records appear to be missing some pieces of data.
Visitor and worker access to EOP buildings | 28,922,988 | 14 | 12/8/2008 and 1/26/2009 | 100% available for search in ERA
Index to motion video | 305 | 5 | 1/26/2009 | In ERA, being processed
Email from WH Counsel | 572,051 | 1,057 | 1/26/2009 | In ERA, being processed

Other Records
Other Records | >12,000 | >5,450 | Partial shipment 1/26/2009 | Some in ERA, being processed. Remainder expected mid May

Processing Status - 1
• All Bush e-records have been transferred to NARA’s custody.
    - Not all have been transferred to the ERA Data Center in WV.
    - EOP is maintaining copies until NARA successfully completes ingest.
• Archives Operational Issues
    - Several sets of records were not transferred in the formats previously agreed by NARA and EOP.
        o NARA required retransmission.
    - Some records exhibited anomalies.
        o Some ARMS email records had binary data in the “To” field.
        o Some metadata in the digital photography system did not have corresponding images.
        o Some entries in the Records Management System are missing some fields.
        o MS Exchange email was not separated into presidential and federal records, was not associated with an EOP component, and contained numerous duplicates.
            § EOP is addressing these problems prior to transfer to ABL.
            § EOP has converted from proprietary to standard format.
            § NARA will preserve both the original files and the output of the EOP processing.
        o Encoding of date of birth in the Access system impeded searches on that field.
    - Viruses have been found in a small percentage of files.
        o Infected files have been successfully quarantined. LMC & NARA are working to produce clean copies.

Processing Status - 2
• Technical Issues
    - Issues with COTS products:
        o Automatic indexing of a batch of records stops when errors are found in any of the records; e.g., binary data in headers of email.
        o Erroneous results returned in certain conditions.
        o Incomplete search results returned in other cases.
        o LMC underestimated storage space needed for the index. Additional hardware has been ordered.
    - Unanticipated software development needed to ensure complete and accurate mapping between ‘.eml’ email produced by the EOP and the original MS ‘.pst’ files.
    - NARA directed LMC to hire a subcontractor to perform actual ingest of records.
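
The batch-indexing issue above (one bad record halting a whole batch) can be illustrated with a small sketch. This is not how ERA or its COTS indexing product works; it is a hypothetical, minimal Python example of the general alternative, in which each record is checked and indexed independently and a record with binary data in a header is set aside instead of stopping the batch. The names `index_batch`, `has_binary_data` and `BatchResult`, the record layout, and the sample addresses are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class BatchResult:
    indexed: list = field(default_factory=list)      # ids of records added to the index
    quarantined: dict = field(default_factory=dict)  # id -> reason the record was set aside


def has_binary_data(value: str) -> bool:
    """Return True if a header value contains control characters other than tab, CR or LF."""
    return any(ord(ch) < 32 and ch not in "\t\r\n" for ch in value)


def index_batch(records, index):
    """Index each record on its own so that one bad record does not stop the batch.

    `records` is an iterable of (record_id, headers, body) tuples (hypothetical layout);
    `index` is a plain dict mapping a word to the set of record ids containing it.
    """
    result = BatchResult()
    for record_id, headers, body in records:
        try:
            # Validate all headers before touching the index, so a rejected
            # record never leaves partial entries behind.
            for name, value in headers.items():
                if has_binary_data(value):
                    raise ValueError(f"binary data in {name!r} header")
            for word in body.lower().split():
                index.setdefault(word, set()).add(record_id)
            result.indexed.append(record_id)
        except ValueError as err:
            # Set the record aside with its reason and continue with the next one.
            result.quarantined[record_id] = str(err)
    return result


# Hypothetical sample batch: the second message has binary data in its "To" header.
records = [
    ("msg-1", {"To": "staff@example.gov"}, "budget meeting notes"),
    ("msg-2", {"To": "\x00\x8f\x01"}, "corrupted header"),
    ("msg-3", {"To": "press@example.gov"}, "meeting agenda"),
]
index = {}
result = index_batch(records, index)
print(result.indexed)      # ids that were indexed
print(result.quarantined)  # ids set aside, with reasons
```

In this sketch the quarantined records remain available for later repair and re-ingest, while the rest of the batch becomes searchable immediately.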

Status of Requests for Bush Records
• 28 requests for access as of March 17, 2009
• Primarily for paper records
    - NARA has responded using data about the paper records in the Records Management System
• A few requests were for digital photographs.
• Most requests were addressed using the two systems NARA set up under the Contingency Plan because processing of the records had not been completed at the time the requests were received.
• Three requests fulfilled using records on temporary ERA storage.

Plans for Further Development

What’s in Store for the Future?
• Increment 2
    - Preservation Framework
        o Introduction and use of a variety of tools for different preservation needs
    - Public access
        o Information about all types of records
        o Online access to electronic records
    - Initial system evolution
• Increments 3-5
    - Incremental enhancements in capability & capacity
    - Continuing system evolution
    - Governmentwide expansion
    - Full Lifecycle Management Plans
    - Appraisal case management and workflow
    - Search Framework supporting different tools
    - FOIA and other access case management
    - Review and redaction of sensitive content

ERA Functional View: Current Status
[Diagram labels: Agencies; White House; Base Instance; EOP Instance; Enterprise Service Bus; Shared Services: System Management, Help Desk, Network, Data Management]

ERA Functional View: Planned
[Diagram labels: Agencies; White House; Committees; Agencies; Base Instance; EOP Instance; Congressional Instance; Records Center Instance; Enterprise Service Bus; Shared Services: System Management, Network, Data Management, Help Desk, Preservation Framework, Public Access]
Legend: current capability, solid fill; future capability, hashed fill

ERA Instances
• Base Instance (June 2008)
    - Used by NARA and federal agencies
    - For management of all federal records
    - For transfer, inspection and management of federal electronic records
• EOP Instance (December 2008)
    - Used by NARA and Presidential Administrations
    - For transfer, inspection, and management of presidential electronic records
• Congressional Instance (future)
    - Used by NARA for Congressional Committees
    - For transfer, inspection, and management of congressional electronic records
• Federal Records Center Instance (future)
    - Used by NARA and other federal agencies
    - For transfer and storage of temporary and permanent federal electronic records that remain under the control of the originating agency

ERA Shared Services
• System Management
    - System operation and maintenance
    - Security
    - User account management
    - Deployment of new & updated software
    - Backup & other common services
• Help Desk (current)
    - Respond to technical questions and issues from users
• Network
    - Link to the Internet, NARANET (current)
    - Interfaces with other systems (future)
• Data Management
    - Data about records and transactions related to them
    - Description of NARA holdings
    - Review and redaction of records with restricted content
• Preservation Framework
    - Tools to overcome obsolescence of different digital formats
• Public Access
    - Search and retrieval of information about records, regardless of custody
    - Search and access to electronic records in NARA’s custody
    - Search and access to digitized records from NARA’s holdings
    - Freedom of Information Act for restricted records in NARA’s custody
Status labels: (current) (Increment 2) (future) (Inc. 2 +)

Advantages of the Instances & Shared Services Approach
• Instances enable different business rules and processes for different mission requirements:
    - Base Instance: Federal Records Act provisions on governmentwide records management and on the National Archives
    - EOP Instance: Presidential Records Act
    - Congressional Instance: House and Senate rules
    - Federal Records Center Instance: Federal Records Act provisions on storage of temporary and permanent records under originating agencies’ authority

Advantages of the Instances & Shared Services Approach
• Shared services maximize utilization of resources, reduce redundancy and provide a stable foundation for system growth and evolution over time.
• Shared services deliver capabilities and capacity wherever needed, regardless of differences in mission and business needs.
    - E.g., the Preservation Framework can be used to preserve any electronic records, regardless of whether they came from Congress, the White House or a federal agency.
    - E.g., a citizen seeking access to information will be able to find it using a single web portal, regardless of whether
        o it is information about records or in the records,
        o the records are in NARA’s physical custody,
        o the records are electronic or hard copy,
        o they originated in the White House, Congress or an agency.

Preservation
[Diagram: Electronic Record 1 … Electronic Record n pass through the Preservation Framework (Tool 1, Tool 2, … Tool n) to yield Electronic Record 1’, Electronic Record 2’, … Electronic Record n’. Requirements shown: Record Identity, Record Integrity, Original Order.]
The Preservation Framework supports the introduction and use of an arbitrary number and variety of processes under the control of archival requirements for authenticity.

Public Access
• Information about all records
    - From Records Schedules
    - Archival Descriptions
    - Other NARA information
• Online access to electronic records
• Online access to scanned versions of hard copy records
• Requests for copies of records
• Freedom of Information Act requests for restricted records
• Assistance from NARA staff

Increment 3 Work Status
• Authority to Proceed Issued for Early Analysis
    - Architectural Framework
    - Preservation examination and prototyping
    - Search Engine examination and selection
    - Open Access examination and selection
    - Enhancements to address authorized user defined changes and software defects not addressed at IOC
• Discussions begun on scope of work and technical details for full proposal
• Target date for award: 7/09

Governmentwide Expansion
• Initial Implementation June 2008 – June 2009
    - Four collaborating agencies
    - NARA staff proxy for other agencies
• Invitational Phase June 2009 – February 2010
    - Additional agencies by invitation
• Voluntary Phase February 2010 – December 2010
    - Additional agencies who volunteer and meet criteria
• Mandatory Phase January 2011
    - All agencies

The Development Timeline
[Timeline figure with axis marks at 9/05, 9/06, 9/07, 9/08, 9/09, 9/10 and 9/11. Labels: 6/08 ERA Base (Initial Operating Capability); Search & Access ERA; Public Access & Preservation Framework; Enhancement; Full Operating Capability; Operation & Maintenance]