Online research project announced to mark World Digital Preservation Day

7 Nov 2019 12:16 PM

Today we mark World Digital Preservation Day by announcing an online research week looking at ways to make our web-based technical registry PRONOM more accessible.

PRONOM currently holds information on almost 1,700 distinct file formats, including, for example, the different versions of the various Microsoft Word document types that have existed. However, almost 600 entries in the registry contain only a brief description of the format. Furthermore, 400 formats do not have a file signature. Signatures are based on the internal structure of the file and allow us to identify the format with a high degree of certainty.

A key step in preserving digital content for the future is understanding what file format the information was stored in so that we can then identify appropriate software to access that information. If the software is becoming hard to obtain, we can re-save the information in a new format for easier access. This identification is undertaken using tools such as our DROID (Digital Record Object Identification) software. These tools are integrated into all commonly available digital preservation systems.
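To illustrate the general idea of signature-based identification, here is a minimal Python sketch that compares the opening bytes of a file against a small table of well-known byte sequences. It is not how DROID is implemented and it does not read PRONOM data: the table, the informal format labels, and the function names `identify_bytes` and `identify_file` are all assumptions made for this example.

```python
from typing import Optional

# Illustrative signature table: (informal label, byte offset, expected bytes).
# These labels are NOT PRONOM identifiers, and real registry signatures can be
# far richer (multiple byte sequences, variable offsets, end-of-file markers).
SIGNATURES = [
    ("PDF", 0, b"%PDF-"),
    ("PNG", 0, b"\x89PNG\r\n\x1a\n"),
    ("GIF", 0, b"GIF87a"),
    ("GIF", 0, b"GIF89a"),
    ("ZIP container", 0, b"PK\x03\x04"),
    ("OLE2 compound file", 0, b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"),
]


def identify_bytes(data: bytes) -> Optional[str]:
    """Return the label of the first signature that matches, or None."""
    for label, offset, magic in SIGNATURES:
        if data[offset:offset + len(magic)] == magic:
            return label
    return None


def identify_file(path: str) -> Optional[str]:
    """Read only as many bytes as the longest signature needs, then match."""
    needed = max(offset + len(magic) for _, offset, magic in SIGNATURES)
    with open(path, "rb") as handle:
        return identify_bytes(handle.read(needed))
```

A match on a container signature only narrows the possibilities. Newer Word documents, for example, are packaged inside ZIP containers, so telling them apart from other ZIP-based files requires looking further into the file.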

Digital preservation staff at The National Archives have been involved in file format research for almost 20 years and PRONOM has been publicly available for 15 years. The virtual research week will focus on improving the PRONOM file format registry and be coordinated through a GitHub repository and our online PRONOM discussion group.

We would particularly welcome contributions that help us to fill out online file format entries, perhaps adding links to the software company that created the format, providing sample files for different formats, or helping to develop format signatures. The research week will be held between Monday 18 and Friday 22 November.

For more details on contributing, please visit the GitHub repository. More information on PRONOM is available on the PRONOM web pages.