National Archives

PRONOM database expands thanks to international partnership

The National Archives' award-winning PRONOM service, an online registry of file formats which is crucial to digital preservation, has significantly expanded its database thanks to a successful partnership with the Georgia Tech Research Institute and the National Archives and Records Administration (NARA) in the United States.

How does PRONOM work?

PRONOM is an online database containing details of more than 750 different digital file formats. Along with the DROID file format identification tool, which uses the database, PRONOM enables digital archivists, records managers and anyone using the tool to find out what files they have, in which formats and how best to ensure their long-term preservation.

The DROID tool scans a computer or hard drive and identifies each file either through its file extension (for example .doc for Word files) or by matching the file's internal signature against specific entries in the PRONOM database. Internal signatures are a far more accurate way of identifying file formats, as extensions can be easily changed or deleted.
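The difference between the two approaches can be illustrated with a minimal Python sketch. It is not DROID's implementation and does not use PRONOM data: the two lookup tables, the function names and the rule of preferring a signature match over an extension match are all assumptions made for illustration. The byte sequences are the well-known opening bytes of PDF, PNG and GIF files; real PRONOM signatures can be considerably more elaborate than a fixed prefix.

```python
import os
from typing import Optional, Tuple

# Hypothetical, hand-written tables for illustration only; not PRONOM data.
# Each key is a byte sequence expected at the very start of the file.
SIGNATURES = {
    b"%PDF-": "PDF document",
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"GIF87a": "GIF image",
    b"GIF89a": "GIF image",
}

EXTENSIONS = {
    ".pdf": "PDF document",
    ".png": "PNG image",
    ".gif": "GIF image",
    ".doc": "Word document",
}


def identify_by_signature(data: bytes) -> Optional[str]:
    """Return a format name if the content starts with a known signature."""
    for magic, name in SIGNATURES.items():
        if data.startswith(magic):
            return name
    return None


def identify_by_extension(filename: str) -> Optional[str]:
    """Return a format name based only on the file extension."""
    _, ext = os.path.splitext(filename)
    return EXTENSIONS.get(ext.lower())


def identify(filename: str, data: bytes) -> Tuple[str, str]:
    """Prefer the internal signature; fall back to the extension.

    Returns (format name, method used).
    """
    by_signature = identify_by_signature(data)
    if by_signature is not None:
        return by_signature, "signature"
    by_extension = identify_by_extension(filename)
    if by_extension is not None:
        return by_extension, "extension"
    return "unknown", "none"


if __name__ == "__main__":
    # A PDF that has been renamed to .doc is still recognised by its content.
    print(identify("report.doc", b"%PDF-1.4\n..."))
    # With no known signature, only the (less reliable) extension is left.
    print(identify("notes.doc", b"plain text content"))
```

In this sketch a PDF renamed to report.doc is still reported as a PDF, because the content is checked before the name. A file with no recognised signature falls back to its extension, which is exactly the case where a changed or deleted extension would mislead the tool.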

Currently around a third of the more than 750 entries in the PRONOM database have internal signatures. The more signatures there are in the database, the more accurate the DROID tool is at identifying files, and the more useful it is to the digital preservation community. Positively identifying file formats is the first step to ensuring their long-term preservation, since it enables digital archivists to identify older file formats which may be in danger of becoming obsolete. It also allows them to develop migration strategies and preservation tools to deal with them.

Ongoing collaboration

The research work done by the Georgia Tech Research Institute, at the request of NARA's National Archives Center for Advanced Systems and Technologies (NCAST), has increased the number of internal signatures in the PRONOM database by almost a quarter. The ongoing collaboration with The National Archives in the UK means that over 50 signatures have been added to the database this month, with more expected next year.

Kenneth Thibodeau, Director of the Center for Advanced Systems and Technologies at the National Archives and Records Administration, commented: 'In PRONOM/DROID, The National Archives of the UK has responded to an essential need for preserving and providing sustained access to valuable digital information. We are happy to be able to contribute to enhancing a tool that we use in NARA's Electronic Records Archives system. This helps us and also benefits anyone who needs to preserve digital assets.'

David Thomas, Director of Technology at The National Archives, said: 'We are grateful to NARA and Georgia Tech Research Institute for the work they have recently undertaken on file format research. The decision to share their work with The National Archives here has significantly improved the PRONOM database and will be of enormous benefit to the wider digital preservation community.'

The history of PRONOM

The first version of PRONOM was developed by The National Archives' Digital Preservation department for internal use in March 2002 and was launched as a free online service to the public in February 2004.

In 2007 The National Archives won the prestigious Digital Preservation Award for its development of the PRONOM and DROID tools, in recognition of its significant contribution to digital preservation.

DROID and PRONOM have been incorporated into a number of archival digital repository systems, including those used by the national archives of Estonia, Finland, Austria and Switzerland.

DROID has also been used by a number of UK government departments as part of their information assets and records management processes, with the assistance of The National Archives' Digital Continuity team.

The future of PRONOM

2011 will see the release of PRONOM data in a linked, open format. This will make it easier for others to reuse the data, and provide the means to extend and develop the dataset. Find out more on our Labs site.
