GeoPlace

Working with third-parties to enrich data quality

Blog posted by: Andrea Bollella, Priority Third Party Data Manager at GeoPlace, 31 March 2023.

For over a decade, GeoPlace has been the central hub that collects all addressing and street information in one place. With millions of updates received monthly from more than 500 authorities in England and Wales, additional sources of information, intended only as validation intelligence, have helped the National Address Gazetteer (NAG) and the National Street Gazetteer (NSG) to thrive even further. The NAG isn’t available as a product; rather, it’s the ‘raw data’ that is transformed into the AddressBase range of products and made widely available by Ordnance Survey.

For those of you who are unfamiliar with the NAG and the NSG, in the simplest of terms they are the digital directories where all addressing and street information is collated. In the good old days, we had books in which all roads and addresses were kept. Below is an example of what the NAG/NSG used to look like, from the lovely town of Woking, Surrey, where I was an authority address custodian.

Historic road and address data

Local authorities around the UK have a statutory obligation to name streets and number buildings so that services can be provided to their citizens. They also assign an identification number so that information can be provided much faster in the digital world we live in. The UPRN (Unique Property Reference Number) and the USRN (Unique Street Reference Number) are the keys against which various information about addresses and streets is held.

The most prominent of this information is the primary, secondary and tertiary classification. Each building can be identified by its primary or intended purpose. For example, a house can be classified as a semi-detached Residential Dwelling, and it is assigned a code (e.g., RD03) which is stored against its UPRN. The same happens for a street, which can be classified as a Motorway, for example, and is assigned a code (e.g., M) which is stored against its USRN.
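As a rough illustration of how a code such as RD03 can carry all three levels, here is a minimal Python sketch. It assumes that the first character is the primary level, the first two characters are the secondary level, and the full code is the tertiary level. That layout, and the names `ClassificationLevels` and `split_classification`, are assumptions made for this example and not a description of GeoPlace's own tooling.

```python
from typing import NamedTuple, Optional


class ClassificationLevels(NamedTuple):
    primary: str
    secondary: Optional[str]
    tertiary: Optional[str]


def split_classification(code: str) -> ClassificationLevels:
    """Split a classification code into primary/secondary/tertiary prefixes.

    Assumption: each level extends the previous one, so "RD03" yields
    "R" (primary), "RD" (secondary) and "RD03" (tertiary).
    """
    code = code.strip().upper()
    if not code:
        raise ValueError("empty classification code")
    primary = code[:1]
    secondary = code[:2] if len(code) >= 2 else None
    tertiary = code if len(code) > 2 else None
    return ClassificationLevels(primary, secondary, tertiary)


levels = split_classification("RD03")
print(levels.primary, levels.secondary, levels.tertiary)
```

A single-letter code such as M simply has no secondary or tertiary level under this assumption.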

These elements are considered a priority for inclusion in the gazetteer by users of the data because they allow better services to be provided. These users include emergency services, public and private organisations, and many more. It is therefore important that these elements are captured accurately and have the correct classification.

There are also other organisations that hold data relevant to their own operations. For example, the Care Quality Commission has its own register of medical establishments, as it regulates all health and social care services in England. The Gambling Commission, which keeps a public register of all licensed premises in Great Britain, is another example. So is the Department for Transport, which is home to the National Public Transport Access Nodes (NaPTAN) and the National Public Transport Gazetteer (NPTG) datasets, and there are many others.

GeoPlace has included these additional data sources within the NAG to complement the work undertaken by local authorities, and there is positive evidence that cross-checking data against the NAG or the NSG has benefited users of the data in many ways. We consume data from a variety of sources for this work, including:

Working with third parties to enrich data quality

Why do we need third-party data?

To make the addressing data richer and more valuable, and to give users of the data a high level of accuracy. Data accuracy enables better decision-making: if data quality is high, users will be able to produce better outputs. This is especially evident when emergency services are responding to an emergency call. Knowing the type of building they are approaching may assist in vital procedures to get the best outcome. Sometimes it could be the factor that determines life or death.

How is GeoPlace analysing the data?

We identify all the addressing and street information provided to us from the data sources, and we perform a matching exercise against the NAG/NSG. This allows us to see the match rate. In most cases, if the data source fits certain criteria, we get a match rate between 75% and 92%. This is possible as long as the data structure contains the following:

  • house number or name
  • street name
  • post town and/or locality
  • and a postcode.
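To make the idea of a matching exercise and a match rate concrete, here is a minimal Python sketch built on the four address elements listed above. It uses a simple exact match on normalised text. The real matching process is not described in this post and is likely to be more sophisticated, so treat this as an illustration only. All function names, the in-memory `gazetteer` dictionary, the sample addresses and the UPRN value are invented for the example.

```python
import re
from typing import Dict, List, Optional, Tuple

AddressKey = Tuple[str, str, str, str]


def normalise(text: str) -> str:
    """Upper-case the text and collapse punctuation and whitespace to single spaces."""
    return re.sub(r"[^A-Z0-9]+", " ", text.upper()).strip()


def address_key(number_or_name: str, street: str, town: str, postcode: str) -> AddressKey:
    """Build a comparable key from the four address elements."""
    return (
        normalise(number_or_name),
        normalise(street),
        normalise(town),
        normalise(postcode).replace(" ", ""),  # ignore spacing inside postcodes
    )


def match_records(
    records: List[Tuple[str, str, str, str]],
    gazetteer: Dict[AddressKey, int],
) -> List[Optional[int]]:
    """Return the matched UPRN for each record, or None when there is no match."""
    return [gazetteer.get(address_key(*record)) for record in records]


def match_rate(matches: List[Optional[int]]) -> float:
    """Share of records that found a UPRN (0.0 for an empty input)."""
    if not matches:
        return 0.0
    return sum(m is not None for m in matches) / len(matches)


# Invented sample data: one gazetteer entry and two third-party records.
gazetteer = {address_key("12A", "Example Road", "Exampleton", "ZZ1 1ZZ"): 100000000001}
third_party = [
    ("12a", "example road.", "EXAMPLETON", "zz11zz"),
    ("99", "Nowhere Lane", "Exampleton", "ZZ1 1ZZ"),
]
matches = match_records(third_party, gazetteer)
print(matches, match_rate(matches))
```

Normalising case, punctuation and postcode spacing before comparing is what lets the first record match despite being written differently from the gazetteer entry.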

Through a series of matching exercises, we can identify the UPRNs/USRNs associated with the data and analyse the classifications. There are three possible outcomes:

  • If the matches provide the correct UPRNs (for the specific addresses) with the correct primary, secondary and tertiary classifications, then we move the datasets towards the final product.
  • If the matches provide the correct UPRNs (for the specific addresses) but the primary, secondary and tertiary classifications are not what we expect them to be, then we refer them to the relevant local authority for additional feedback or to make sure the classification is amended as required.
  • If we are unable to match a UPRN (for the specific address), then we ask the relevant local authority to establish why the address might be missing, although this last scenario is quite rare.
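The three outcomes described above can be sketched as a small decision function. This is a simplified Python illustration and not GeoPlace's actual workflow. The `Outcome` names, the `triage` function and its parameters are hypothetical, and a single classification code stands in for the full primary, secondary and tertiary check.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    ACCEPT = "move the dataset towards the final product"
    REFER_CLASSIFICATION = "refer the classification to the local authority"
    REFER_MISSING_ADDRESS = "ask the local authority why the address is missing"


def triage(
    matched_uprn: Optional[int],
    gazetteer_classification: Optional[str],
    expected_classification: str,
) -> Outcome:
    """Decide what happens to one third-party record after matching."""
    if matched_uprn is None:
        # No UPRN found for the address (described above as quite rare).
        return Outcome.REFER_MISSING_ADDRESS
    if gazetteer_classification != expected_classification:
        # Correct UPRN, but the classification is not what was expected.
        return Outcome.REFER_CLASSIFICATION
    # Correct UPRN and correct classification.
    return Outcome.ACCEPT


# Hypothetical usage with an invented UPRN and classification codes.
print(triage(100000000001, "RD03", "RD03").value)
```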

Third party data life cycle

To highlight the importance of correct classification for UPRNs, especially during the COVID-19 pandemic, we have put together an internal case study based on primary, secondary and tertiary classification. The work that was undertaken to support the COVID-19 response has left a longer legacy of connected data that can be reused in the future.

We looked at the number of UPRNs classified as care homes per local authority and the number of residential units per care home in England and Wales. After a thorough analysis of the NAG, we found that Birmingham has the most, with 959 UPRNs classified as care homes, while the City of Westminster, for example, has only 17.
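A count like this can be illustrated with a short Python sketch that tallies UPRNs per local authority for a given classification code. The row layout, the function name and the `care_home_code` parameter are assumptions made for the example. The code that actually identifies a care home should be taken from the classification scheme itself, and the rows below are invented rather than real NAG data.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Each row: (local authority name, UPRN, classification code)
Row = Tuple[str, int, str]


def care_homes_per_authority(rows: Iterable[Row], care_home_code: str) -> List[Tuple[str, int]]:
    """Count UPRNs carrying the care home code, per local authority, largest first."""
    counts = Counter(
        authority for authority, _uprn, code in rows if code == care_home_code
    )
    return counts.most_common()


# Invented rows; "X1" is a placeholder, not a real classification code.
rows = [
    ("Authority A", 1, "X1"),
    ("Authority B", 2, "X1"),
    ("Authority A", 3, "X1"),
    ("Authority A", 4, "Y2"),
]
print(care_homes_per_authority(rows, care_home_code="X1"))
```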

Number of care homes per local authority, and number of residential units per care home


GeoPlace will continue to investigate potential third-party data suppliers and work with local authority Custodians to ensure their data is included in the NAG and flows through to all users via the NSG and the AddressBase range of products.

 

Channel website: https://www.geoplace.co.uk

Original article link: https://www.geoplace.co.uk/blog/2023/working-with-third-parties-to-enrich-data-quality
