Enhancing Business Register: Integrating Traditional and Alternative Data Sources through Record Linkage Methods

Conference: ICES VII
06/18/2024: 3:40 PM - 4:00 PM BST
Topic Contributed Session 


Traditional data sources in official statistics production, such as surveys, are being complemented or, in some cases, replaced by new alternative data sources. These alternative sources include scanner data, web-scraped data, and new registers. Using such alternative data sources often requires linking them to more traditional sources. Linking observations of the same unit across data sources normally relies on a unique identifier for that unit. When no such identifier is available, probabilistic record linkage methods can be used instead.

Statistics Finland has recently needed to apply these methods in numerous cases, such as combining house sales advertisement data with transfer tax data, and the Energy Certificate register with the Building Register. Drawing on our experiences in these cases, we have expanded the use of these methods to new applications.

One practical application of record linkage methods is linking enterprises in the Business Register to their webpage domain names, which are obtained from the Finnish Transport and Communications Agency (Traficom). This presents many challenges. In particular, domain names in the Traficom data are often associated with parent companies, so each domain name has to be paired with the correct company within the parent entity.

The linkage was done by comparing enterprise and establishment names with the domain names using different string comparison methods. The lack of sufficient auxiliary variables that could provide more information for the linkage proved to be a big challenge. Additionally, a few problematic cases were identified very quickly: cooperatives, housing companies and building superintendents.
Despite our efforts, the results thus far have been less than promising. However, the work is still ongoing, and different methods and data sources are being explored to enhance the Business Register in Finland.
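
As an illustration of the name-to-domain comparison described above, the following is a minimal sketch and not the method actually used at Statistics Finland. It assumes that domain names arrive in the form "label.fi", that stripping a short, hypothetical list of Finnish legal-form abbreviations is an acceptable normalisation, and that the similarity ratio from Python's standard difflib module stands in for the unspecified string comparison methods. The function names, the threshold of 0.8 and the example enterprise "Esimerkki Oy" are all invented for illustration.

```python
import re
import unicodedata
from difflib import SequenceMatcher

# Hypothetical, incomplete list of legal-form abbreviations to ignore.
LEGAL_FORMS = {"oy", "oyj", "ab", "ky", "osk"}


def normalize_name(name: str) -> str:
    """Lowercase, strip diacritics and legal forms, and join the tokens."""
    text = unicodedata.normalize("NFKD", name.lower())
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    tokens = [t for t in re.split(r"[^a-z0-9]+", text)
              if t and t not in LEGAL_FORMS]
    return "".join(tokens)


def domain_label(domain: str) -> str:
    """Keep the part before the first dot, without hyphens (assumes 'label.fi')."""
    label = domain.lower().split(".")[0]
    return re.sub(r"[^a-z0-9]", "", label)


def similarity(name: str, domain: str) -> float:
    """Similarity in [0, 1] between a normalised name and a domain label."""
    return SequenceMatcher(None, normalize_name(name), domain_label(domain)).ratio()


def best_domain(name: str, domains: list[str], threshold: float = 0.8):
    """Return (domain, score) for the best candidate, or None below the threshold."""
    if not domains:
        return None
    score, domain = max((similarity(name, d), d) for d in domains)
    return (domain, score) if score >= threshold else None
```

A single similarity score like this cannot separate subsidiaries that share a parent company's domain, which is one reason auxiliary variables matter for this kind of linkage.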


Ville Auno, Statistics Finland


Katja Löytynoja, Tilastokeskus