Address coverage and data quality per country
The International Address Checker draws on international address data from various sources in each country of origin, such as government agencies and local postal authorities. Address coverage and data quality therefore vary by country.
Address coverage indicates, per country, the percentage of all addresses that the International Address Checker holds. On its own, this figure does not determine how many address validations succeed. A country with relatively low address coverage can yield more successful validations than a country with relatively high coverage. This depends on factors such as the degree of urbanization, the state of the infrastructure and the general wealth of a country. For more insight into this, please contact us.
Data quality refers to the accuracy of the data: the data must be reliable before conclusions can be drawn from it. We assess this using the following five criteria, each of which has its own weighting factor:
Quality of the source
Ideally, all necessary address data is available from one official source, such as the government or a similar public agency. In practice, this is not always the case, for example when data comes from multiple providers or when the source is not a government agency.
Data parsing requirements
It may be necessary to structure the raw data. We achieve this through parsing, normalization and cleaning. With parsing we split combined elements into individual address elements such as street, house number and house number addition. With normalization and cleaning we apply consistent wording, for example by standardizing abbreviations and correcting spelling mistakes.
In some cases a dataset needs to be enriched with missing data, such as a street name that is absent from an individual row or latitude and longitude coordinates for an entire address dataset.
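As an illustration, parsing and normalizing a single address line could look like the minimal Python sketch below. It is not the International Address Checker's actual implementation. The input format (street, then house number, then an optional addition), the abbreviation table and all names such as `parse_address_line` are assumptions made for this example; real rules differ per country.

```python
import re
from typing import NamedTuple, Optional


class ParsedAddress(NamedTuple):
    street: str
    house_number: int
    addition: Optional[str]


# Illustrative abbreviation table (assumption); real tables are country-specific.
ABBREVIATIONS = {"st.": "Street", "ave.": "Avenue", "rd.": "Road"}

# Assumed layout: street name, house number, optional addition such as "B".
ADDRESS_LINE = re.compile(
    r"^\s*(?P<street>.+?)\s+(?P<number>\d+)\s*[-/ ]?\s*(?P<addition>[A-Za-z0-9]+)?\s*$"
)


def normalize_street(street: str) -> str:
    """Normalization: expand known abbreviations to one uniform spelling."""
    words = [ABBREVIATIONS.get(word.lower(), word) for word in street.split()]
    return " ".join(words)


def parse_address_line(line: str) -> Optional[ParsedAddress]:
    """Parsing: split a combined line into street, house number and addition."""
    match = ADDRESS_LINE.match(line)
    if match is None:
        return None  # the line does not fit the assumed layout
    return ParsedAddress(
        street=normalize_street(match["street"]),
        house_number=int(match["number"]),
        addition=match["addition"],
    )
```

Under these assumptions, `parse_address_line("Station St. 12-B")` yields the street `Station Street`, house number `12` and addition `B`. A line that does not fit the assumed layout returns `None` and would need separate handling.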
Data completeness score
This score reflects the number of address elements available within a dataset and their importance for optimal delivery and address validation.
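One way to picture such a score is as a weighted share of filled address elements. The Python sketch below is hypothetical: the element names, the weights in `ELEMENT_WEIGHTS` and the function `completeness_score` are assumptions chosen for illustration, not the weighting the International Address Checker actually uses.

```python
from typing import Mapping, Sequence

# Hypothetical importance weights per address element (assumption).
ELEMENT_WEIGHTS = {
    "street": 3.0,
    "house_number": 3.0,
    "postal_code": 2.0,
    "city": 2.0,
    "latitude": 0.5,
    "longitude": 0.5,
}


def completeness_score(records: Sequence[Mapping[str, object]]) -> float:
    """Return the weighted share of filled elements, from 0.0 to 1.0."""
    if not records:
        return 0.0
    total_weight = sum(ELEMENT_WEIGHTS.values())
    filled_weight = 0.0
    for record in records:
        for element, weight in ELEMENT_WEIGHTS.items():
            value = record.get(element)
            # An element counts as available when it is present and not blank.
            if value is not None and str(value).strip() != "":
                filled_weight += weight
    return filled_weight / (total_weight * len(records))
```

In this sketch, a missing street lowers the score far more than missing coordinates, which mirrors the idea that some elements matter more for delivery and validation than others.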
Business vs residential address score
This score indicates the extent to which a dataset flags each address as a residential address, a business address, or both.