ArcGIS Blog

Analytics

Geocoding: Delivering High Location Accuracy

By patel_nick

Geocoding is a fundamental GIS process for plotting your data on the map. It is often the first step in applying GIS to understand the where. The accuracy of the plotted points directly affects the success of downstream decisions. Regardless of the GIS application you choose, you will make decisions based on analysis, and that analysis is only as reliable as the geocoded locations behind it.

To ensure geocoding accuracy, consider these five questions:

  1. How do you define accuracy and why does it matter?
  2. What are the benefits of using a solution built with authoritative, commercial street data versus a low-cost or free solution that uses OpenStreetMap and Census TIGER data?
  3. Which match levels are supported by your current geocoding solution, and what is the benefit of cascading to get the best match at the highest accuracy?
  4. What is the difference between an Address point, a Delivery point, and an Address Range match?
  5. What does the status ‘Match, Tie, or Unmatch’ mean and what does the score indicate?

Defining Accuracy

Accuracy and precision are often confused, and the two terms are sometimes incorrectly treated as interchangeable.

Here are the correct definitions. Accuracy is used to describe the closeness of a measurement to a true value. Precision is the closeness of agreement among a set of results. In the graphic below, you can see the differences between each.

This diagram shows the difference between Accuracy and Precision. Geocoding matches can be highly precise and highly accurate, inaccurate and imprecise, or anywhere in between. Image Credit: http://www.antarcticglaciers.org/

Depending on your solution, geocoding results can fall anywhere on this spectrum, from highly accurate and precise to inaccurate and imprecise. Accurate and precise results are essential for many use cases, as discussed in our previous blog, and they greatly affect the decisions that stakeholders make based on downstream analysis. Many factors influence accuracy and precision, as we will discuss.
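
The two definitions can be made concrete with a small calculation. The following is a minimal sketch, assuming you have a surveyed "true" location for an address and several geocoded results for it. The function names and the choice of mean distance as the metric are illustrative assumptions, not part of any Esri product.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def accuracy_and_precision(true_pt, results):
    """Return (accuracy_m, precision_m) for a list of (lat, lon) results.

    accuracy_m:  mean distance from each result to the true location
                 (closeness to the true value; smaller is more accurate).
    precision_m: mean distance from each result to the centroid of the results
                 (closeness of agreement; smaller is more precise).
    """
    n = len(results)
    accuracy_m = sum(haversine_m(*true_pt, *p) for p in results) / n
    centroid = (sum(p[0] for p in results) / n, sum(p[1] for p in results) / n)
    precision_m = sum(haversine_m(*centroid, *p) for p in results) / n
    return accuracy_m, precision_m

# Hypothetical data: three results that agree with each other (precise)
# but sit roughly 111 m north of the true location (inaccurate).
true_location = (34.0, -117.0)
clustered_but_offset = [(34.001, -117.0)] * 3
print(accuracy_and_precision(true_location, clustered_but_offset))
```

A tight cluster far from the true location yields a small precision value and a large accuracy value, which is the "precise but inaccurate" case in the diagram.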

Street Reference Datasets and Accuracy

To match an address record to a spatial coordinate, you need a street reference dataset that contains addresses and their associated coordinates. The software uses this dataset as the reference for matching your input address to a coordinate. There are three street reference data options:

  • Publicly available sources such as US Census TIGER and local or national government addressing data.
  • A commercial street dataset such as those from HERE and TomTom.
  • A street dataset assembled through contributions by the vendor’s own user community (e.g. Google).

Each of these sources has pros and cons related to accuracy. How do you determine which is right for you?

This table shows the pros and cons of each data source.

Cascade Matching

The next aspect of accuracy and precision concerns the software and its ability to leverage data at multiple levels of precision.

The most accurate and precise geocode you can return is at the address point (rooftop) or delivery point location. The benefit of a geocoding solution using a commercial dataset is that you get access to multiple levels of precision to match, including address points.

Another benefit is that software vendors who leverage commercial datasets typically also allow users to match at different levels of precision. This feature is called cascade matching, and it keeps results reliable even when the input is incomplete. For example, if the user submits an address without the building number, the software will recognize that the number is missing. Instead of returning an unreliable, inaccurate match at the address point level, the software will cascade down to a street name centroid match, which is less precise but more reliable.

This diagram shows the different levels of geographic precision. Geocoding software vendors often have logic that allows the input address to match at the higher level of precision first and if a reliable match cannot be found, the query falls back to the next best precision.
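
The fallback logic can be sketched in a few lines. This is a simplified illustration rather than any vendor's implementation: the level names, the `cascade_geocode` function, and the toy `REFERENCE` index with made-up coordinates are all hypothetical, and the interpolated level is omitted for brevity.

```python
# Precision levels ordered from most to least precise.
LEVELS = ["address_point", "street_centroid", "postal_centroid", "city_centroid"]

# Toy reference index with made-up coordinates: level -> {lookup key: (x, y)}.
REFERENCE = {
    "address_point": {("240", "A Street", "Springfield"): (10.0, 20.0)},
    "street_centroid": {("A Street", "Springfield"): (12.0, 20.0)},
    "postal_centroid": {("00000",): (15.0, 25.0)},
    "city_centroid": {("Springfield",): (30.0, 40.0)},
}

def cascade_geocode(address, reference):
    """Try each precision level in order and return the first match found.

    address: dict with optional keys 'number', 'street', 'postal', 'city'.
    """
    keys = {
        "address_point": (address.get("number"), address.get("street"), address.get("city")),
        "street_centroid": (address.get("street"), address.get("city")),
        "postal_centroid": (address.get("postal"),),
        "city_centroid": (address.get("city"),),
    }
    for level in LEVELS:
        key = keys[level]
        if None in key:
            # A required component is missing, so skip this level
            # rather than guess at a match.
            continue
        point = reference.get(level, {}).get(key)
        if point is not None:
            return {"level": level, "point": point}
    return {"level": None, "point": None}

# An address without a building number falls back to the street centroid.
print(cascade_geocode({"street": "A Street", "city": "Springfield"}, REFERENCE))
```

Returning the matched level alongside the coordinate matters as much as the fallback itself, because downstream analysis needs to know how precise each point is.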

Address Point, Delivery Point, and Interpolation

With commercial street datasets, there are usually two points that are returned as part of an address match:

  • Address point (display x, display y), or Display Point, which is used for centering the map around the location and for other high location accuracy use cases. These point locations are often centered on the rooftop or the parcel centroid.
  • Road Adjusted point (x, y), or Routing Point, which is used for routing and network analysis use cases.

Both of these points provide highly accurate matches and cannot be obtained consistently from TIGER or crowd-sourced data.

In addition to highly accurate and precise matches, commercial street datasets also provide Street Address Range data, including street centerlines with range information. The software uses this to interpolate, or approximate, the match location along a ranged street segment. While less accurate than an Address Point match, an interpolated match still delivers high location accuracy (within 10 to 50 meters), which meets most use cases in the market.

This diagram shows an input address, ‘240 A Street’ on a street range of 200-299, and shows the Address Point (Display Point) Match, Range Interpolated Match, and Routing Point Match. Often, the distance between the Range Interpolated point and the Routing point is within 10 to 50 meters.
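
Range interpolation can be illustrated with the '240 A Street' example. The sketch below assumes a straight street segment and simple linear interpolation between its endpoints. Production geocoders generally also handle odd and even sides of the street, curved geometry, and setback offsets, none of which are modeled here. The function name and the segment coordinates are hypothetical.

```python
def interpolate_address(number, range_low, range_high, start_xy, end_xy):
    """Estimate a point for a house number along a straight street segment.

    The segment runs from start_xy (where range_low sits) to end_xy
    (where range_high sits); the number is placed proportionally between them.
    """
    if not range_low <= number <= range_high:
        raise ValueError("house number is outside the segment's address range")
    span = range_high - range_low
    # Fraction of the way along the segment; use the midpoint for a one-number range.
    t = 0.5 if span == 0 else (number - range_low) / span
    x = start_xy[0] + t * (end_xy[0] - start_xy[0])
    y = start_xy[1] + t * (end_xy[1] - start_xy[1])
    return x, y

# '240 A Street' on a 200-299 range, with a made-up segment 99 m long
# in projected coordinates: the point lands about 40 m from the start.
print(interpolate_address(240, 200, 299, (0.0, 0.0), (99.0, 0.0)))
```

Because the estimate assumes addresses are evenly spaced along the segment, the interpolated point can differ from the actual rooftop, which is why this level is less accurate than an Address Point match.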

Match Quality Indicators

Along with the matched coordinate, a good geocoding solution also reports how accurate the match is and how confident it is in the precision of that match. A match without these indicators is often unreliable for analysis or decision making. A good solution should indicate:

  1. Match status: Whether the input matched a coordinate in the database, whether there are multiple tied candidates at the same level of precision that require user review, or whether the input was unmatched.
  2. Match score: For records that are matched or tied, a Confidence score or Match score that expresses the confidence in the precision of the match.
  3. Accuracy of the match: Whether the match was made at the address point, interpolated point, street centroid, postal centroid, or city centroid.

This is a geocode output table after running a series of addresses through the Esri World Geocoding capability. The Match_addr, Score, and Addr_type fields are appended to the input address fields. Additional fields, such as the Match Status, are also appended but are not shown in this consolidated view.
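
These indicators are most useful when they drive what happens to each record next. The following is a rough sketch of triaging geocode output by status, score, and match type. The `Match_addr`, `Score`, and `Addr_type` field names follow the table above; the `Status` codes (`M`, `T`, `U` for Match, Tie, Unmatch), the `Addr_type` values, the score threshold of 90, and the sample rows are assumptions made for illustration, so check them against your own geocoder's output before relying on them.

```python
# Made-up sample rows shaped like a geocode output table.
ROWS = [
    {"Match_addr": "240 A Street", "Score": 100, "Addr_type": "PointAddress", "Status": "M"},
    {"Match_addr": "A Street", "Score": 92, "Addr_type": "StreetName", "Status": "M"},
    {"Match_addr": "12 B Street", "Score": 85, "Addr_type": "StreetAddress", "Status": "T"},
    {"Match_addr": "", "Score": 0, "Addr_type": "", "Status": "U"},
]

def triage(rows, min_score=90, trusted_types=("PointAddress", "StreetAddress")):
    """Split rows into (accepted, review, rejected) lists.

    accepted: matched, score at or above min_score, and a trusted match type.
    review:   tied, low-scoring, or matched at a coarser level than required.
    rejected: unmatched.
    """
    accepted, review, rejected = [], [], []
    for row in rows:
        if row["Status"] == "U":
            rejected.append(row)
        elif (
            row["Status"] == "T"
            or row["Score"] < min_score
            or row["Addr_type"] not in trusted_types
        ):
            review.append(row)
        else:
            accepted.append(row)
    return accepted, review, rejected

accepted, review, rejected = triage(ROWS)
print(len(accepted), len(review), len(rejected))
```

The right threshold and the set of trusted match types depend on the use case: a delivery application may accept only address point matches, while a regional summary may be well served by postal centroids.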

Geocoding location accuracy depends on many factors. Hopefully, by asking the questions outlined here, you will be able to choose a solution that best supports your use cases, analysis, and downstream decision making. Esri considers and incorporates such aspects into building its own Esri World Geocoding capabilities and products to support the work of users across the globe in all industries. To learn more about geocoding, please refer to our previous blog or visit our webpage.
