ArcGIS Blog

Geocode from R using {arcgisgeocode}

By Martha Bass and Josiah Parry

With the new package {arcgisgeocode}, which is part of the {arcgis} metapackage and the R-ArcGIS Bridge, you can now geocode directly from R! Geocoding is the process of identifying the location associated with an address. {arcgisgeocode} enables users to do this – and so much more – with functions for address candidate identification, batch geocoding, reverse geocoding, and autocomplete location suggestions. You can use the ArcGIS World Geocoding Service or even make use of custom locators created in ArcGIS Pro, ArcGIS Online, and ArcGIS Enterprise, as shown in the example below. Let’s take a quick look at how you can use {arcgisgeocode} to locate bike shops in Tacoma, Washington.

Install and load the package

In your integrated development environment (IDE) of choice, install and load {arcgisgeocode}:

# install arcgisgeocode
install.packages("arcgisgeocode")

# load the package
library(arcgisgeocode)

You will also need the {arcgisutils} package to authorize with an ArcGIS portal. If you have not authorized with {arcgisutils} before, follow the workflow in the linked documentation.

# load arcgisutils
library(arcgisutils)

# set the token for authorization
set_arc_token(auth_user())

Read in bike shop addresses

First, load a non-spatial table of bike shop locations. Add a field containing row numbers, which will serve as an ID for joining the inputs back to the geocoded results. When using geocode_addresses(), the join field is named result_id. For find_address_candidates(), the join field is input_id.

# load the bike shop addresses
csv <- "https://www.arcgis.com/sharing/rest/content/items/9a9b91179ac44db1b689b42017471ae6/data"
bikeshops <- readr::read_csv(csv) |>
  # add result_id field
  dplyr::mutate(
    result_id = dplyr::row_number()
  )
bikeshops
#> # A tibble: 10 × 3
#>    store_name                           original_address               result_id
#>    <chr>                                <chr>                              <int>
#>  1 Cascadia Wheel Co.                   3320 N Proctor St, Tacoma, WA…         1
#>  2 Puget Sound Bike and Ski Shop        between 3206 N. 15th and 1414…         2
#>  3 Takoma Bike & Ski                    3010 6th Ave, Tacoma, WA 98406         3
#>  4 Trek Bicycle Tacoma University Place 3550 Market Pl W Suite 102, U…         4
#>  5 Opalescent Cyclery                   814 6th Ave, Tacoma, WA 98405          5
#>  6 Sound Bikes                          108 W Main, Puyallup, WA 98371         6
#>  7 Trek Bicycle Tacoma North End        3009 McCarver St, Tacoma, WA …         7
#>  8 Second Cycle                         1205 M.L.K. Jr Way, Tacoma, W…         8
#>  9 Penny bike co.                       6419 24th St NE, Tacoma, WA 9…         9
#> 10 Spider's Bike, Ski & Tennis Lab      3608 Grandview St, Gig Harbor…        10

Access the custom locator

Since the goal is to map bike shops in Tacoma, let’s use the City of Tacoma’s custom locator for geocoding. (Be sure to review the City of Tacoma’s disclaimer first.)

# create a GeocodeServer object for the Tacoma custom locator
tacoma_gc <- geocode_server('https://gis.cityoftacoma.org/arcgis/rest/services/Locators/Tacoma_Geocoding_Service/GeocodeServer')
tacoma_gc
#> <GeocodeServer>
#> Version: 10.81
#> CRS: 2927
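
The CRS: 2927 line is an EPSG code for NAD83(HARN) / Washington South, measured in US survey feet, so coordinates returned by this locator are projected feet rather than longitude and latitude. If you need geographic coordinates, a minimal sketch of the conversion, assuming the {sf} package is installed and using an illustrative coordinate pair in that CRS, might look like this:

```r
library(sf)

# An illustrative point in EPSG:2927 (x and y in US survey feet)
pt_2927 <- st_sfc(st_point(c(1146741, 715552.6)), crs = 2927)

# Transform to WGS 84 longitude/latitude (EPSG:4326)
pt_4326 <- st_transform(pt_2927, 4326)

# Extract the coordinates as a matrix with columns X (longitude) and Y (latitude)
lonlat <- st_coordinates(pt_4326)
lonlat
```

The same st_transform() call works on a whole sf object, so it could be applied to geocoding results before mapping them in a tool that expects longitude and latitude.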

Geocode the bike shops

To derive locations from the bike shop addresses, pass the original_address column to the batch geocoding function, geocode_addresses(). Then, join the results back to the original shop names and addresses.

# geocode
candidates <- geocode_addresses(
  single_line = bikeshops$original_address,
  for_storage = TRUE,
  geocoder = tacoma_gc
)

# reformat geocoded output
candidates <- candidates |>
  # select the relevant geocoding result columns
  dplyr::select(c(result_id:addr_type, geometry)) |>
  # merge back to bikeshops using the shared ID column
  dplyr::left_join(bikeshops, by = "result_id") |>
  # rearrange columns for readability
  dplyr::relocate(store_name:original_address, .after = result_id)

# view the results
dplyr::glimpse(candidates)
#> Rows: 10
#> Columns: 11
#> $ result_id        <int> 1, 3, 4, 5, 7, 6, 2, 8, 10, 9
#> $ store_name       <chr> "Cascadia Wheel Co.", "Takoma Bike & Ski", "Trek Bicy…
#> $ original_address <chr> "3320 N Proctor St, Tacoma, WA 98407", "3010 6th Ave,…
#> $ loc_name         <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
#> $ status           <chr> "M", "M", "M", "M", "M", "U", "M", "M", "U", "M"
#> $ score            <dbl> 100.00, 100.00, 100.00, 100.00, 100.00, 0.00, 95.68, …
#> $ match_addr       <chr> "3320 N PROCTOR ST, Tacoma, WA, 98407", "3010 6TH AVE…
#> $ long_label       <chr> "3320 N PROCTOR ST, Tacoma, WA, 98407", "3010 6TH AVE…
#> $ short_label      <chr> "3320 N PROCTOR ST", "3010 6TH AVE", "3550 MARKET PL …
#> $ addr_type        <chr> "PointAddress", "PointAddress", "PointAddress", "Poin…
#> $ geometry         <POINT [US_survey_foot]> POINT (1146741 715552.6), POINT (1149922 706941.5), P…
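
Note that the result_id values in this output are not in input order (1, 3, 4, 5, 7, ...), which is why the results are joined on an ID rather than bound to the inputs by position. A minimal base R sketch with made-up data (the data frames and values below are hypothetical, not output from the service) shows the difference:

```r
# Hypothetical inputs with a row-number ID, as in the bike shop table
inputs <- data.frame(
  result_id = 1:3,
  store_name = c("Shop A", "Shop B", "Shop C")
)

# Hypothetical geocoding results that come back in a different order
results <- data.frame(
  result_id = c(1L, 3L, 2L),
  status = c("M", "U", "M")
)

# Joining on the ID pairs each result with the right input row
joined <- merge(results, inputs, by = "result_id")

# Binding by position would pair "Shop B" with the result for ID 3
misaligned <- cbind(inputs["store_name"], results["status"])

# Status attached to "Shop B" under each approach
joined$status[joined$store_name == "Shop B"]
misaligned$status[misaligned$store_name == "Shop B"]
```

In this toy example the join reports "Shop B" as matched, while the positional bind wrongly attaches the unmatched result that belongs to a different row.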

Assess the candidates

Geocoded results return a status and score for each location, which help with evaluating the success of the geocoding operation. The status of a location can be M (matched), U (unmatched), or T (tied). Let’s see if any bike shop addresses were unmatched or tied.

# check for any status other than matched
candidates |>
  dplyr::filter(status != "M") |>
  dplyr::glimpse()
#> Rows: 2
#> Columns: 11
#> $ result_id        <int> 6, 10
#> $ store_name       <chr> "Sound Bikes", "Spider's Bike, Ski & Tennis Lab"
#> $ original_address <chr> "108 W Main, Puyallup, WA 98371", "3608 Grandview St,…
#> $ loc_name         <chr> NA, NA
#> $ status           <chr> "U", "U"
#> $ score            <dbl> 0, 0
#> $ match_addr       <chr> NA, NA
#> $ long_label       <chr> NA, NA
#> $ short_label      <chr> NA, NA
#> $ addr_type        <chr> NA, NA
#> $ geometry         <POINT [US_survey_foot]> POINT EMPTY, POINT EMPTY
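
For larger batches, tallying the status codes can be quicker than scanning individual rows. A small sketch, using the ten status values shown in the earlier glimpse() output:

```r
# Status codes copied from the geocoding output shown earlier
status <- c("M", "M", "M", "M", "M", "U", "M", "M", "U", "M")

# Count how many addresses fall under each status code
status_counts <- table(status)
status_counts

# Share of addresses that were matched
match_rate <- mean(status == "M")
match_rate
```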

Two bike shop addresses were not matched to a location in Tacoma. Upon reviewing the locations, these correspond to bike shops that are outside Tacoma city limits. You can remove these locations from your results. Then, view the score values assigned to all of the matching locations.

# only use the matched locations
candidates <- candidates[candidates$status == "M", ]

# view the geocoding scores
candidates$score
#> [1] 100.00 100.00 100.00 100.00 100.00  95.68 100.00 100.00

The scores look good – a score of 100 indicates a perfect match, and the lowest score here is 95.68. You can move on to visualizing these results.
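
If you want to review weaker matches before publishing, one option is to flag scores under a cutoff. The sketch below uses the scores printed above; the cutoff of 98 is an arbitrary value chosen for illustration, not a recommendation from the package or the service:

```r
# Scores copied from the output above
score <- c(100.00, 100.00, 100.00, 100.00, 100.00, 95.68, 100.00, 100.00)

# Hypothetical cutoff for manual review (an assumption; tune it for your data)
review_cutoff <- 98

# Positions of matched results that fall below the cutoff
needs_review <- which(score < review_cutoff)
needs_review

# Lowest and highest score in the batch
range(score)
```

Here only the sixth matched result falls below the cutoff, so that is the one row worth comparing against its original address.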

Learn more about geocoding results here.

Visualize the results

Creating a hosted feature layer in ArcGIS Online is a great way to display and share these results with others. You can use {arcgislayers} to do this directly from R. Load the package and then pass the sf object, candidates, to the publish_layer() function.

# load arcgislayers
library(arcgislayers)

# publish the geocoding results as a hosted feature layer
res <- publish_layer(candidates, "Tacoma Bike Shops")
res$services$serviceurl
#> [1] "https://services1.arcgis.com/hLJbHVT9ZrDIzK0I/arcgis/rest/services/Tacoma Bike Shops/FeatureServer"
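
The service URL printed above contains spaces because of the layer title. If you need to pass it to a tool that expects a percent-encoded URL, base R's utils::URLencode() is one way to encode it. Whether a given client actually requires this is an assumption to check against that client's behavior:

```r
# Service URL as printed above (note the spaces in the service name)
service_url <- "https://services1.arcgis.com/hLJbHVT9ZrDIzK0I/arcgis/rest/services/Tacoma Bike Shops/FeatureServer"

# Percent-encode characters that are not valid in a URL, such as spaces
encoded_url <- utils::URLencode(service_url)
encoded_url
```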

Now you can explore the published data in ArcGIS Online. From there, you can create and share web maps and apps, such as an ArcGIS Dashboard.
