Almost every business or organization has a database of customers with addresses, a list of stores, place names, regions, and so on. To display this type of data on a map, we must transform the addresses into geographic coordinates. The ArcGIS World Geocoding Service provides high-quality geocoding for place names, addresses, and ZIP codes. You can call the geocoding service from ArcGIS products including ArcGIS Online, ArcGIS Pro, and ArcGIS API for Python. Users often ask me how to call the service directly from R. This is particularly relevant for those who use R in their day-to-day work and would like to integrate geocoding and spatial analysis into their R scripts.
As some of you may already know, the R-ArcGIS Bridge is the R integration for ArcGIS Pro that makes it possible to combine the spatial analysis and mapping functions of ArcGIS with the statistical modeling capabilities of R. The related R package developed by our R-ArcGIS Bridge team is called arcgisbinding. Over the past year, there have been quite a few enhancements to the arcgisbinding package. For example, R-ArcGIS Bridge now supports access to remote data in the cloud. You can also use R Notebooks inside an ArcGIS Pro Conda environment. My colleagues Orhun Aydin and Nicholas Giner wrote a nice blog article that covers some of the new features in arcgisbinding. One of the new features showcased is how to import ArcPy in an R Notebook, which allows you to run geoprocessing tools side by side with R code.
Today’s blog article focuses on RStudio users. I would like to show you an example of how to leverage the ArcGIS World Geocoding Service by importing the ArcGIS API for Python package in RStudio to batch geocode a list of addresses from a CSV file.
In this example, we will explore how to:
- Import ArcGIS API for Python in RStudio.
- Convert a list of addresses in a CSV file to coordinates by calling the ArcGIS World Geocoding Service, available through the ArcGIS API for Python library, from RStudio.
- Convert the geocoding results from R to a local geodatabase feature class using R-ArcGIS Bridge when further mapping and spatial analysis are desired in ArcGIS Pro.
The data to be geocoded are in a CSV file that contains 2,000 randomly selected US addresses. We will use the address information from the SingleLine column to batch geocode all addresses.
Setting Up
The process begins with setting up R-ArcGIS Bridge on a computer where R, RStudio, and ArcGIS Pro are installed. Note: ArcGIS API for Python is included with the ArcGIS Pro install. R-ArcGIS Bridge can be set up using the R-ArcGIS Support option, found on the Geoprocessing tab of the ArcGIS Pro Options dialog.
Install Required Packages and Libraries in RStudio
Once the R-ArcGIS Bridge is set up, we start RStudio and load the arcgisbinding package together with other packages. The function arc.check_product() binds the RStudio session to the ArcGIS Pro installation.
Notice that one of the R packages that we installed and imported into RStudio is reticulate. Reticulate is the R interface to Python. It includes facilities for calling Python from R such as importing Python modules and using Python interactively within an R session.
You may have several Python versions installed on your machine. Here we use the function use_python() to specify the path of the Python binary that ships with ArcGIS Pro.
After loading the reticulate package in RStudio, we can install the ArcGIS API for Python (“arcgis”) into a conda environment using conda_install().
Once it is installed, we import “arcgis”, which is the ArcGIS API for Python library.
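Because several Python installations can coexist on one machine, it can help to confirm which interpreter a session is actually bound to and whether the arcgis package is visible from it. The following is a minimal Python sketch of such a check that uses only the standard library. The helper name describe_python is hypothetical, and the snippet is an illustration that could be run through reticulate or directly in the ArcGIS Pro Python environment.

```python
import importlib.util
import sys


def describe_python(module_name="arcgis"):
    """Report the running interpreter and whether a module can be found.

    `describe_python` is an illustrative helper, not part of any ArcGIS
    or reticulate API.
    """
    return {
        # Path of the Python binary that is executing this code
        "executable": sys.executable,
        # Major.minor.micro version of that interpreter
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        "module": module_name,
        # find_spec returns None when a top-level module is not installed
        "available": importlib.util.find_spec(module_name) is not None,
    }


if __name__ == "__main__":
    print(describe_python())
```

If "available" comes back as False, the session is most likely pointing at a Python installation other than the one where the ArcGIS API for Python was installed.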
Perform Batch Geocoding in R
An ArcGIS Online organizational account is required to use the batch geocoding functionality provided by the World Geocoding Service. We need to sign in to the ArcGIS Online organization and load the batch_geocode function from the arcgis$geocoding module.
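On the Python side, the sign-in and batch geocoding steps can be sketched as follows. This is a minimal sketch under stated assumptions: the helper names (chunked, geocode_in_chunks, arcgis_batch_geocoder) are hypothetical, the chunk size of 100 is an arbitrary placeholder rather than a service limit, and the username and password are whatever credentials your organization provides. The library may already split large requests on its own, so explicit chunking is mainly useful for progress reporting or retrying a failed portion.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def geocode_in_chunks(addresses, geocode_batch, chunk_size=100):
    """Geocode `addresses` chunk by chunk and return one flat result list.

    `geocode_batch` is any callable that takes a list of address strings
    and returns a list of result records, for example batch_geocode.
    """
    results = []
    for chunk in chunked(addresses, chunk_size):
        results.extend(geocode_batch(chunk))
    return results


def arcgis_batch_geocoder(username, password):
    """Sign in to ArcGIS Online and return the batch_geocode function.

    Requires the arcgis package and an organizational account, so the
    imports are deferred until this function is actually called.
    """
    from arcgis.gis import GIS
    from arcgis.geocoding import batch_geocode

    # Creating the GIS object makes it the active connection that
    # batch_geocode uses when no geocoder is passed explicitly.
    GIS("https://www.arcgis.com", username, password)
    return batch_geocode
```

A call would then look like geocode_in_chunks(addresses, arcgis_batch_geocoder(user, pwd)), where addresses is a list of single-line address strings.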
Next, we read the CSV file (USA_2k_random.csv) and specify the column (SingleLine) that holds the addresses for the batch geocoder. Then we run the batch geocoding function. The geocoding results are returned as JSON for each record, with fields including Address, Location, Score, and Attributes. The output fields are described in the geocoding service documentation.
Read Geocoding Results to an R Data Frame and SF Object
The following R code snippet uses functions available in the R tidyverse packages to parse the results returned from geocoding and create a simple data frame in R that contains each record with its geographic coordinates and address field. Then an R spatial sf object is created from the populated data frame.
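Pulling the address column out of the CSV file can be sketched in Python with the standard library alone. The helper name read_address_column is hypothetical, and the sample rows in the usage note are invented for illustration; only the column name SingleLine and the file name USA_2k_random.csv come from this article. The sketch skips blank addresses, which is one possible choice and means the returned list may be shorter than the file.

```python
import csv


def read_address_column(csv_file, column="SingleLine"):
    """Return the non-empty values of `column` from an open CSV file."""
    reader = csv.DictReader(csv_file)
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise KeyError("column {!r} not found in CSV header".format(column))
    addresses = []
    for row in reader:
        # Short rows yield None for missing fields, so guard before strip()
        value = (row[column] or "").strip()
        if value:
            addresses.append(value)
    return addresses


if __name__ == "__main__":
    import io

    # Invented sample rows standing in for USA_2k_random.csv
    sample = io.StringIO(
        'ID,SingleLine\n'
        '1,"380 New York St, Redlands, CA 92373"\n'
        '2,\n'
    )
    print(read_address_column(sample))
```

With the real file, the call would be wrapped in with open("USA_2k_random.csv", newline="", encoding="utf-8") as f, assuming the file is UTF-8 encoded.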
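The flattening step itself is simple enough to sketch in Python before the records reach a data frame. This sketch assumes each result is a dictionary with lowercase keys address, location (holding x and y), and score, which is how the ArcGIS API for Python typically returns batch results; the helper name results_to_rows and the sample record are hypothetical, and the coordinate values are invented.

```python
def results_to_rows(results):
    """Flatten geocoding result records into simple row dictionaries."""
    rows = []
    for result in results:
        # Unmatched records may lack a usable location, so fall back to {}
        location = result.get("location") or {}
        rows.append({
            "address": result.get("address", ""),
            "x": location.get("x"),  # longitude when the output is in WGS84
            "y": location.get("y"),  # latitude when the output is in WGS84
            "score": result.get("score"),
        })
    return rows


if __name__ == "__main__":
    # Invented record shaped like a batch geocoding result
    sample = [{
        "address": "380 New York St, Redlands, California, 92373",
        "location": {"x": -117.19, "y": 34.06},
        "score": 100,
        "attributes": {},
    }]
    print(results_to_rows(sample))
```

Each row maps directly onto one line of a data frame with address, x, y, and score columns, and the x and y columns are what a spatial object would use as point coordinates.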
We can simply plot the Result_sf in R and run various R functions to analyze the data.
Write Results to ArcGIS
Sometimes you may want to take advantage of the advanced cartographic capabilities in ArcGIS Pro to map and visually evaluate the R analysis results alongside spatial data. In addition, ArcGIS provides a comprehensive set of spatial analysis methods and algorithms to support site selection, cluster analysis, and space-time pattern mining. The last step in our example is to use the R-ArcGIS Bridge to convert the geocoding results to a local feature class for further analysis in ArcGIS Pro.
I hope this example inspires data scientists who primarily work with R to leverage the ArcGIS World Geocoding Service. Remember that with R-ArcGIS Bridge, you can also leverage functionality from ArcGIS Pro directly.
As always, I would love to hear your experiences and comments!