ArcGIS Pro Charts provide a powerful way to visualize and explore your data, helping to uncover patterns, trends, relationships, and structure that might not be apparent when looking at a table or map. The Charts development team has been working to add chart capabilities to ArcPy, to support automation and spatial data science workflows using Python. It’s possible to do all your data preparation, visualization, and analysis completely inside ArcGIS, but you can also mix and match the ArcGIS Python libraries with your other favorite libraries.
This blog post demonstrates ArcPy Charts functionality by visualizing characteristics and trends of the COVID-19 pandemic in the United States during 2020. The raw code is available in code blocks below, but it is best viewed in notebook format, which is available to download on ArcGIS.com.
Preparing the Data
import pandas as pd
from arcgis.features import GeoAccessor
import arcpy
DATA_URL = 'https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-states.csv'
# load data with pandas, create new fields, and filter
daily_df = (
    pd.read_csv(DATA_URL, parse_dates=['date'])
    .sort_values(['state', 'date'])
    .rename(columns={
        'cases': 'cases_total',
        'deaths': 'deaths_total'
    })
    .assign(
        # daily increase per state; negative corrections are clipped to zero
        cases_new=lambda df: df.groupby('state')['cases_total'].diff().clip(lower=0),
        deaths_new=lambda df: df.groupby('state')['deaths_total'].diff().clip(lower=0)
    )
    .query("'2020-01-01' <= date <= '2020-12-31'")
    .reset_index(drop=True)
)
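The cases_new and deaths_new fields are derived from cumulative totals. As a rough illustration of what the groupby, diff, and clip chain computes, here is a plain-Python sketch on made-up numbers. The daily_increase helper and the sample rows are hypothetical and are not part of pandas or the source dataset. The first row of each state has no earlier value to subtract, so its result is missing (NaN in pandas), and a downward revision of a cumulative total is clipped to zero instead of being reported as negative new cases.

```python
from itertools import groupby
from operator import itemgetter

def daily_increase(rows):
    """Turn per-state cumulative totals into daily increases.

    rows: iterable of (state, date, cumulative_total) tuples.
    Returns a list of (state, date, new_value) tuples, where new_value is
    None for the first date of each state and is never negative.
    """
    ordered = sorted(rows, key=itemgetter(0, 1))  # sort by state, then date
    result = []
    for state, group in groupby(ordered, key=itemgetter(0)):
        previous = None
        for _, date, total in group:
            if previous is None:
                new_value = None  # no earlier row to subtract (NaN in pandas)
            else:
                new_value = max(total - previous, 0)  # clip negatives to zero
            result.append((state, date, new_value))
            previous = total
    return result

# Made-up sample: state B's total is revised downward on the third day.
sample = [
    ("A", "2020-03-13", 6), ("A", "2020-03-14", 12), ("A", "2020-03-15", 23),
    ("B", "2020-03-13", 5), ("B", "2020-03-14", 9), ("B", "2020-03-15", 8),
]
increases = daily_increase(sample)
```

ISO-formatted date strings sort chronologically, which is why the sketch can sort them as plain text.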
Here’s a quick look at the prepared dataset. Notice that there is an individual row for each date and state combination. These rows will be summarized and aggregated when I visualize this data with charts.
date | state | fips | cases_total | deaths_total | cases_new | deaths_new |
---|---|---|---|---|---|---|
2020-03-13 | Alabama | 1 | 6 | 0 | NaN | NaN |
2020-03-14 | Alabama | 1 | 12 | 0 | 6.0 | 0.0 |
2020-03-15 | Alabama | 1 | 23 | 0 | 11.0 | 0.0 |
2020-03-16 | Alabama | 1 | 29 | 0 | 6.0 | 0.0 |
2020-03-17 | Alabama | 1 | 39 | 0 | 10.0 | 0.0 |
Currently, pandas data structures cannot be directly used to create an ArcPy chart—though we hope to add this ability in the future. Instead you must convert your data to a supported format such as a Layer or Table object, a dataset path, or a feature service URL. For this demo, I’ll save the pandas DataFrame as a CSV file and then copy the rows to an in-memory table:
daily_df.to_csv('covid_daily.csv', index=False)
arcpy.management.CopyRows('covid_daily.csv', 'memory/covid_daily')
Visualizing the Data
Now that I’ve prepared the data with the proper fields and saved it in a supported format, I can explore it using ArcPy charts!
First, I’ll start simple and create a bar chart showing the total COVID cases for each state. I do this by initializing an ArcPy Chart object and configuring the properties. As illustrated in the table above, the dataset contains one row for each state and date combination, so here I use a sum aggregation to calculate the total cases_new values for each state. Also take note of the dataSource property, which is used to specify the dataset you want to visualize. Here I’m configuring the chart to use the in-memory table I created above.
c = arcpy.Chart('bar_covid_by_state')
c.type = 'bar'
c.title = "Total COVID Cases by State"
c.xAxis.field = 'state'
c.xAxis.title = "State"
c.yAxis.field = 'cases_new'
c.yAxis.title = "Cases"
c.bar.aggregation = 'sum'
c.dataSource = 'memory/covid_daily'
c.exportToSVG('bar_covid_by_state.svg')
The chart above is a good first attempt, but it’s very difficult to read due to the small size. I’ll make the chart larger by setting the chart object’s displaySize property. I’ll also arrange the bars in a more logical way by sorting them to be in descending order from most cases to fewest cases.
c = arcpy.Chart('bar_covid_by_state_desc')
c.type = 'bar'
c.title = "Total COVID Cases by State"
c.xAxis.field = 'state'
c.xAxis.title = "State"
c.yAxis.field = 'cases_new'
c.yAxis.title = "Cases"
c.yAxis.sort = 'DESC'
c.bar.aggregation = 'sum'
c.displaySize = 900, 400
c.dataSource = 'memory/covid_daily'
c.exportToSVG('bar_covid_by_state_desc.svg')
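Conceptually, a sum aggregation combined with a descending sort amounts to grouping rows by the category field, adding up the numeric field, and ordering the groups by their totals. The following is a minimal plain-Python sketch of that idea using made-up rows. The sum_by_category helper is a hypothetical name, skipping missing values is an assumption, and none of this is meant to describe how ArcPy implements the aggregation internally.

```python
from collections import defaultdict

def sum_by_category(rows, descending=True):
    """Sum values per category and return (category, total) pairs sorted by total."""
    totals = defaultdict(float)
    for category, value in rows:
        if value is not None:  # assumption: missing values are skipped
            totals[category] += value
    return sorted(totals.items(), key=lambda item: item[1], reverse=descending)

# Made-up rows of (state, cases_new); None stands for a missing first-day value.
rows = [
    ("Alabama", None), ("Alabama", 6.0), ("Alabama", 11.0),
    ("Texas", 20.0), ("Texas", 5.0),
    ("Ohio", 3.0),
]
ranked = sum_by_category(rows)
```

Passing descending=False would give the ascending order instead.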
Now I’ll take a look at new cases per day for the entire United States by creating a bar chart with a date field on the X axis and the total aggregated daily COVID cases on the Y axis.
c = arcpy.Chart('bar_covid_daily')
c.type = 'bar'
c.title = "Total COVID Cases by Day"
c.xAxis.field = 'date'
c.xAxis.title = "Day"
c.yAxis.field = 'cases_new'
c.yAxis.title = "New Cases"
c.bar.aggregation = 'sum'
c.color = ['#fac9c7']
c.displaySize = 800, 500
c.dataSource = 'memory/covid_daily'
c.exportToSVG('bar_covid_daily.svg')
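The chart definitions so far repeat many of the same property assignments. One possible way to cut down on that repetition is a small helper that applies a dictionary of dotted property names to a chart object. This is only a sketch: apply_properties and DAILY_BAR are hypothetical names, not part of ArcPy, and the approach assumes chart objects accept ordinary attribute assignment, as the examples above do. A stand-in object is used so the sketch runs without ArcGIS Pro.

```python
from types import SimpleNamespace

def apply_properties(chart, properties):
    """Set dotted-path properties (for example 'xAxis.field') on a chart-like object."""
    for path, value in properties.items():
        *parents, attribute = path.split(".")
        target = chart
        for name in parents:
            target = getattr(target, name)  # walk down to, e.g., chart.xAxis
        setattr(target, attribute, value)
    return chart

# Settings shared by the daily bar charts in this post.
DAILY_BAR = {
    "type": "bar",
    "xAxis.field": "date",
    "xAxis.title": "Day",
    "yAxis.field": "cases_new",
    "yAxis.title": "New Cases",
    "bar.aggregation": "sum",
    "dataSource": "memory/covid_daily",
}

# Stand-in for a chart object; with ArcPy available you would pass
# arcpy.Chart('bar_covid_daily') here instead (untested assumption).
stand_in = SimpleNamespace(
    xAxis=SimpleNamespace(), yAxis=SimpleNamespace(), bar=SimpleNamespace()
)
apply_properties(stand_in, DAILY_BAR)
```

Chart-specific settings such as the title or displaySize could then be set individually after the shared ones are applied.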
The above chart is helpful for understanding the trajectory of daily COVID cases in the US, but it is difficult to interpret because the data is noisy. As time progresses, the bars form many peaks and valleys, and this cyclical pattern is most likely due to inconsistent reporting of COVID cases. To reduce this noise, I can re-create the same chart, but this time I’ll include a moving average line. Moving averages are useful for smoothing out noise in a temporal dataset and highlighting the general pattern of the data.
c = arcpy.Chart('bar_covid_daily_moving_avg')
c.type = 'bar'
c.title = "Total COVID Cases by Day"
c.xAxis.field = 'date'
c.xAxis.title = "Day"
c.yAxis.field = 'cases_new'
c.yAxis.title = "New Cases"
c.bar.aggregation = 'sum'
c.bar.showMovingAverage = True
c.color = ['#fac9c7']
c.displaySize = 900, 500
c.dataSource = 'memory/covid_daily'
c.exportToSVG('bar_covid_daily_moving_avg.svg')
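To make the smoothing idea concrete, here is a small plain-Python sketch of a trailing moving average. The 7-day window, the trailing (rather than centered) alignment, and the trailing_moving_average name are all assumptions chosen for illustration. This post does not state which window the chart's moving average line actually uses.

```python
from collections import deque

def trailing_moving_average(values, window=7):
    """Return the mean of the most recent `window` values at each position.

    Early positions average over however many values are available so far.
    """
    recent = deque(maxlen=window)
    running_sum = 0.0
    averages = []
    for value in values:
        if len(recent) == window:
            running_sum -= recent[0]  # this value is about to leave the window
        recent.append(value)
        running_sum += value
        averages.append(running_sum / len(recent))
    return averages

# Made-up daily counts with a dip at the end of each week,
# loosely mimicking lower reporting on weekends.
daily = [100, 100, 100, 100, 100, 30, 30] * 2
smoothed = trailing_moving_average(daily, window=7)
```

Once a full week is inside the window, every average in this example covers exactly one weekly cycle, so the weekly dip disappears from the smoothed series.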
I can also view aggregated COVID cases over time from a slightly different perspective by creating a calendar heat chart. This chart aggregates daily cases and displays them in a calendar grid. The calendar heat chart is effective at showing a per day summary of temporal data, particularly when the values are unevenly distributed, as the color for each cell is determined by a graduated natural breaks scheme.
c = arcpy.Chart('chc_covid_daily')
c.type = 'calendarHeatChart'
c.title = "Total COVID Cases by Day"
c.xAxis.field = 'date'
c.xAxis.title = "Day"
c.yAxis.field = 'cases_new'
c.yAxis.title = "Month"
c.calendarHeatChart.aggregation = 'sum'
c.displaySize = 900, 500
c.dataSource = 'memory/covid_daily'
c.exportToSVG('chc_covid_daily.svg')
Having visualized the daily COVID cases aggregated for the entire country, I may also be interested in comparing daily cases between states. To do this, I’ll create a line chart and split the data by the state field. This creates a separate line for each state.
c = arcpy.Chart('line_covid_daily_by_state')
c.type = 'line'
c.title = "Total Cases by Day"
c.xAxis.field = 'date'
c.xAxis.title = "Day"
c.yAxis.field = 'cases_new'
c.yAxis.title = "New Cases"
c.yAxis.minimum = 0
c.line.aggregation = 'sum'
c.line.splitCategory = 'state'
c.line.timeIntervalSize = 1
c.line.timeIntervalUnits = 'DAYS'
c.displaySize = 900, 500
c.dataSource = 'memory/covid_daily'
c.exportToSVG('line_covid_daily_by_state.svg')
Above, you can see that line charts become messy and difficult to interpret when many series are displayed (such charts are sometimes referred to pejoratively as spaghetti plots). I can display this data in a clearer way by creating a matrix heat chart, which is new in ArcGIS Pro 2.7. Matrix heat charts are used to visualize relationships between categorical or date fields with a grid of shaded cells. Here I want to view each state on the Row axis and each day on the Column axis, and I’ll use the cases_new field to determine the intensity of the cell shading.
c = arcpy.Chart('mhc_covid_by_state')
c.type = 'matrixHeatChart'
c.title = "Daily COVID Cases by State"
c.xAxis.field = 'date'
c.xAxis.title = 'Day'
c.yAxis.field = ['state', 'cases_new']
c.yAxis.title = 'State'
c.matrixHeatChart.aggregation = 'sum'
c.matrixHeatChart.classificationMethod = 'naturalBreaks'
c.matrixHeatChart.classCount = 7
c.matrixHeatChart.nullPolicy = 'zero'
c.legend.title = "Number of Cases"
c.displaySize = 800, 1200
c.dataSource = 'memory/covid_daily'
c.exportToSVG('mhc_covid_by_state.svg')
You can see that this chart allows for an easier comparison of daily COVID cases between states because each state is displayed as a separate row, whereas the line chart forces all states to compete for the same space.
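The grid behind a chart like this can be thought of as a pivot: one row per state, one column per day, with each cell holding the summed value and empty cells filled with zero. The sketch below shows that idea in plain Python on made-up records. The build_matrix helper is a hypothetical name, and treating the zero null policy as "fill missing cells with 0" is an assumption about its meaning, not a description of ArcPy internals.

```python
def build_matrix(records, fill=0.0):
    """Pivot (row_key, column_key, value) records into a dense grid of sums.

    Returns (row_labels, column_labels, grid), where grid[i][j] is the summed
    value for row_labels[i] and column_labels[j], or `fill` if no record exists.
    """
    sums = {}
    for row_key, column_key, value in records:
        key = (row_key, column_key)
        sums[key] = sums.get(key, 0.0) + value
    row_labels = sorted({row for row, _ in sums})
    column_labels = sorted({column for _, column in sums})
    grid = [
        [sums.get((row, column), fill) for column in column_labels]
        for row in row_labels
    ]
    return row_labels, column_labels, grid

# Made-up records: Washington has data one day earlier than Alabama.
records = [
    ("Washington", "2020-03-12", 9.0),
    ("Washington", "2020-03-13", 12.0),
    ("Alabama", "2020-03-13", 6.0),
    ("Alabama", "2020-03-13", 1.0),  # same cell twice, so the values are summed
]
states, days, grid = build_matrix(records)
```

In this example Alabama has no record for the first day, so that cell is filled with zero rather than left empty.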
Conclusion
Charts and ArcGIS Notebooks allow you to visually explore the patterns found in data with just a few simple lines of code. Dig into the Pro Charts and ArcPy Chart documentation to learn more about all the supported chart types and how you can configure them to suit your visualization needs. And keep in mind, making a chart in the ArcGIS Pro UI is just as easy, and also provides interactivity between charts, maps, and tables. I hope that you’ll take advantage of ArcPy Charts and ArcGIS Notebooks in your next automation or spatial data science project!