How should I interpret the output of density tools?

By Eric Krause

Many Spatial Analyst users are comfortable using the Density tools and are satisfied with the results they give, but there is sometimes confusion about what the output cell values mean and how they should be interpreted. This blog post explains how to interpret the output from the density tools.

The interpretation depends on the purpose

One of the most common uses of the Kernel Density and Point Density tools is to smooth out the information represented by a collection of points in a way that is more visually pleasing and understandable; it is often easier to look at a raster with a stretched color ramp than it is to look at blobs of points, especially when the points cover up large areas of the map. When the density tools are run for this purpose, care should be taken when interpreting the actual density value of any particular cell. Rather than a literal interpretation, the interpretation should be qualitative or relative, something like, “Darker colored cells have more points around them than lighter colored cells,” or, “This cell has twice as many points near it as this other cell.”

A literal interpretation of the output cell values is appropriate when the locations of the points are, in some sense, random. For example, in the case of crime data, you might imagine that the likelihood of crime depends on some underlying socioeconomic and demographic factors. From that perspective, crimes are random and can occur anywhere, but they are more likely to occur in areas where the “risk” for crime is high. The actual locations of crimes are random, but we can attempt to quantify how likely it is that a crime will occur at any particular location. The purpose of the density tools is to attempt to construct a surface that accurately reflects the likelihood of an event occurring in each cell.

For example, suppose you have crime locations for a particular city over the course of one month. In that case, the result of a density analysis could be interpreted as a predictive risk surface for crime in the following month, if we assume that the underlying factors that contribute to crime will not change. It can also be interpreted as a mathematical representation of the underlying factors that contribute to crime, similar to how a histogram is a representation of the underlying distribution of a dataset.

Note: For those familiar with statistical theory, the density surface is a bivariate probability density function of the x- and y-coordinates, multiplied by the number of points. The points you see can be considered random samples from that probability distribution.
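To make the note above concrete, here is a minimal sketch in plain Python. It assumes a quartic kernel with a fixed search radius and a handful of made-up point coordinates; the function names, the radius, and the grid size are illustrative choices, not the Kernel Density tool's actual implementation. Because each point contributes a kernel that integrates to 1, summing density times cell area over the whole surface should come out close to the number of points.

```python
import math

def quartic_kernel(dist, radius):
    """2D quartic kernel; integrates to 1 over a disk of the given radius."""
    if dist >= radius:
        return 0.0
    u2 = (dist / radius) ** 2
    return 3.0 / (math.pi * radius ** 2) * (1.0 - u2) ** 2

def density_surface(points, radius, cell, n_cols, n_rows):
    """Return a grid (list of rows) of density values evaluated at cell centers."""
    grid = []
    for row in range(n_rows):
        y = (row + 0.5) * cell
        values = []
        for col in range(n_cols):
            x = (col + 0.5) * cell
            # Each point adds its own kernel value at this cell center.
            values.append(sum(
                quartic_kernel(math.hypot(x - px, y - py), radius)
                for px, py in points
            ))
        grid.append(values)
    return grid

# Hypothetical point locations (map units are arbitrary here).
points = [(2.0, 2.0), (2.5, 2.2), (3.0, 3.5), (4.0, 2.5)]
cell = 0.05
grid = density_surface(points, radius=1.0, cell=cell, n_cols=120, n_rows=120)

# Density times cell area, summed over all cells, approximates the point count.
total = sum(value * cell * cell for row in grid for value in row)
print(f"points: {len(points)}, integrated density: {total:.3f}")
```

Dividing every cell value by the number of points would turn the surface into an approximate probability density, which is the sense in which the observed points can be treated as samples from it.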

Converting density to expected count often provides an easier interpretation

Even when it makes sense to interpret the cell values, it is often much more meaningful to report the expected count within a cell than it is to report the cell density value. Density is count divided by area, so multiplying the density by area will give an expected count. Using the example of crimes in a city over one month, suppose we use an output cell size of 0.5 kilometers, and we see that one cell receives a value of 16 crimes per square kilometer. Since the area of the cell is 0.25 square kilometers (0.5 km * 0.5 km = 0.25 km²), we can multiply the density by the area to get an expected count of 4 crimes (16 crimes/km² * 0.25 km² = 4 crimes). This means that if crime conditions do not change month-to-month, we expect to see about 4 crimes in that cell the following month. If another cell has an expected count of 0.5 crimes, the interpretation would be that if the conditions for crime do not change, we expect about one crime in that cell every two months.
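The arithmetic above can be sketched in a few lines of Python. The function name is a hypothetical helper, and the density of 2 crimes per square kilometer for the second cell is back-calculated from the expected count of 0.5 given in the text rather than stated there.

```python
def expected_count(density_per_km2, cell_size_km):
    """Expected number of events in a square cell: density times cell area."""
    cell_area_km2 = cell_size_km * cell_size_km
    return density_per_km2 * cell_area_km2

# 16 crimes/km² in a 0.5 km cell (0.25 km²).
busy_cell = expected_count(16.0, 0.5)

# A quieter cell: 2 crimes/km² corresponds to an expected count of 0.5.
quiet_cell = expected_count(2.0, 0.5)

# An expected count below 1 reads more naturally as a waiting time.
months_per_crime = 1.0 / quiet_cell

print(busy_cell, quiet_cell, months_per_crime)
```

Applied to a whole raster, the same multiplication by cell area would turn the density surface into an expected-count surface, assuming the density unit and the cell size use the same linear unit.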

Understanding the area unit

There is also some confusion about how changing the area unit affects the output. The simple answer is that it changes the cell values by a constant scale factor. For example, if you run kernel density with output units of square meters and run it again on the same data with square kilometers, the cell values in square kilometers will be exactly 1 million times larger than the cell values in square meters. This is because there are 1 million square meters in a square kilometer. The most common reason to change the unit is to keep the numbers manageable. For example, it is much more understandable to report the result as 13 crimes per square kilometer instead of 0.000013 crimes per square meter, even though these two statements mean exactly the same thing.
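As a quick check of the scaling, the short Python sketch below converts the density from the example and confirms that the expected count in a cell does not depend on which area unit is used. The constant and function names are illustrative, and the 0.5 km cell size is borrowed from the earlier crime example.

```python
import math

SQ_M_PER_SQ_KM = 1_000_000  # 1,000 m × 1,000 m

def per_sq_m_to_per_sq_km(density_per_m2):
    """Rescale a density from events per square meter to events per square kilometer."""
    return density_per_m2 * SQ_M_PER_SQ_KM

density_m2 = 0.000013
density_km2 = per_sq_m_to_per_sq_km(density_m2)

# The same 0.5 km cell, with its area expressed in each unit.
cell_area_km2 = 0.5 * 0.5
cell_area_m2 = 500.0 * 500.0

count_from_km2 = density_km2 * cell_area_km2
count_from_m2 = density_m2 * cell_area_m2

print(f"{density_km2:.1f} crimes per square kilometer")
print(math.isclose(count_from_km2, count_from_m2))
```

The comparison uses `math.isclose` because the two products can differ by floating-point rounding even though they represent the same quantity.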

Summary

Hopefully this blog has helped you understand when and how to interpret the output cell values of a density analysis. A key idea to remember is that unless there is some concept of randomness in the point locations, a literal interpretation is not appropriate. Also, converting density to expected count usually gives a more meaningful interpretation of the cell values.

Eric Krause

Eric Krause is a Product Engineer on the Spatial Statistics and Geostatistical Analyst teams. He has worked at Esri since 2010 and specializes in geostatistical interpolation, spatial statistics, and general spatial analysis.
