Density analysis by another name

6th December 2011

I was recently writing a geoprocessing model to calculate the density of a point feature class from which all the areas above a specific threshold could be selected. I had been thinking about writing this for a while and had in mind the process I would use, but I soon discovered that I didn’t understand how the Density tools in Spatial Analyst work and would need to find an alternative.

The Density toolset in the Esri Spatial Analyst extension contains three tools: Kernel Density, Line Density and Point Density. I had thought that I would run the Point Density tool and then use the Raster Calculator or even the Contour tool (which would have taken me straight to vector format) to select out the areas above my threshold.

But I hadn’t taken into account the method by which the Point Density (and Line Density) tools calculate the output cell values. The ArcGIS Desktop 10 help says:

By calculating density, you are in a sense spreading the values (of the input) out over a surface. The magnitude at each sample location (line or point) is distributed throughout the study area, and a density value is calculated for each cell in the output raster.

It was the last part that I hadn’t thought about: a ‘density value’ is calculated for each cell in the output raster. What unit would the density value be in?

Let me give an example. You are analysing population density and want to identify all the areas where the density is greater than 500 people per square kilometre. You open the Point Density tool, choose the neighbourhood, and set the units to square kilometres.

You’ve set the units to square kilometres so the values of the cells in the output raster are ‘number of people per square kilometre’. Right? Well, sort of.

The Point Density tool totals the number of points that fall within a neighbourhood, applies your population weighting if you have chosen one, and then divides this total by the area of the neighbourhood. It then applies a scaling factor according to the area units you selected.

An example is a farm house with a population of 4 and no other houses nearby. The Point Density tool will total the number of points within the neighbourhood (1 farm house), weight it by the population field (4 people) and divide it by the area of the neighbourhood (in this example a circle with a 250 m radius, or 196,349.5 m²). As the units were set to square kilometres, the resulting figure (0.00002037 people per square metre) is multiplied by 1,000,000 (the number of square metres in a square kilometre), giving a cell value of 20.37. But what does that mean?

My head says that logically there are 4 people living in the area, but my density raster gives me a value of 20.37, because the 4 people in a neighbourhood of roughly 0.2 square kilometres have been scaled up to a full square kilometre. Now apply this to a city, or a country, and how do I now select out areas above my threshold of 500 people per square kilometre? This was especially confusing as I wasn’t modelling population, I was modelling energy use. I wanted to identify areas where the demand for energy was high. The output units were simply not what I was expecting.

So I went back to the drawing board, or in this case the Desktop online help. I eventually came across the Neighbourhood toolset, containing six tools which I had never used before: Block Statistics, Filter, Focal Flow, Focal Statistics, Line Statistics and Point Statistics.

It was the last one that caught my eye. The help says that the Point Statistics tool calculates statistics on point features that fall in the neighbourhood around each output raster cell. The statistics available include mean, majority, maximum, minimum, standard deviation and, most importantly for me, sum. What if I summed the energy demand in a neighbourhood? If I know the area of the neighbourhood is one square kilometre, then I know the output cell values are ‘energy demand per square kilometre’.
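To make the idea concrete, here is a minimal pure-Python sketch of summing point values within a circular neighbourhood whose area is exactly one square kilometre. It does not call the Point Statistics tool; the function name, the constant and the demand points are all invented for illustration.

```python
import math

# Radius (m) of a circle whose area is exactly one square kilometre.
RADIUS_1KM2 = math.sqrt(1_000_000 / math.pi)

def neighbourhood_sum(cell_xy, points, radius_m=RADIUS_1KM2):
    """Sum the values of all points within radius_m of a cell centre.

    points is a list of (x, y, value) tuples, coordinates in metres.
    """
    cx, cy = cell_xy
    return sum(value for x, y, value in points
               if math.hypot(x - cx, y - cy) <= radius_m)

# Made-up demand points: (x, y, energy demand)
demand_points = [(100, 100, 30.0), (400, 200, 45.0), (2000, 2000, 80.0)]

print(round(RADIUS_1KM2, 2))                     # 564.19
print(neighbourhood_sum((0, 0), demand_points))  # 75.0
```

Because the neighbourhood covers one square kilometre, the sum needs no further division or scaling to be read as demand per square kilometre.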

I soon realised that this was what I wanted. I was using the energy data to find locations where local demand was high enough to support a Combined Heat and Power (CHP) plant. CHP plants create electricity from fuel and circulate the heat produced through a network of pipes to provide hot water for radiators and taps. To make the most of this efficient process there has to be sufficient local demand for hot water, preferably as close as possible to the source.

From here on in it was easy. I decided to use the Reclassify tool to classify areas above the energy demand threshold as 1 and areas below as NoData, and then the Raster to Polygon tool to convert the areas to vector. This gave me polygons within which the density (or sum!) of energy demand met my threshold and would therefore support a CHP plant.
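The reclassification step can be sketched in plain Python as follows. This is an illustration of the logic only, not the Reclassify tool itself: the function name and the grid values are made up, None stands in for NoData, and treating a cell exactly equal to the threshold as meeting it is my assumption. The Raster to Polygon conversion is not reproduced here.

```python
def reclassify(grid, threshold):
    """Return 1 where a cell meets the threshold, None (NoData) elsewhere."""
    return [[1 if value is not None and value >= threshold else None
             for value in row]
            for row in grid]

# Made-up neighbourhood sums for a 3 x 3 raster.
sums = [[120.0, 480.0, 510.0],
        [300.0, 650.0, 700.0],
        [90.0, 200.0, 499.9]]

mask = reclassify(sums, threshold=500)
for row in mask:
    print(row)
```

The cells marked 1 form the areas that would then be converted to polygons.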

So, in conclusion, if you are doing density analysis, take a look at the Neighbourhood toolset to see if it could help you. Although the Desktop help does not describe it in terms of density, I think it has useful parallels.

Before you start out, think carefully about what it is you want to do. I had thought that I was doing traditional density analysis, but knowing that ultimately I wanted to know the sum of something within a defined area might have helped me get there more quickly.

Finally, don’t underestimate the ArcGIS help. After 8 years of specialising in Desktop I still use the help most weeks and always learn something new.