Discovering GeoAnalytics (or what is GeoAnalytics, and why should I care?)

11th January 2017

If you’ve read Alasdair’s introduction to 10.5 you will have already heard of GeoAnalytics, one of the new server roles now available with ArcGIS Enterprise. We’ve seen some hugely impressive (and impressively huge) demos at recent events, including the Esri User Conference in 2016. Perhaps you’re still wondering exactly what GeoAnalytics is, how it works and how you can use it in your organisation. 

What is it? 

GeoAnalytics provides faster and more efficient ways of analysing big vector datasets, and enables you to scale your computing power to match the size of the job at hand by adding more machines to your site (a.k.a. scaling horizontally). 

This could simply mean that a big job will be completed more quickly. However, this reduced processing time can create new opportunities. For example, it might enable you to include your analysis in decisions that have to be made repeatedly on a regular basis (e.g. ‘what are our optimal field deployment patterns for tomorrow?’).  

When a dataset is too massive for traditional geoprocessing tools, GeoAnalytics can aggregate it spatially and temporally into input that those tools can handle. So you can apply your deep knowledge of spatial analysis and geoprocessing techniques to ever-larger datasets by adding an intermediate, aggregated step. 
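To make the idea of spatial and temporal aggregation concrete, here is a minimal sketch in plain Python. It is not the GeoAnalytics implementation or its API: the function name `bin_points`, the square grid, and the sample coordinates are all assumptions made for illustration. It simply counts points per grid cell and time slice, which is the kind of smaller, summarised output you could then pass to a traditional tool.

```python
from collections import Counter
from math import floor


def bin_points(points, cell_size, time_step):
    """Count points per (column, row, time slice) bin.

    points    -- iterable of (x, y, t) tuples; t is in seconds (hypothetical data)
    cell_size -- width and height of a square grid cell, in the units of x and y
    time_step -- length of one time slice, in seconds
    """
    counts = Counter()
    for x, y, t in points:
        key = (floor(x / cell_size), floor(y / cell_size), floor(t / time_step))
        counts[key] += 1
    return counts


# Four made-up observations: three in the first hour, one in the second.
points = [
    (12.0, 7.5, 30),
    (14.9, 9.0, 45),
    (15.1, 9.0, 50),
    (12.5, 7.0, 3700),
]
bins = bin_points(points, cell_size=5.0, time_step=3600)
for key, count in sorted(bins.items()):
    print(key, count)
```

Millions of raw points collapse into one row per occupied cell and time slice, so the size of the result depends on the grid and the time step rather than on the number of input records.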

The raw power of these new tools can make it easy to overlook some features. GeoAnalytics introduces the ability to register an important new data source for direct use in the analysis tools: Big Data File Shares (BDFS). Big Data File Shares can be huge local directories of .csv files or shapefiles, as well as Hadoop Distributed File System (HDFS) directories or Hive metadata stores. These new options can help to unlock data from other systems and bring it into your GIS more easily or for the first time.

How does it work? 

GeoAnalytics builds on the Apache Spark framework to distribute analysis processes across multiple machines. This means that the analysis of a single dataset can be split across a cluster of several servers. These do not need to be extraordinarily powerful machines. Generally speaking, big data tools run on large numbers of off-the-shelf machines and are designed to automatically detect and tolerate hardware failure when necessary. GeoAnalytics (and the Big Data Store) are no exception to this. 
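The split-and-combine pattern described above can be sketched in a few lines of Python. This is an illustration only, not Spark and not GeoAnalytics code: worker threads stand in for the machines in a cluster, and the names `count_by_cell`, `split` and `distributed_count` are hypothetical. Each worker summarises its own partition of the data, and the partial results are merged at the end.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_by_cell(partition, cell_size):
    """Summarise one partition: count (x, y) points per square grid cell."""
    return Counter((int(x // cell_size), int(y // cell_size)) for x, y in partition)


def split(records, n_parts):
    """Deal the records out into n_parts roughly equal partitions."""
    return [records[i::n_parts] for i in range(n_parts)]


def distributed_count(records, cell_size, n_workers):
    """Count per partition in parallel, then merge the partial counts."""
    partitions = split(records, n_workers)
    total = Counter()
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = pool.map(lambda part: count_by_cell(part, cell_size), partitions)
        for partial in partials:
            total.update(partial)  # Counter.update adds counts together
    return total


# Made-up point data; a real job would read from a registered data source.
records = [(float(i % 10), float(i % 7)) for i in range(1000)]
single = count_by_cell(records, cell_size=5.0)
spread = distributed_count(records, cell_size=5.0, n_workers=4)
print(single == spread)
```

Because counting is a merge-friendly operation, the answer does not depend on how the records were divided, which is what allows the same job to be spread over one machine or many.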

You may have petabytes of data that you are raring to analyse, but I’m sure that many readers will be working with more modestly sized datasets. After all, the complex nature of spatial analysis means that gigabytes, or even megabytes, of data may still present a considerable computational problem. Fortunately, optimisations in the algorithms that underpin GeoAnalytics mean that you will see considerable performance gains on any reasonably large dataset, even on a single machine. Take a look at the performance stats in the picture below and note the increase in speed between 1 node (traditional) and 1 node (Big Data): 

Euan Cameron presenting GeoAnalytics performance statistics at the Dev Summit in Berlin, December 2016. Photo: @wingerd.


How can I use it? 

To take advantage of these GeoAnalytics tools you will need the base ArcGIS Enterprise deployment and the Spatiotemporal Big Data Store. Once you license the GeoAnalytics server role, its tools are exposed through the Portal map viewer, ArcGIS Pro, the Python API and a REST endpoint in the server System directory. 

For the end user, there is no difference between using the GeoAnalytics tools and the standard analysis tools. The interface is the same and the tools don’t demand any detailed understanding of what’s going on under the hood. There are some slight differences (such as the addition of time-slicing), but the main thing you will notice is that they are much, much faster.   

For administrators, the setup process will be familiar to anyone who’s deployed or managed Portal in the Web GIS pattern. There is extensive documentation on the setup process, as well as some bells and whistles that you can use to fine-tune your deployment for optimum performance.  

GeoAnalytics should open up a lot of new possibilities for visualisation and analysis of your data. Arrange a trial, leave a comment with your ideas, or feel free to get in touch with any questions.