Episode 3: Teaching machines to map

Episode: Teaching machines to map
Duration: 0:23:53

START AUDIO

Voiceover: The Spatial Jam, an Esri UK podcast.

Richard: There’s a pretty good chance that you've already used machine learning in GIS without realising it.

Alasdair: Helping demystify things for us.

Beth: Where can I learn? What’s my first step? Welcome to the Spatial Jam. I'm Beth, and today I'm joined by Alasdair.

Beth: Today's Spatial Jam is all about machine learning, so we've invited someone along that knows a thing or two about it, a star of Esri UK’s ‘Technology Showcases’, Richard Mumford. Welcome, Richard.

Richard: Thank you, nice to be here.

Beth: Machine learning is a bit of a hot topic in technology, so we thought it would be good to decode the acronyms and find out more about how and why this is changing GIS. Let's start with the basics: what is machine learning?

Richard: Okay, yes, sure. I think, in terms of understanding machine learning, we probably have to take one step back and think about AI, or artificial intelligence, which is probably another term that you've heard about.

Artificial intelligence is the umbrella term, if you like, which machine learning sits within. It's a field of computer science, if you like. It focuses on making machines that can operate with apparent intelligence, in that they might appear to be able to think and make decisions, if that makes sense.

Within that, we have machine learning, which is what we're going to focus on today. It's a subset of AI, and you might think of it as a possible way of achieving artificial intelligence. I think the title is fairly explanatory in that it's a way of teaching a machine to make decisions based on input data. An example might be image recognition, where you've probably seen this happen: you can give a computer an image, and it will tell you what the image is of.

The way we achieve this is to teach the machine by giving it lots of different examples of the things we want it to be able to detect, until it understands it in a general sense, at which time we can give it data that it has never seen before, and it can tell us what's in that image.

Alasdair: Just on that whole thing around image recognition, Richard, it's becoming more and more prevalent now, but it's something that I came across years ago as a research exercise. I just wondered whether there was anything in particular that had driven the increasing use of it.

Richard: Yes. I think, as with many things in technology, processing power is driving a lot of these things. You've got twofold: you've got so much more processing power than you would have had at your fingertips back then, and then at the same time the techniques and algorithms are getting more advanced to harness that power.

Essentially, for something to be widely adopted, it has to end up fairly transparent, so that anybody can use it. I think that's where we're getting to now. So, I think that's the main reason.

Alasdair: Just while you're helping demystify things for us, I, kind of, thought I'd got my head around what machine learning was, and then I came across something the other day that was talking about deep learning. It seemed to be being described as something different to machine learning, or a subset of machine learning. At that point, I went, “Oh, no, another term. (Laughter) I thought I understood what this stuff was about.”

Richard: Yes. No, it's a minefield. (Laughter) If we think back to machine learning being a subset of artificial intelligence, deep learning is a subset of machine learning, so it's another level down from there. It's essentially designed to overcome some of the limitations that we face with traditional machine learning.

If we think back to our image recognition example again, let's say we have taught the machine how to identify a bicycle, right? We can give it an image and it'll say, ‘Yes, it's a bike,’ ‘No, it's not.’

What machine learning would struggle to do is, if we give it an image of just a whole city scene, with buildings, and people, and cars and bikes, it won't really be able to deal with that, because there's too much information going on. So, that's where we might use something like deep learning to not only be able to recognise a bike but also pick it out from a scene of other things and tell us, at the same time, all the other things that are there.

The way it does that is it uses an algorithm which, kind of, mimics the way that a human brain works. A human brain is a massive collection of neurons that all work together to become bigger than the sum of their parts, essentially. That’s how deep learning works. It creates what's called an ‘artificial neural network’, to solve more complex problems.

Alasdair: That concept of a neural network, that’s something else that I've come across before, and that is about that idea that you just described, of having a group of machines all working collectively as a linked system rather than as either an isolated or as a linear system. Is that roughly what it is?

Richard: Yes, not necessarily a collection of machines. That can all take place on the same machine. It's a collection of neurons that are all, by themselves, doing quite a simple thing, but interacting with each other to come to a much more complex decision-making system, if you like. (Laughter)
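
To make the idea of simple neurons combining into something more capable concrete, here is a minimal Python sketch. It is an illustration only: the weights are hand-picked rather than learned through training, and the names `neuron` and `tiny_network` are invented for this example. Each neuron just adds up its weighted inputs and fires if the total is above zero, yet three of them wired together compute "one or the other, but not both", which a single neuron of this kind cannot do.

```python
def neuron(inputs, weights, bias):
    """One artificial neuron: a weighted sum followed by a simple on/off threshold."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total > 0 else 0


def tiny_network(a, b):
    """Three simple neurons that together compute 'a or b, but not both' (XOR)."""
    # Hidden layer: each neuron answers one very simple question.
    either_on = neuron([a, b], [1.0, 1.0], -0.5)      # fires if at least one input is on
    not_both_on = neuron([a, b], [-1.0, -1.0], 1.5)   # fires unless both inputs are on
    # Output neuron: fires only when both hidden neurons fire.
    return neuron([either_on, not_both_on], [1.0, 1.0], -1.5)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", tiny_network(a, b))
```

In a real deep learning model the weights and biases are not written by hand. They are adjusted automatically during training, and there are far more neurons, which is part of why it is hard to say what any individual neuron is doing.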

Alasdair: For me, that kind of idea, I have a sense that the interaction creates something new, but I also find it quite an intangible thing to get, because once you start trying to break it down you've almost lost the thing that you're trying to understand.

Richard: Yes, absolutely.

Alasdair: Our reductionist approach to understanding things doesn't really work.

Richard: No, it doesn't. To me, that's an interesting point because, even when you're working with neural networks and you're training them up, you don't really know what's going on in that process. You're training it. You're testing the answers it gives you, to make sure that they're getting better and better as you go along, but nobody could actually tell you what's going on in any individual part of that, any individual neuron, which is an interesting point in itself because we're ultimately responsible for these tools, without truly understanding what's going on inside them, which is interesting to me.

Beth: Maybe let's bring it back to something that we all understand, and bring it back to GIS. How is it actually affecting GIS and the work that people are doing at the moment?

Richard: That's a great question. I think it's already starting to quite profoundly impact GIS. There's a pretty good chance that you've already used machine learning in GIS without realising it. Increasingly, lots of our tools are built upon machine learning, to make them more efficient and to give better answers. So, yes, you may have already used it without really knowing it, but at the same time you can absolutely use it more explicitly in GIS.

I think one of the most common things we're seeing is getting extra information or creating data products out of imagery, whether it's satellite imagery or aerial imagery. A couple of examples might be generating building polygons just from imagery, or generating land-use rasters, again just from plain imagery.

Beth: I've done some land classification in the past, so it sounds like I'm already an expert on machine learning, so this is great. I can take that forward. (Laughter) For our listeners out there, what level of user do you actually need to be to get started on this? Do you need to be a developer? Can you just be a GIS beginner? Where do you need to sit to get the most out of this?

Richard: Yes, I think we're at the point where you can make it as simple or as complex as you want or need to. As I said, there are things built into tools already that anybody can use without having to understand it. That’s great. For me, that's the point that we want to get to, where it's usable by anyone, but we're not quite there yet.

You can also take one step up on that. If you were to look in the Living Atlas, you would find a collection of pre-trained deep learning models. For these, you wouldn't really need to know that much about machine learning at all, because they're pre-trained. They're already built to do a job.

You essentially bring them into something like ArcGIS Pro or our Python API, using Notebooks or something, and they will just function. One example is land classification, so you give it some imagery and it will give you land classification out the other end. So, there are options to get started, with really very little knowledge at all.

You could also take one step up and have a look at ‘arcgis.learn’, which is a module in our Python API. That's, kind of, a helper module to help you train your own models, so at that point you would need to know more about the process and how it all works.

You could also start from scratch and build something completely from nothing that's custom, at which point you would need to be, essentially, an expert. I think there's something for everyone, which is a really nice position to be at.

Alasdair: I guess just on that last bit, Richard – and you touched on being able to access pre-trained models – just in fairly simple terms, what are the processes in putting together a machine learning model?

You mentioned a little bit about developing the model, and the need to train it, and then be able to use it and interpret results. Maybe you could just walk us through in broad terms what those steps are in terms of if you were to start from scratch, either because that's what people are looking to do, but I think also it's quite nice to understand what somebody has already done for you if you’re pulling in one of these pre-trained models.

Richard: Yes, definitely. If we were to take extracting building polygons as an example, what you would look to do is generate a training dataset, which would be lots of examples of different buildings across the world if you want to do it globally. You could also focus on a particular area. You might get more accuracy that way, but, either way, you would feed the model with lots of isolated images of buildings.

That's how it learns what a building looks like, at which point you regularly test it against another dataset so you can confirm how accurate it is and gradually increase its accuracy that way, until you reach the best kind of settings, if you like, for that model, at which point it's good to go. You can either use it in isolation or you could bring it into one of our other products to use it as part of a bigger process if you wanted to.
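
The train-then-test cycle described above can be sketched in a few lines of Python. This is a deliberately tiny stand-in, not how building extraction is actually done: the "model" is a single brightness threshold rather than a neural network, and the numbers, labels and function names are all made up for illustration. What it does show is the shape of the process: learn from one set of labelled examples, check against a separate set after every pass, and keep the settings that scored best on the data the model was not trained on.

```python
def predict(brightness, threshold):
    """Toy model: call a patch a building (1) if it is brighter than the threshold."""
    return 1 if brightness > threshold else 0


def accuracy(examples, threshold):
    """Fraction of labelled examples the model gets right."""
    correct = sum(1 for x, label in examples if predict(x, threshold) == label)
    return correct / len(examples)


def train(training_set, validation_set, epochs=4, step=20):
    """Nudge the threshold after each mistake; keep the best validation settings."""
    threshold = 0
    best_threshold = threshold
    best_score = accuracy(validation_set, threshold)
    history = []
    for _ in range(epochs):
        for x, label in training_set:
            guess = predict(x, threshold)
            if guess == 1 and label == 0:
                threshold += step   # false alarm: be stricter
            elif guess == 0 and label == 1:
                threshold -= step   # missed a building: be more lenient
        score = accuracy(validation_set, threshold)  # test on data not used for training
        history.append(score)
        if score > best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, history


# Hypothetical (brightness, is_building) examples, purely for illustration.
training_set = [(40, 0), (60, 0), (90, 0), (150, 1), (180, 1), (210, 1)]
validation_set = [(50, 0), (80, 0), (160, 1), (200, 1)]

best, history = train(training_set, validation_set)
print(best, history)  # 80 [0.75, 1.0, 1.0, 1.0]
```

The validation score rises from 0.75 to 1.0 as training proceeds, and the threshold that first reached the best score is the one that gets kept.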

Alasdair: That training process, is that something that typically involves a similar amount of effort for each model, or is it always going to depend on a model-by-model basis as to how much training something needs?

Richard: It will depend to an extent, but lots of applications of machine learning are based upon similar types of machine learning – for example, object detection – at which point there is definitely similarity between different use cases. But it will vary, depending on the complexity of what you're trying to achieve.

Alasdair: Cool. I guess one of the other things that you touched on there that I think is quite interesting and is quite relevant in terms of its use in GIS was you were talking about, maybe, you need to consider which area in the world you're basing your training examples on, because if…

I think this is true for some of the Living Atlas models, that they say in the description, ‘This is for North America. This is for somewhere like Africa,’ and that you need to be conscious of how different things can look in different places. If it's something like a building, our buildings look different to buildings in the US, as well, so maybe a model trained in the US would perform less well in the UK. Is that the kind of background to that distinction?

Richard: Yes. I guess the thing to remember is that it's easy to be fooled into thinking the machine is actually intelligent. (Laughter) It's not. It's, when you're starting off, it knows nothing at all, so everything that it knows is because you've shown it and you've taught it that. So, there's definitely the element, as you say, of houses looking different across the world.

You also have to be careful with things like, I think, there's a famous example where someone trained an object-detection model to try and detect tanks from satellite imagery, or different types of tanks. One they did in winter time, when it was snowing. The other one they did in summer, so essentially what they taught the model to do was tell whether it was snowing or not. I guess that's another example of how the machine isn't clever in reality. So, when you're building models, you have to be very mindful of what you're showing it and what that could mean.

Again, there are examples of this actually having ethical implications, as well, where machine-learning models have been trained with a certain unconscious bias built into them. That's again something we need to be really conscious of when we're building these tools.
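
The tank story can be reproduced in miniature. The Python sketch below is a toy under stated assumptions: the images are reduced to two invented yes/no features (`snowy_background` and `long_barrel`), and the "learner" simply picks whichever feature agrees with the labels most often in training. Because every training photo of one tank type happens to be snowy, the learner latches onto the snow, and it falls apart as soon as the season changes.

```python
def pick_feature(examples):
    """Choose the feature whose value matches the label most often in training."""
    names = list(examples[0][0])

    def agreement(name):
        return sum(1 for features, label in examples if features[name] == label)

    return max(names, key=agreement)


def accuracy(examples, feature):
    """Fraction of examples where the chosen feature alone predicts the label."""
    correct = sum(1 for features, label in examples if features[feature] == label)
    return correct / len(examples)


# Hypothetical training photos: type A (label 1) all shot in winter,
# type B (label 0) all shot in summer. One barrel is hidden from view.
training = [
    ({"snowy_background": 1, "long_barrel": 1}, 1),
    ({"snowy_background": 1, "long_barrel": 1}, 1),
    ({"snowy_background": 1, "long_barrel": 0}, 1),
    ({"snowy_background": 0, "long_barrel": 0}, 0),
    ({"snowy_background": 0, "long_barrel": 0}, 0),
    ({"snowy_background": 0, "long_barrel": 0}, 0),
]

# New photos where the seasons are swapped.
new_photos = [
    ({"snowy_background": 0, "long_barrel": 1}, 1),  # type A in summer
    ({"snowy_background": 1, "long_barrel": 0}, 0),  # type B in winter
]

chosen = pick_feature(training)
print(chosen)                        # snowy_background
print(accuracy(training, chosen))    # 1.0
print(accuracy(new_photos, chosen))  # 0.0
```

The model looks perfect on the data it was shown and is wrong on every new photo, which is why the choice of training examples matters so much.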

Alasdair: I guess just to think about examples of that, I came across a piece that Ordnance Survey had shared about some work that they're doing to map, I think it's Lusaka, a city in Africa.

One of the focuses for that is on, essentially, shantytowns, so I think informal dwellings. If you imagine something like a shantytown, that becomes quite a different pattern of building. Again, I guess that's something that you've then got to be mindful of when, if you're looking to use it in those kinds of examples, you've got to be aware of that as part of the training.

Bringing it back to some of the opportunities and benefits that machine learning brings, I guess that's maybe quite a good example in terms of being able to extract information from imagery, which is relatively easy to obtain, versus traditional mapping methods. That's a very time-intensive process, whereas extracting information from imagery allows you to start mapping areas that have, maybe, not had that opportunity to be mapped in the same level of detail.

Beth: I think that's a really good point, actually, because recently there has been a new addition to the Living Atlas where they've put in a global land-cover model which is more detailed than any that we've released in the past. That’s all based on machine learning. Previously, that would have been so much more difficult to do because a) we wouldn't have had the imagery available, and b) we just didn't have the capacity to create a global dataset in that way.

Just going back to your point about making sure it's correct for the right regions, they've used different bioregions, as they call them, to make sure that they're not identifying things incorrectly in different areas. So, I can definitely see there's a lot of potential use cases for this.

Also, just normal GIS users that aren't interested in trying out machine learning for themselves are already benefiting from it, which is really good. You can just go and use these services that people are creating for you, so that's really great.

Alasdair: Yes. I think if people are looking to actually have a go at making use of this – maybe doing some feature extraction from some imagery or something – in terms of the approach within ArcGIS and the technology, where is a good place to start? Do you have to go to Python, or can you do it through something like ArcGIS Pro?

Richard: Yes, it's pretty flexible, really. I think a great place to start is the pre-trained models that we mentioned earlier. The good thing about those is you can run them in ArcGIS Pro. There's a GP tool in there, and one of its parameters is one of these deep-learning models. Once you've input that, it behaves just like a normal GP tool. You give it the datasets and a few other bits of configuration, and it goes off and works just like any other GP tool in Pro.

You can also use them within the context of Python if you want. That might help you to integrate it with other systems if that's what you want to do, or perhaps have a little bit more control, at which point you could use ArcGIS Notebooks, which is a really great thing to use Python in, particularly if you're fairly new to Python, because it's very visual and helps you understand the code, step by step.

Of course, you can also just use it in pure Python, but I would say the two most popular options, broadly, would be to run it in our desktop tools or in Python.

Alasdair: In terms of, I suppose, behind the scenes, if you're hooking into it through something like Python, are you then starting to use different technology for the actual machine-learning bit, or is it the same technology that you would be using if you did it through ArcGIS Pro? Are they using some additional libraries? Is there anything else that you need to have set up before you get started?

Richard: Yes, so I guess one thing to understand is that the actual machine learning part of these tools is not done in ArcGIS technology. There are a few key machine learning frameworks out there, one of which being TensorFlow, which some of our tools are built on.

In essence, we're bringing the machine learning into the context of GIS and integrating with it, making it work seamlessly in that environment, but ultimately the machine learning is done in something like TensorFlow. So, if you were to do it in Python, you would need to install a few extra bits and pieces to do that, but at the same time that's pretty cool because it means we can bring industry-standard tools into GIS.

We've talked a lot about imagery and object detection, but there are also other areas where we can be inspired by wider industry – for example, natural language processing. If you think about the voice assistant on your phone or on your smart speaker, that's using deep learning, too. It's quite remarkable when you think about it. It not only is interpreting sounds into language, it then has to interpret your intent with that and then turn it into discrete actions.

In many ways, I think that has inspired natural language processing in GIS, where we can set a machine to look through lots of text data, for example, and pull out spatial information from that, whether it's addresses, postcodes, coordinates, or anything else. I think the fact that we're using industry-standard tools almost gives us a hook into things that we might not have thought about before.
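
As a rough picture of what pulling spatial information out of text means, here is a small Python sketch. It swaps in a much simpler technique than the one discussed: plain regular expressions stand in for a trained language model, so it only finds text that fits a fixed pattern and understands nothing about context. The patterns are simplified assumptions (the postcode pattern is not a full validator), and the sample sentence is invented.

```python
import re

# Simplified UK postcode shape, e.g. "SW1A 1AA". Not a complete validator.
POSTCODE = re.compile(r"\b[A-Z]{1,2}\d[A-Z\d]?\s?\d[A-Z]{2}\b")

# Decimal "latitude, longitude" pairs, e.g. "55.9486, -3.1999".
COORDINATE = re.compile(r"(-?\d{1,2}\.\d+)\s*,\s*(-?\d{1,3}\.\d+)")


def extract_locations(text):
    """Return the postcodes and coordinate pairs found in free text."""
    postcodes = POSTCODE.findall(text)
    coordinates = [(float(lat), float(lon)) for lat, lon in COORDINATE.findall(text)]
    return {"postcodes": postcodes, "coordinates": coordinates}


sample = (
    "We stayed near EH1 2NG, then walked up to 55.9486, -3.1999 "
    "before posting a card to SW1A 1AA."
)
print(extract_locations(sample))
```

A learned model earns its keep on the cases a pattern cannot catch, such as an address written out in words or a place mentioned only by name.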

Alasdair: The ability to map conversations is going to take away my need to keep looking on a map to see where people are talking about when they're describing their holidays and saying where they've been. (Laughter)

Richard: Exactly.

Alasdair: We’ll just naturally do it as the conversation flows, and little dots will appear on the map.

Beth: Okay, so if I wanted to go and get started right now, completely at the basic level, where can I go to find some resources? Where can I learn? What's my first step?

Richard: Yes, so there are all sorts of resources out there, increasingly so. We're bringing out more and more. There are things like Esri Academy and Learn GIS, where there are some specific resources there.

If you're taking the Python route into it, there are also some really great sample Notebooks on our developer’s website that really take you step by step through the process of doing these things. The good thing about those, again, because they're in Notebooks, you get text explaining it, embedded code, and everything right there that you need to understand it.

Then, beyond that, I personally look to some of the demos that either we or Esri Inc do at conferences, where I use those for inspiration. Then usually you can get the samples available afterwards, so that's another good place to go and see the cutting edge of where we're going.

Alasdair: You mentioned that things keep getting added. A fairly new example has gone onto Learn GIS just recently. It was about street signs. I think one of the things that I thought was quite neat about that was it was linking it with Survey123. The idea is that you can go out, use Survey123 to capture photos of the street signs, and it will then use machine learning to extract the information from the sign, I think is what it's doing.

I guess the key thing with it being GIS is about collecting that information from the image but also adding that spatial element, as well, and an example of asset extraction or asset mapping, I suppose, as a slightly different take on using machine learning.

There was a blog a while back, came out, I think it was the Bavarian Highways Agency had used it. They were monitoring road surfaces and identifying cracks. That's a different take on it, as well. It's not even asset mapping. It's, I suppose, asset failure mapping, (Laughter) but being done using machine learning, an interesting topic.

Richard: Yes. When you think that anything that has a location associated to it, and perhaps a camera available, you can do those sorts of things. Whether it's monitoring rail corridors by just having a simple camera on the front of every train, and a GPS receiver, all those kinds of things are possible.

Also, even if it's a static camera, we're starting to be able to map locations within the field of view. So, if you have a wide-angle camera that's looking into the distance, you can start to figure out where in that field of view it is spatially. There are all sorts of avenues that are cropping up that are going to be really promising for future use cases.

Alasdair: I guess the more things are automated, the more something like putting a camera on the front of a train, that becomes useful because you could automate the information extraction so that, instead of just generating lots of video that you'd have to find the time or the person to sit and monitor it, it becomes something that you can actually start extracting information from it automatically and then making use of that capability in a different way.

Richard: Yes, and you wouldn't even need to store the video. That could just be a stream going by and being deleted straightaway. So, you're saving a ton of data storage and having to push data around in real time because you just get the information that, “There's a tree that's overhanging here that wasn't yesterday. We need to send someone out to have a look at it,” kind of thing.
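
The idea of keeping only the information and letting the video go by can be sketched with a Python generator. Everything here is hypothetical: `detect` stands in for whatever trained model would inspect a frame, locations are simple distances along the track, and the frames are placeholder strings rather than images. The point is the structure, where each frame is looked at once, compared with what was already known, and then dropped.

```python
def new_hazards(frames, detect, known_hazards):
    """Yield (location, label) for detections that were not already known.

    Frames are consumed one at a time and never stored.
    """
    seen = set(known_hazards)
    for location, frame in frames:
        for label in detect(frame):
            key = (location, label)
            if key not in seen:
                seen.add(key)
                yield key


def fake_detect(frame):
    """Stand-in for a trained model: 'detects' a tree if the text mentions one."""
    return ["overhanging tree"] if "tree" in frame else []


# Hypothetical stream of (distance along track in km, frame) pairs.
stream = iter([
    (10.0, "clear track"),
    (10.5, "tree over line"),
    (11.0, "tree over line"),
])
known_yesterday = {(11.0, "overhanging tree")}

for location, label in new_hazards(stream, fake_detect, known_yesterday):
    print(f"New {label} at {location} km")  # New overhanging tree at 10.5 km
```

Only the one new report comes out the other end. The tree at 11.0 km was already known, so nothing is raised for it, and none of the frames are kept.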

Beth: It sounds like there are so many different potential use cases for this. They’re just going through my mind, from police forces using it, to transport, as you talk about, looking after roads and everything. So, I think this would be a really interesting topic to come back to in two years’ time, see how much it has changed, and what people are now doing, and what we're looking to, to the future, as well.

On that note, thanks to everyone for listening. If you'd like to know more about any of the things we've talked about, please, visit our website at ‘esriuk.com’, where you'll find our product pages and our Tech Blog. Or email us at ‘podcast@esriuk.com’. If you've enjoyed this, please, give us a like or stars on your podcast platform. We hope you'll join us again for another Spatial Jam.

Voiceover: The views of the presenters may differ from those of Esri UK.

Beth: Then no squeaky chairs this time, because Sam is not here.

Richard: That's the mic drop.

END AUDIO
