A small problem of generalisation and a bigger problem of topology

Friday, December 16, 2016

I was recently teaching a class on introductory cartography where we were using a range of different socio-economic datasets including 2011 counties and middle super output areas (MSOA) of the UK from the UK Data Service. These are (helpfully) made available in a range of different formats including the ubiquitous shapefile. The boundaries are useful for choropleth mapping of socio-economic (census) data, for use as location maps, and for clipping other datasets when including topographic data on maps (e.g. Meridian 2).

One student wanted to generalise the polygons for the location map. Thinking this would be easy, he went ahead and ran the toolbox tool but ended up with lots of sliver polygons as a result. Crucially, as a shapefile doesn’t store topological relationships, the tool was generalising each polygon separately, so shared borders no longer matched and the output was very poor. This was exacerbated by the fact that the borders were provided pre-generalised.
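To see why generalising each polygon on its own produces slivers, here is a minimal Python sketch. It assumes a plain Douglas-Peucker simplifier (the toolbox tool may well use a different algorithm) and two made-up polygons, ring_a and ring_b, that share the border (4,0)-(5,2)-(4,4). None of the names or coordinates come from the real data.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to segment a-b (a == b is treated as a point)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, tolerance):
    """Simplify an open line or closed ring, always keeping both end vertices."""
    if len(points) < 3:
        return list(points)
    distances = [point_segment_distance(p, points[0], points[-1])
                 for p in points[1:-1]]
    worst = max(range(len(distances)), key=distances.__getitem__) + 1
    if distances[worst - 1] <= tolerance:
        return [points[0], points[-1]]
    left = douglas_peucker(points[:worst + 1], tolerance)
    right = douglas_peucker(points[worst:], tolerance)
    return left[:-1] + right  # drop the duplicated split vertex


# Hypothetical neighbours: a kite (A) and a notched rectangle (B).
ring_a = [(0, 2), (4, 0), (5, 2), (4, 4), (0, 2)]
ring_b = [(4, 0), (10, 0), (10, 4), (4, 4), (5, 2), (4, 0)]
tolerance = 1.5  # illustrative value, in the same units as the coordinates

# Non-topological approach: each ring is simplified in isolation.
simplified_a = douglas_peucker(ring_a, tolerance)
simplified_b = douglas_peucker(ring_b, tolerance)
print("A keeps (5, 2):", (5, 2) in simplified_a)
print("B keeps (5, 2):", (5, 2) in simplified_b)

# Topological approach: simplify the shared border once and reuse it in both.
shared_arc = [(4, 0), (5, 2), (4, 4)]
shared_arc_simplified = douglas_peucker(shared_arc, tolerance)
print("Shared border simplified once:", shared_arc_simplified)
```

In this toy case polygon A keeps the vertex at (5,2) while polygon B drops it, so the two simplified polygons overlap in a small triangle: a sliver. Simplifying the shared border once and using the result for both neighbours avoids this, which is essentially what building the topology first makes possible.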

The obvious solution is to use a topological version of the data - which isn’t provided. The next step is therefore to create the topology in ArcGIS before generalising it. And whilst not difficult, it is a little convoluted to achieve! I found this page particularly helpful and it provided the core of the processing (and remember, as with all computing instructions, you need to follow it to the letter!) which can be carried out in ArcCatalog. In short, the steps are:

1. Create a new geodatabase (either file or personal)
2. Create a new feature dataset within that
3. Import the shapefile into the feature dataset
4. Create new topology in the feature dataset
4a. For the topology you will need to use two rules: (a) no gaps and (b) no overlap
4b. This will throw an error where you have coastlines because (obviously) you have a gap!
5. At this point you have built the topology for the dataset and you can proceed to simplify/generalise the borders. Note that there will be multipart polygons present and if (like me) you want to delete any small islands to clean up the data for use as a location map then you will need to run the “multipart to singlepart” toolbox tool first.
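The idea behind the last step (split multipart polygons into single parts, then drop the small islands) can be sketched in plain Python. This is only an illustration of the logic, not what the toolbox tool does internally; the feature name, coordinates and area threshold are all hypothetical.

```python
def ring_area(ring):
    """Unsigned area of a closed ring (first vertex repeated last), shoelace formula."""
    twice_area = sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
    return abs(twice_area) / 2.0


def explode_and_filter(features, min_area):
    """Turn (name, [ring, ...]) multipart features into single parts,
    keeping only parts whose area is at least min_area."""
    singles = []
    for name, parts in features:
        for ring in parts:
            if ring_area(ring) >= min_area:
                singles.append((name, ring))
    return singles


# Hypothetical multipart feature: a mainland square plus one tiny island.
mainland = [(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)]
island = [(12, 0), (13, 0), (13, 1), (12, 1), (12, 0)]
features = [("Exampleshire", [mainland, island])]

kept = explode_and_filter(features, min_area=5.0)  # threshold is illustrative
print(len(kept), "part(s) kept")
```

With a real dataset you would pick the threshold by looking at the areas of the islands you want to lose, in the units of the layer’s coordinate system.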

This all proved a little more long-winded than I was expecting, but such is the price of topology! That did make me wonder if I could (easily) do this in QGIS and my initial research suggests not. Yes, the latest versions of QGIS have the Topology Checker Plugin (built-in) which checks topology (doh!) but as far as I’m aware there isn’t an open source file format that supports topology. The grown-up solution would be to use a PostGIS/PostgreSQL database but this isn’t particularly useful when you want to distribute data. If anyone knows better (or can correct me) then please do get in touch!