Oct 14 2011

I was skimming through my Twitter stream this morning and came across a tweet from the intrepid Michael Gould (@michael_d_gould) mentioning David J. Unwin’s digital workbook “Numbers aren’t nasty: a workbook of spatial concepts”.  I’m a big fan of David Unwin’s Geographic Information Analysis (co-authored with David O’Sullivan), so I downloaded the workbook (it’s free) and the accompanying data sets.  I was intrigued to see that these included coordinate data for John Snow’s map of cholera deaths.  Virtually every GIS student learns about the pioneering epidemiological work John Snow did using spatial analysis of cholera deaths, tracing them to the infamous Broad Street pump.  I thought I would be clever and quickly map them using Google Fusion Tables.  I soon realized that the coordinates had been created in an arbitrary system that placed the points somewhere in Africa.  As is often the case, I needed to slow down, take a closer look at the data and what I was doing, and see what was going on.

First, I did some quick online searching, and was surprised that I wasn’t able to find a georeferenced version of the data.  So I went back to the data at hand.  In his workbook, Unwin states that the points were originally “digitized at the request of Professor Waldo Tobler (UCSB) by Rusty Dodson of the US National Center for Geographic Information Analysis from a reprint of Snow’s book On Cholera (Oxford University Press, London)”.  Since the original data used an arbitrary coordinate system, I used ArcGIS 10 to georeference an image of the map against the Bing Maps hybrid base map, and then spatially adjusted the points (both deaths and pump locations) to match the image.  I then used the ArcGIS Online topographic base map to create the following figure:

Locations of water pumps and cholera deaths from John Snow's map (the Broad Street pump is the blue symbol at the center of the map)


As this is based on a sketch map scanned from a book, all locations should be treated as approximate.
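
For readers who want to try the adjustment step outside ArcGIS, here is a rough sketch of the underlying idea: fit a transformation from the arbitrary digitizing coordinates to real-world coordinates using a few control points, then apply it to every point. It assumes a simple least-squares affine fit, which is not necessarily the method ArcGIS applied in my case, and the control point values are invented for illustration (they are not taken from the workbook data). The function names are my own.

```python
def solve3(m, v):
    """Solve a 3x3 linear system m * x = v by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        # Swap in the row with the largest pivot for numerical stability
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            f = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= f * a[col][c]
    # Back-substitution
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        s = a[r][3] - sum(a[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = s / a[r][r]
    return x


def fit_affine(src, dst):
    """Least-squares affine fit from src points to dst points.

    Returns ((a, b, c), (d, e, f)) such that
    X = a*x + b*y + c and Y = d*x + e*y + f.
    Needs at least three control points that are not collinear.
    """
    rows = [(x, y, 1.0) for x, y in src]
    # Normal equations: (A^T A) p = A^T t, solved once per output axis
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]

    def solve_for(targets):
        atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(3)]
        return tuple(solve3(ata, atb))

    return solve_for([p[0] for p in dst]), solve_for([p[1] for p in dst])


def apply_affine(params, pt):
    """Transform one (x, y) point with the fitted parameters."""
    (a, b, c), (d, e, f) = params
    x, y = pt
    return (a * x + b * y + c, d * x + e * y + f)


# Hypothetical control points: arbitrary digitizing units -> projected metres.
# These numbers are made up for the example.
src_ctrl = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
dst_ctrl = [(529000.0, 181000.0), (529500.0, 181000.0),
            (529000.0, 181500.0), (529500.0, 181500.0)]

params = fit_affine(src_ctrl, dst_ctrl)
adjusted = apply_affine(params, (5.0, 5.0))
```

With more than three control points, the fit averages out small digitizing errors rather than matching each control point exactly, which suits data that should be treated as approximate anyway.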

I must admit that I have sometimes neglected to mention John Snow and his work in my introductory GIS course (for shame!), so now I have some actual GIS data and a modern map to show in class.  It may also turn into a good opportunity to introduce web mapping.  After I saw Michael Gould’s original tweet, he and I discussed how we had both been meaning to find the replica Broad Street pump in London (tip: the street is now called Broadwick Street).  Naturally, Mike tweeted a link to an ArcGIS Explorer map of pump locations, which of course inspired me to put my version of the pump and death locations on there as well:

I hadn’t used ArcGIS Explorer much, and intend to incorporate it into my courses, so this was a good excuse to try it out.  Once I got it on the web, I did another search and found that someone else had already put a similar version online – oh well!  At least I learned a lot by going through this exercise, and what else was I going to do on a Friday afternoon?

I should also mention that I came across a simple but very interesting example of spatial analysis of the data done by what appears to be a student named John Mack.  His web page inspired me to start fooling around with the kernel density tool, and I came up with a quick example:

Density of cholera deaths from John Snow's map, using a 100 m kernel density function

I will be the first to say I made this as a quick example, and would not put too much faith in it.  However, I may spend more time on this later, as it might be a good data set for illustrating density analysis.
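
To give a sense of what a kernel density calculation does, here is a small sketch in plain Python. It assumes a quartic kernel with a 100 m search radius and point coordinates in metres; I am not claiming these are the exact settings or formula behind the figure above, and the sample points are invented rather than taken from the cholera data.

```python
import math


def quartic_kernel(d, radius):
    """Quartic (biweight) kernel in 2D, scaled so it integrates to 1 over the disc."""
    if d >= radius:
        return 0.0
    u = d / radius
    return 3.0 / (math.pi * radius ** 2) * (1.0 - u * u) ** 2


def density_at(x, y, points, radius=100.0):
    """Sum the kernel contributions of all points within `radius` of (x, y)."""
    return sum(
        quartic_kernel(math.hypot(x - px, y - py), radius)
        for px, py in points
    )


def density_grid(points, xmin, ymin, ncols, nrows, cell, radius=100.0):
    """Evaluate density at each cell centre; row 0 is the southernmost row."""
    return [
        [
            density_at(xmin + (c + 0.5) * cell, ymin + (r + 0.5) * cell,
                       points, radius)
            for c in range(ncols)
        ]
        for r in range(nrows)
    ]


# Hypothetical death locations in metres (made up for the example)
sample_points = [(50.0, 50.0), (60.0, 55.0), (55.0, 45.0), (180.0, 170.0)]
grid = density_grid(sample_points, xmin=0.0, ymin=0.0,
                    ncols=10, nrows=10, cell=20.0, radius=100.0)
```

Each cell value is in points per square metre. A larger radius gives a smoother surface and a smaller one a spikier surface, which is one reason not to read too much into any single density map.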

So, there you have it.  Out of one tweet I read this morning came an entire day’s activity and some data and figures I can use in one of my GIS courses.  Twitter can eat up a lot of time, but sometimes I come across little gems that can be really interesting and useful.

Update: I have created a map data layer package of the pump and death locations that can be downloaded from ArcGIS.com.

Update: I have created zipped shapefile and KML versions of the files as well, both in GCS (WGS84).  In both versions, there are three files: one file contains the single point location of the Broad Street pump, one file includes all of the pumps in the original map (including the Broad Street pump), and one file contains the cholera deaths recorded on the map.