Creating GEO-enabled applications with MySQL 5.6

In my previous post I’ve showed some new MySQL 5.6 features which can be very helpful when creating geo-enabled applications. In this post I will show how we can obtain open-source GIS data, convert it to MySQL and use it in our GEO-enabled applications. I will also present at the upcoming Percona Live conference on this topic.

Data sources (US)

For the U.S. we may look at 2 major data sources:

1. ZIP codes with latitude, longitude and zip code boundaries (polygon). This can be downloaded from the U.S. Census website: US Zipcodes direct link

2. Point of interests, roads, boundaries, etc. The Openstreatmap website provides an excellent source of the GIS data. North American data can be downloaded here (updates frequently)

Data formats and conversion

U.S. Census data is stored in Shapefile (.shp, .shx, .dbf) format. Openstreetmap uses its own XML format (OSM)  and/or Protocolbuffer Binary Format. We can convert this to MySQL with GDAL server (on Linux) and ogr2ogr utility. To convert Shapefile any version of GDAL will work, however, for OSM/PBF we will need to use v. 1.10. The easiest way to get the GDAL 1.10 is to use Ubuntu + ubuntugis-unstable repo.

Here are the commands I’ve used to install:

This will install gdal server. Make sure it is latest version and support OSM format:

Now we can convert it to MySQL. First, make sure MySQL has the default storage engine = MyISAM (yes, GDAL will use MyISAM to be able to add a spatial index) and the max_allowed_packet is large enough:

ZIP codes and boundaries conversion

Now we can start conversion:

The only thing we need to specify is db name and user name (assuming it will write to the localhost, otherwise specify the MySQL host). ogr2org will create all needed tables.

The  geometry_columns and spatial_ref_sys are the reference tables only. All zip codes and boundaries will be stored in tl_2013_us_zcta510 table:

Example 1. Selecting zip code boundaries for a given zipcode (Durham, NC):

Example 2. Find ZIP code for the given point (Lat, Lon): Percona HQ in Durham, NC

Converting OpenStreetMap (OSM) data 

Converting OSM is the same:

Please note, that it will take a long time to convert (8-12+ hours, depends upon the hardware).


Points of interest are stored in “points” table. “Lines” and “multilinestrings” tables contain streets, hiking trails, bike paths, etc:

“Shape” is the point (in spatial format) and other_tags will contain some additional format (in JSON format), this is how ogr2ogr converts it by default. See the GDAL documentation on the OSM driver for more information.

OSM data may contain the zip code, but this is not guaranteed.  Here is the example how we can find all coffee shops in ZIP code 27701: