September 19, 2014

Creating GEO-enabled applications with MySQL 5.6

In my previous post I’ve showed some new MySQL 5.6 features which can be very helpful when creating geo-enabled applications. In this post I will show how we can obtain open-source GIS data, convert it to MySQL and use it in our GEO-enabled applications. I will also present at the upcoming Percona Live conference on this topic.

Data sources (US)

For the U.S. we may look at 2 major data sources:

1. ZIP codes with latitude, longitude and zip code boundaries (polygon). This can be downloaded from the U.S. Census website: US Zipcodes direct link

2. Point of interests, roads, boundaries, etc. The Openstreatmap website provides an excellent source of the GIS data. North American data can be downloaded here (updates frequently)

Data formats and conversion

U.S. Census data is stored in Shapefile (.shp, .shx, .dbf) format. Openstreetmap uses its own XML format (OSM)  and/or Protocolbuffer Binary Format. We can convert this to MySQL with GDAL server (on Linux) and ogr2ogr utility. To convert Shapefile any version of GDAL will work, however, for OSM/PBF we will need to use v. 1.10. The easiest way to get the GDAL 1.10 is to use Ubuntu + ubuntugis-unstable repo.

Here are the commands I’ve used to install:

This will install gdal server. Make sure it is latest version and support OSM format:

Now we can convert it to MySQL. First, make sure MySQL has the default storage engine = MyISAM (yes, GDAL will use MyISAM to be able to add a spatial index) and the max_allowed_packet is large enough:

ZIP codes and boundaries conversion

Now we can start conversion:

The only thing we need to specify is db name and user name (assuming it will write to the localhost, otherwise specify the MySQL host). ogr2org will create all needed tables.

The  geometry_columns and spatial_ref_sys are the reference tables only. All zip codes and boundaries will be stored in tl_2013_us_zcta510 table:

Example 1. Selecting zip code boundaries for a given zipcode (Durham, NC):

Example 2. Find ZIP code for the given point (Lat, Lon): Percona HQ in Durham, NC

Converting OpenStreetMap (OSM) data 

Converting OSM is the same:

Please note, that it will take a long time to convert (8-12+ hours, depends upon the hardware).

Tables:

Points of interest are stored in “points” table. “Lines” and “multilinestrings” tables contain streets, hiking trails, bike paths, etc:

“Shape” is the point (in spatial format) and other_tags will contain some additional format (in JSON format), this is how ogr2ogr converts it by default. See the GDAL documentation on the OSM driver for more information.

OSM data may contain the zip code, but this is not guaranteed.  Here is the example how we can find all coffee shops in ZIP code 27701:

First, I have selected the ZIP code boundaries into MySQL variable (I could have used subquery, in MySQL 5.6 the performance will be very similar; this is a little bit outside of the current blog post topic, so I will not compare the 2 methods here).

Second I’ve used this variable to find all point which will fit into our boundaries and filter by “amenity”=>”cafe”. I have to use like ‘%..%’ here, but I’m relying on the spatial index here. Explain plan:

Conclusion

Using open source spatial data is a great way to enrich your application and add new features. You can store this data in MySQL so the application will be able to perform a join to the existing data. For example, if you store ZIP code for a user you can use OpenStreetMap data to show the appropriate content for this user. I will also provide more examples in my upcoming Talk @ Percona Live 2014 as well as share it in this blog in a future post.

I’ve also created a Public Amazon AMI: GIS-MySQL-Ubuntu – ami-ddfdf5b4. The AMI has the ZIP code and OSM data in MySQL 5.6 as well as the GDAL server installed (under /data, mounted on EBS). Please feel free to give it a try. As always I appreciate any comments/questions/thoughts/etc.

About Alexander Rubin

Alexander joined Percona in 2013. Alexander worked with MySQL since 2000 as DBA and Application Developer. Before joining Percona he was doing MySQL consulting as a principal consultant for over 7 years (started with MySQL AB in 2006, then Sun Microsystems and then Oracle). He helped many customers design large, scalable and highly available MySQL systems and optimize MySQL performance. Alexander also helped customers design Big Data stores with Apache Hadoop and related technologies.

Comments

  1. Val Vinder says:

    Thanks Alex,
    Awesome writeup, followed your steps and created our own map server in a matter of min. Works like a charm.

  2. Val, thank you!

  3. hartmut says:

    There’s also ready-to-use OpenStreetMap data (using the osm2pgsql default schema) for almost all european countries at http://www.php-groupies.de/gis-data/

    More background information on these files http://www.skysql.com/blogs/hartmut/real-world-gis-test-data

Speak Your Mind

*