Geo-enabled (or location-enabled) applications are very common nowadays and many of them use MySQL spatial functions. The common tasks for such applications are:

- Find all points of interests (i.e. coffee shops) around (i.e. a 10-mile radius) the given location (latitude and longitude). For example, we want to show this to a user of the mobile application when we know his/her approximate location. (This usually means we need to calculate a distance between 2 points on Earth).
- Find a ZIP code (U.S. Postal address) for the given location or determine if this location is within the given area. Another example is to find a school district for the given property.

MySQL spatial functions were originally included (implementation follows a subset of OpenGIS standard). However, there are 2 major limitations of MySQL spatial functions that can make it difficult to use those functions in geo-enabled applications:

The “distance” function was not implemented before MySQL 5.6. In addition (even in MySQL 5.6), all calculations (e.g. distance between 2 points) are done using a planar coordinate system (Euclidean geometry). For the distance between 2 points on Earth, this can produce incorrect results.**Distance between 2 points.**. Before MySQL 5.6 the functions that test the spatial relationships between 2 geometries (i.e. find if the given point is within a polygon) only used a Minimum Bounding Rectangle (MBR). This is a major limitation for example #2 above (I will explain it below).**Determine if the point is inside a polygon**

In my old presentation for the 2006 MySQL User Conference I showed how to calculate distances on Earth in MySQL without using the MySQL spatial functions. In short, one can store the latitude and longitude coordinates directly in MySQL fields (decimal) and use a haversine formula to calculate distance.

## New MySQL Spatial Functions in 5.6

*The good news is:*

1) MySQL 5.6 adds a set of new functions (some of them are not 100% documented though) that use the object shapes rather than the MBR to calculate spatial relationships. Those new functions begin with “ST_”, i.e.

- contains(g1, g2) uses MBR only (not exact!)
- st_contains(g1, g2) uses exact shapes

2) MySQL 5.6 implements st_distance(g1, g2) function that calculates the distance between 2 geometries, which is currently not documented (I’ve filed the feature request to document the st_distance function in MySQL)

*The bad news is:*

1) All functions still only use the planar system coordinates. Different SRIDs are not supported.

2) Spatial indexes (RTREE) are only supported for MyISAM tables. One can use the functions for InnoDB tables, but it will not use spatial keys.

**Example of MySQL’s MBR “false positives”**

To illustrate why we do not want to use MBR-based functions for geospatial search, I’ve generated 2 polygons that represent 2 zip code boundaries in San Francisco, CA and placed it on Google Maps.

The blue rectangle represents the Minimum Bounding Rectangle of Zip code “91102” (I’ve used envelope() mysql function to obtain coordinates for the MBR). As we can see it covers both zip code 94103 * and* 94102. In this case if we have coordinates of a building in the city’s “south of market” district (ZIP 91103) and try to find a zip code it belongs to using the “contains()” function we will have a “false positives”:

1 2 3 4 5 6 7 8 9 |
mysql> select zip from postalcodes where contains(geom, point(-122.409153, 37.77765)); +-------+ | zip | +-------+ | 94102 | | 94103 | | 94158 | +-------+ 3 rows in set (0.00 sec) |

In this particular example we got 3 zip codes as the MBR of 94158 also overlaps this area. Another point in “south of market” can actually produce 4 different zip codes. However, in MySQL 5.6 we can use the new st_contains function:

1 2 3 4 5 6 7 |
mysql> select zip from postalcodes where st_contains(geom, point(-122.409153, 37.77765)); +-------+ | zip | +-------+ | 94103 | +-------+ 1 row in set (0.00 sec) |

As we can see st_contains() produces the correct results.

**Find a ZIP code for the given location**

Starting with MySQL 5.6 one can use the MySQL spatial functions st_contains or st_within to find if the given point is inside the given polygon. In our scenario, we will need to find the zip code for the given latitude and longitude. To do that in MySQL we can perform the following steps:

- Load the zip code boundaries into MySQL as a multipoligon. There are a number of ways to get this done, one way is to download the shape files from the Census website and convert them to MySQL using org2org utility. (I will describe this in more detail in upcoming blog posts). The data will be stored as a MySQL Geometry object, to convert it to text we can use astext(geom) function.
- Use the st_contains() or st_within() functions:

1234567mysql> select zip from postalcodes where st_contains(geom, point(-122.409153, 37.77765));+-------+| zip |+-------+| 94103 |+-------+1 row in set (0.00 sec)

or

1234567mysql> select zip from postalcodes where st_within(point(-122.409153, 37.77765), geom);+-------+| zip |+-------+| 94103 |+-------+1 row in set (0.00 sec)

*Spatial Index for “ST_” functions*

MyISAM tables support Spatial indexes, so the above queries will use those indexes. Example:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
mysql> alter table postalcodes add spatial index zip_boundaries_spatial (geom); Query OK, 35679 rows affected (5.30 sec) Records: 35679 Duplicates: 0 Warnings: 0 mysql> explain select zip from postalcodes where st_contains(geom, point(-122.409153, 37.77765))G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: postalcodes type: range possible_keys: zip_boundaries_spatial key: zip_boundaries_spatial key_len: 34 ref: NULL rows: 1 Extra: Using where 1 row in set (0.01 sec) |

As we can see our spatial index is used for those functions. If we ignore or remove the index, the query will run significantly slower:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
mysql> select zip from postalcodes where st_within(point(-122.409153, 37.77765), geom); +-------+ | zip | +-------+ | 94103 | +-------+ 1 row in set (0.00 sec) mysql> select zip from postalcodes ignore index (zip_boundaries_spatial) where st_contains(geom, point(-122.409153, 37.77765)); +-------+ | zip | +-------+ | 94103 | +-------+ 1 row in set (4.24 sec) |

The InnoDB engine does not support spatial indexes, so those queries will be slow. As zip boundaries do not change often we can potentially use MyISAM tables for them.

*Find all coffee shops in a 10-mile radius*

MySQL 5.6 supports st_distance functions with 2 drawbacks:

- It only supports planar coordinates
- It does not use index

Given those major limitations, it is not very easy to use st_distance function for the geo enabled applications. If we simply need to find a distance between 2 points it is easier to store lat, lon directly and use harvesine expression (as described above).

However, it is still possible to use the st_distance() if we do not need exact numbers for the distance between 2 points (i.e. we only need to sort by distance). In our example, to find all the coffee shops we will need to:

- Get the 10 mile radius MBR and use “within()” or “st_within()” function
- Use st_distance function in the order by clause

First, we will calculate an envelope (square) to include approximately 10 miles, using the following approximations:

- 1 degree of latitude ~= 69 miles
- 1 degree of longitude ~= cos(latitude)*69 miles

1 2 3 4 5 6 7 |
set @lat= 37.615223; set @lon = -122.389979; set @dist = 10; set @rlon1 = @lon-@dist/abs(cos(radians(@lat))*69); set @rlon2 = @lon+@dist/abs(cos(radians(@lat))*69); set @rlat1 = @lat-(@dist/69); set @rlat2 = @lat+(@dist/69); |

*@lat and @lon in this example are the coordinates for the San Francisco International Airport (SFO).*

This will give us a set of coordinates (points) for the lower left and upper right corner of our square. Then we can use a MySQL’s envelope function to generate the MBR (we use linestring to draw a line between the 2 generated points and then envelope to draw a square):

1 |
select astext(envelope(linestring(point(@rlon1, @rlat1), point(@rlon2, @rlat2)))); |

The “envelope” will look like this:

This is not exactly a 10-mile radius, however, it may be close enough. Now we can find all points around SFO airport and sort by distance.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 |
mysql> select astext(shape), name from waypoints where st_within(shape, envelope(linestring(point(@rlon1, @rlat1), point(@rlon2, @rlat2)))) order by st_distance(point(@lon, @lat), shape) limit 10; +--------------------------------+-------------------------------+ | astext(shape) | name | +--------------------------------+-------------------------------+ | POINT(-122.3890954 37.6145378) | Tram stop:Terminal A | | POINT(-122.3899 37.6165902) | Tram stop:Terminal G | | POINT(-122.3883973 37.6150806) | Fast Food Restaurant | | POINT(-122.388929 37.6164584) | Restaurant:Ebisu | | POINT(-122.3885347 37.6138365) | Fast Food Restaurant:Firewood | | POINT(-122.38893 37.6132399) | Cafe:Amoura Café | | POINT(-122.3894594 37.6129537) | Currency exchange | | POINT(-122.39197849 37.614026) | Parking:Garage A | | POINT(-122.3919031 37.6138567) | Tram stop:Garage A | | POINT(-122.389176 37.612886) | Public telephone | +--------------------------------+-------------------------------+ 10 rows in set (0.02 sec) mysql> explain select astext(shape), name from waypoints where st_within(shape, envelope(linestring(point(@rlon1, @rlat1), point(@rlon2, @rlat2)))) order by st_distance(point(@lon, @lat), shape) limit 10G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: waypoints type: range |