This post was originally published in 2020 and has been updated in 2023.
Mastering Geographical Information Systems, better known simply as GIS, can be considered in some ways as a rite of passage. The complexities and challenges involved in learning, which are ostensibly non-IT concepts, are steep. However, as they say, “There’s more than one way to skin a cat.” I’d like to share with you one way to tackle this challenge.
Let me introduce you to PostGIS.
PostGIS is a PostgreSQL extension that adds GIS capabilities to this RDBMS. Its popularity stems not only from being “free” but because it’s considered to be among the leading GIS implementations in the world today. Virtually every major front-end application provides the hooks for a PostGIS, PostgreSQL-enabled back-end.
The PostGIS project, which is BSD licensed, began back in 2001. It turns our vanilla-flavored postgres into a spatial database and includes spatial datatypes (geometry, geography), spatial indexes (r-tree, quad-tree, kd-tree), and spatial functions.
Try Now: Get Percona Support for PostgreSQL
Working with GIS normally requires several layers of technology of Geo-Spatial Software, for example:
Boundless Server (formerly the OpenGeo Suite)
Note: Boundless Server GitHub repository
PostGIS offers a solution to provide geospatial data capabilities in PostgreSQL for indexing, storing, and querying geographical data.
Spatial indexing accelerates the querying and retrieval of geospatial data in PostgreSQL, improving response times for applications and analytics. It uses various spatial indexing techniques, such as R-tree and GiST (Generalized Search Tree), to organize data based on their spatial relationships, allowing the database to quickly narrow down the search space.
PostGIS offers a wide array of geospatial functions for PostgreSQL, allowing users to perform complex spatial operations such as distance and area computations, geometric alterations, and buffering. Optimized spatial queries, like finding nearby points or intersecting polygons, use these functions to retrieve geospatial data efficiently.
PostGIS supports spatial data types like Point, LineString, and Polygon in 2D and 3D. These allow efficient spatial data storage and retrieval due to their optimized indexing mechanisms. They enable spatial indexing, enabling faster query processing by organizing data based on geometric properties.
It’s amazing the kinds of answers one can get by a single, well-composed, and yet standard query by asking such simple questions as:
Here are a few example queries using a PostGIS-powered database. Refer here for more community-related shapefiles for Kamloops British Columbia:
Learn More: Find out how to improve PostgreSQL query performance insights with pg_stat_monitor
|
1 |
--<br>-- EXAMPLE 1: What is the total length of all roads in British Columbia?<br>--<br>SELECT sum(ST_Length(the_geom))/1000 AS km_roads FROM bc_roads;<br><br>km_roads<br>------------------<br>70842.1243039643 |
|
1 |
--<br>-- EXAMPLE 2: How large is the city of Prince George, British Columbia?<br>--<br>SELECT ST_Area(the_geom)/10000 AS hectares<br>FROM bc_municipality<br>WHERE name = 'PRINCE GEORGE';<br><br>hectares<br>------------------<br>32657.9103824927 |
|
1 |
--<br>-- EXAMPLE 3: What is the largest municipality in the province of British Columbia?<br>--<br>SELECT name,<br> ST_Area(the_geom)/10000 AS hectares<br>FROM bc_municipality<br>ORDER BY hectares DESC<br>LIMIT 1;<br><br> name | hectares<br>---------------+-----------------<br> TUMBLER RIDGE | 155020.025561 |
|
1 |
--<br>-- EXAMPLE 4: What is the length of roads fully contained within each municipality?<br>--<br>SELECT m.name,<br> sum(ST_Length(r.the_geom))/1000 as roads_km<br>FROM bc_roads AS r,<br> bc_municipality AS m<br>WHERE ST_Contains(m.the_geom,r.the_geom)<br>GROUP BY m.name<br>ORDER BY roads_km;<br><br> name | roads_km<br>----------------------------+------------------<br> SURREY | 1539.47553551242<br> VANCOUVER | 1450.33093486576<br> LANGLEY DISTRICT | 833.793392535662<br> BURNABY | 773.769091404338<br> PRINCE GEORGE | 694.375543691 |
Learn More: How to tune PostgreSQL database parameters to optimize performance
PostGIS extends PostgreSQL’s capabilities to handle geospatial data for solutions in real-world scenarios. From mapping to visualizations to logistics, PostGIS helps users to make informed decisions based on spatial insights.
PostGIS plays a role in location-based applications by enabling spatial data storage and analysis. For instance, a grocery delivery service could utilize PostGIS to optimize delivery for certain areas. A query that finds available drivers within a certain distance, facilitated by PostGIS’s spatial indexing, would significantly enhance the platform’s responsiveness.
PostGIS utilizes geospatial data analysis by offering advanced tools for operations such as identifying the nearest points or calculating areas within polygons. The functions are optimized through spatial indexing for quicker and better queries of complex geospatial analyses, improving overall performance.
PostGIS improves map rendering and visualization through efficient geospatial data handling. It allows for quickly retrieving map data to ensure smooth and responsive displays.
Related: Watch the PostgreSQL high-performance tuning and optimization webinar sessions
The following installation instructions assume one is using PostgreSQL version 12 on Linux/CENTOS-7, although any major version of postgres and OS can be used.
|
1 |
#<br># REDHAT/CENTOS DERIVATIVES, USING COMMUNITY POSTGRES<br>#<br>yum update -y<br>yum install -y postgis30_12 postgis30_12-client |
|
1 |
#<br># CREATE THE DATACLUSTER<br>#<br>/usr/pgsql-12/bin/postgresql-12-setup initdb |
|
1 |
#<br># SERVICE STARTUP<br>#<br>systemctl start postgresql-12 |
|
1 |
#<br># INSTALLING THE POSTGIS EXTENSION<br># Execute as UNIX account postgres<br>#<br>createdb mydatabase;<br>psql -c 'create extension postgis' mydatabase |
Note: Although it is on the roadmap, the Percona Distribution for PostgreSQL does not currently include PostGIS. But you should be able to install the community PostGIS packages to it with the pertinent YUM/RPM package switches.
The following provides a more complete installation of all PostGIS capabilities. Refer to the PostGIS documentation for more information.
|
1 |
--<br>-- if you want to install raster support<br>--<br>CREATE EXTENSION postgis_raster; |
|
1 |
--<br>-- if you want to install topology support<br>--<br>CREATE EXTENSION postgis_topology; |
|
1 |
--<br>-- if you want to install sfcgal support<br>--<br>CREATE EXTENSION postgis_sfcgal; |
|
1 |
--<br>-- if you want to install tiger geocoder<br>--<br>CREATE EXTENSION fuzzystrmatch<br>CREATE EXTENSION postgis_tiger_geocoder; |
|
1 |
--<br>-- if you want to install pcre, add the address standardizer extension<br>--<br>CREATE EXTENSION address_standardizer; |
|
1 |
--<br>-- list of all installed extensions<br>--<br>mydatabase=# select * from pg_available_extensions where name ~ '^postgis';<br> name | default_version | installed_version | comment<br>--------------------------+-----------------+-------------------+---------------------------------------------------------------------<br> postgis-3 | 3.0.1 | | PostGIS geometry, geography, and raster spatial types and functions<br> postgis_raster-3 | 3.0.1 | | PostGIS raster types and functions<br> postgis_sfcgal-3 | 3.0.1 | | PostGIS SFCGAL functions<br>postgis_tiger_geocoder-3 | 3.0.1 | | PostGIS tiger geocoder and reverse geocoder<br> postgis_topology-3 | 3.0.1 | | PostGIS topology spatial types and functions<br> postgis | 3.0.1 | 3.0.1 | PostGIS geometry, geography, and raster spatial types and functions<br> postgis_raster | 3.0.1 | 3.0.1 | PostGIS raster types and functions<br> postgis_sfcgal | 3.0.1 | 3.0.1 | PostGIS SFCGAL functions<br> postgis_tiger_geocoder | 3.0.1 | 3.0.1 | PostGIS tiger geocoder and reverse geocoder<br> postgis_topology | 3.0.1 | | PostGIS topology spatial types and functions |
There are literally hundreds of terabytes available online. Ironically, the most precious data one can get is free to download because it’s been generated by governments from all over the world for the public good.
Now it’s time to get some data. The most common format is “shapefile.”
PostGIS includes these two command-line utilities:
TIGER/Line files are a digital database of geographic features, such as roads, railroads, rivers, lakes, legal boundaries, census statistical boundaries, etc., covering the entire United States and are freely available here.
Once you’ve navigated to this website, you can download a multitude of fascinating pieces of data.

|
1 |
psql -U postgres <<_eof_<br>set ON_ERROR_STOP on<br><br>drop database if exists gis_demo;<br>create database gis_demo;<br><br>c gis_demo<br><br>create extension if not exists postgis;<br>create extension if not exists postgis_topology;<br><br>create schema gis;<br><br>create or replace function gis.findzipcode(<br>inout longitude double precision,<br>inout latitude double precision,<br>out zipcode varchar<br>) as<br>$$<br>begin<br>select zcta5ce10<br>into zipcode<br>from zipcodes, (values (st_makepoint(longitude,latitude)))t(p)<br>where st_contains(geom, p);<br>end;<br>$$<br>language plpgsql<br>security definer<br>set search_path=gis,public,topology;<br>_eof_ |
The shapefile is stored in a zip file. Unzip the entire archive before attempting an upload.
URL: https://www.census.gov/cgi-bin/geo/shapefiles/index.php
|
1 |
#<br># Execute the following, the actual arguments will vary<br>#<br>shp2pgsql -c -D -I tl_2019_us_zcta510.shp gis.zipcodes | psql -U postgres gis_demo |
Once the upload into the postgres database is complete, this is what you’ll get:
|
1 |
gis_demo=# d<br>List of relations<br> Schema | Name | Type | Owner<br>----------+-------------------+----------+----------<br> public | geography_columns | view | postgres<br> public | geometry_columns | view | postgres<br> public | spatial_ref_sys | table | postgres<br> topology | layer | table | postgres<br> topology | topology | table | postgres<br> topology | topology_id_seq | sequence | postgres<br>(6 rows) |
|
1 |
gis_demo=# dn<br>List of schemas<br> Name | Owner<br>----------+----------<br> gis | postgres<br> public | postgres<br> topology | postgres<br>(3 rows) |
|
1 |
gis_demo=# d zipcodes<br>Table "gis.zipcodes"<br> Column | Type | Collation | Nullable | Default<br>------------+------------------------+-----------+----------+---------------------------------------<br> gid | integer | | not null | nextval('zipcodes_gid_seq'::regclass)<br> zcta5ce10 | character varying(5) | | |<br> geoid10 | character varying(5) | | |<br> classfp10 | character varying(2) | | |<br> mtfcc10 | character varying(5) | | |<br> funcstat10 | character varying(1) | | |<br> aland10 | double precision | | |<br> awater10 | double precision | | |<br> intptlat10 | character varying(11) | | |<br> intptlon10 | character varying(12) | | |<br> geom | geometry(MultiPolygon) | | |<br>Indexes:<br>"zipcodes_pkey" PRIMARY KEY, btree (gid)<br>"zipcodes_geom_idx" gist (geom) |
|
1 |
--<br>-- getting a zipcode based upon the LAT/LON<br>--<br>gis_demo=# select * from gis.findzipcode(-87.798615,30.53711);<br> longitude | latitude | zipcode<br>------------+----------+---------<br> -87.798615 | 30.53711 | 36576 |
There are a couple of things to keep in mind when querying PostGIS-powered databases:
Because so much of our modern big data insights depend upon raw GIS data, directly querying a GIS database empowers one to create even more powerful and precise insights. PostGIS is an excellent way for the DEV, DBA, and SRA to learn all things GIS.
Have Fun!
PostGIS is included in Percona Distribution for PostgreSQL, which provides the best and most critical enterprise components from the open source community in a single distribution. And with Percona support for PostGIS, you can reliably run spatial data on PostgreSQL in production without fear of vendor lock-in.
Resources
RELATED POSTS