Using PostgreSQL To Update Outdated Map Links

We’ve rolled out a completely new map GUI at edpp.cz, built on top of OpenLayers 3. It looks great and has lots of functions for both casual and power users. The only pitfall that came with moving away from OpenLayers 2 was the remarkable difference in zoom levels between the old map and the new one.

Each of our maps is defined by our admins (center, zoom level, layers) when the map is created. Lots of links to different views of the map are created as well. They take the form of http://edpp.cz/some-map?0=0&1=0...zoom=5. That zoom=<Number> parameter started causing trouble immediately after the map switch. There was no way my workmates would update the links one by one, as there were ~4,500 of them. Sounds like a task for a little bit of regular expressions and some SQL updates.

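-- "table", "column" and "guid" stand in for the real table, link column and key.
-- \d+ (instead of \d) also covers zoom levels with more than one digit.
UPDATE table
    SET column = regexp_replace(column, 'zoom=\d+', 'zoom=' || subquery.zoom, 'g')
    FROM (
        SELECT regexp_replace(
            substring(column from 'zoom=\d+'),
            'zoom=', '',
            'g')::integer + 2 AS zoom, guid
        FROM table) AS subquery
    WHERE column ~ 'zoom=\d+'
        AND table.guid = subquery.guid;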

That’s what I’ve come up with. It extracts the zoom level from the link, adds two to it and writes the result back into the string.
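To see the same extract, add and write-back logic outside the database, here is a minimal Python sketch. The function name bump_zoom and the sample links in the usage note are made up for illustration; only the zoom=<Number> pattern and the offset of two come from the text above.

```python
import re

# Matches the zoom parameter and captures its (possibly multi-digit) value.
ZOOM_PATTERN = re.compile(r"zoom=(\d+)")


def bump_zoom(link, offset=2):
    """Return the link with every zoom=<number> raised by offset."""
    def shifted(match):
        # Convert the captured digits to an integer, add the offset,
        # and rebuild the parameter.
        return "zoom={}".format(int(match.group(1)) + offset)

    return ZOOM_PATTERN.sub(shifted, link)
```

Calling bump_zoom("http://edpp.cz/some-map?zoom=5") returns the same link with zoom=7. Links without a zoom parameter come back unchanged, which mirrors the WHERE clause in the SQL.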

PostGIS Case Study: VozejkMap Open Data (Part I)

VozejkMap.cz is a Czech open data initiative that collects data about wheelchair-accessible places, e.g. pubs, toilets, cafes etc. As part of being open, they offer a JSON data download. JSON is a great text format, though not such a great spatial format (leaving GeoJSON aside). Anyway, it’s nothing that PostGIS wouldn’t be able to take care of.

Let’s get some data

Using curl or wget, let’s download the JSON file:

wget -O /tmp/locations.json http://www.vozejkmap.cz/opendata/locations.json

We need to split the file into lines so that each point gets loaded into its own row:

sed -i 's/\},{/\n},{/g' /tmp/locations.json

If you peep into the file, you’ll see lots of escaped Unicode characters we don’t want to have in our pretty little table. Here’s how we get rid of them:

echo -en "$(cat /tmp/locations.json)"
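If you’d rather not fiddle with sed and echo, both steps can be done with a JSON parser instead. The following is only a sketch, assuming the download is a single JSON array of objects; the function name json_array_to_lines is hypothetical.

```python
import json


def json_array_to_lines(text):
    """Decode a JSON array and return one compact JSON object per item."""
    records = json.loads(text)  # \uXXXX escapes are decoded here
    # ensure_ascii=False keeps the decoded characters instead of
    # escaping them again on output.
    return [json.dumps(record, ensure_ascii=False) for record in records]
```

Writing the returned strings to a file, one per line, gives the same one-point-per-row layout without relying on the exact },{ sequence between records.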

Let’s load the data

Let’s just be nice and leave the public schema clean.

CREATE SCHEMA vozejkmap;
SET search_path=vozejkmap, public;

Load the data:

CREATE TABLE vozejkmap_raw(id SERIAL PRIMARY KEY, raw text);
COPY vozejkmap_raw(raw) FROM '/tmp/locations.json' DELIMITERS '#' ESCAPE '\' CSV;

A few notes:

  1. I’m using the /tmp folder to avoid any permission-denied issues, because COPY reads the file on the server side, as the user the PostgreSQL server runs under.
  2. By setting DELIMITERS to # we tell PostgreSQL to load each whole line into one column, because it is safe to assume there is no such character in our data.
  3. ESCAPE needs to be set because there is one trailing quote in the dataset.
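The assumption in note 2 is easy to check before loading. Here is a small sketch that reports which candidate delimiter characters never occur in the text; the function name and the default candidate set are made up for illustration.

```python
def unused_delimiters(text, candidates="#|~^"):
    """Return the candidate characters that never occur in text."""
    return [char for char in candidates if char not in text]
```

If # is missing from the result for the contents of /tmp/locations.json, pick one of the characters that is returned and use it in the COPY command instead.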

Let’s get dirty with spatial data

Great, now what? We loaded all the data into one column. That is not very useful, is it? How about splitting it into separate columns with this query? Shall we call it split_part hell?

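CREATE TABLE vozejkmap AS
SELECT
    trim(split_part(split_part(
                raw, 'title:', 2),
            ',location_type:', 1)
    ) AS title,

    trim(split_part(split_part(
                raw, 'location_type:', 2),
            ',description:', 1)
    )::integer AS location_type,

    trim(split_part(split_part(
                raw, 'description:', 2),
            ',lat:', 1)
    ) AS description,

    cast( trim(split_part(split_part(
                raw, 'lat:', 2),
            ',lng:', 1)
    ) AS double precision) AS lat,

    cast( trim(split_part(split_part(
                raw, 'lng:', 2),
            ',attr1:', 1)
    )  AS double precision) AS lng,

    trim(split_part(split_part(
                raw, 'attr1:', 2),
            ',attr2:', 1)
    )::integer AS attr1,

    trim(split_part(split_part(
                raw, 'attr2:', 2),
            ',attr3:', 1)
    ) AS attr2,

    trim(split_part(split_part(
                raw, 'attr3:', 2),
            ',author_name:', 1)
    ) AS attr3,

    trim(split_part(split_part(
                raw, 'author_name:', 2),
            ',}:', 1)
    ) AS author_name

FROM vozejkmap_raw;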
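The nested split_part calls are easier to follow on a single row. Below is a rough Python imitation of the idea: take everything after the start marker, then everything before the end marker, then trim spaces. The sample row in the usage note is invented and much shorter than a real record, and split_part here only covers positive field numbers.

```python
def split_part(text, delimiter, n):
    """Mimic PostgreSQL split_part for positive n (1-based field index)."""
    fields = text.split(delimiter)
    # An index past the last field yields an empty string.
    return fields[n - 1] if 1 <= n <= len(fields) else ""


def extract(raw, start, end):
    """Return the space-trimmed text between the start and end markers."""
    after_start = split_part(raw, start, 2)
    return split_part(after_start, end, 1).strip(" ")
```

For a hypothetical row such as "title:Kino,location_type:1,description:Bez schodu,lat:49.19,lng:16.61", extract(raw, "lat:", ",lng:") yields "49.19", which the SQL then casts to double precision.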

It just splits the JSON data and creates a table out of it according to the VozejkMap.cz data specification. Before going on, we should create a table of location types so we can join their numeric codes to real names:

CREATE TABLE location_type (
    id integer PRIMARY KEY,
    description varchar(255)
);

INSERT INTO location_type VALUES(1, 'Kultura');
INSERT INTO location_type VALUES(2, 'Sport');
INSERT INTO location_type VALUES(3, 'Instituce');
INSERT INTO location_type VALUES(4, 'Jídlo a pití');
INSERT INTO location_type VALUES(5, 'Ubytování');
INSERT INTO location_type VALUES(6, 'Lékaři, lékárny');
INSERT INTO location_type VALUES(7, 'Jiné');
INSERT INTO location_type VALUES(8, 'Doprava');
INSERT INTO location_type VALUES(9, 'Veřejné WC');
INSERT INTO location_type VALUES(10, 'Benzínka');
INSERT INTO location_type VALUES(11, 'Obchod');
INSERT INTO location_type VALUES(12, 'Banka, bankomat');
INSERT INTO location_type VALUES(13, 'Parkoviště');
INSERT INTO location_type VALUES(14, 'Prodejní a servisní místa Škoda Auto');

Let’s add a geometry column, constraints and indexes. And don’t forget to get rid of all the mess (the vozejkmap_raw table).

DROP TABLE vozejkmap_raw;
-- 4326 geometry is not very useful for measurements, I might get to that next time
ALTER TABLE vozejkmap ADD COLUMN geom geometry(point, 4326);
ALTER TABLE vozejkmap ADD CONSTRAINT loctype_fk FOREIGN KEY(location_type) REFERENCES location_type(id);

UPDATE vozejkmap SET geom = ST_SetSRID(ST_MakePoint(lng, lat), 4326);
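One detail worth spelling out is the argument order: ST_MakePoint takes x first and y second, so longitude goes before latitude. The sketch below builds the equivalent EWKT text for a point, which makes the order visible; the function name to_ewkt is hypothetical and the coordinates in the usage note are made up.

```python
def to_ewkt(lat, lng, srid=4326):
    """Build an EWKT point; x is the longitude and y is the latitude."""
    return "SRID={};POINT({} {})".format(srid, lng, lat)
```

For example, to_ewkt(49.19, 16.61) gives "SRID=4326;POINT(16.61 49.19)", with the longitude written first.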

And here we are, ready to use our spatial data!

Feel free to grab the code at GitHub.

PostgreSQL Remote Access

PostgreSQL is set to listen only for connections coming from localhost by default. I guess that’s fine as long as you don’t need access to the database from anywhere else (like your work network). If you do, you need to log in via SSH or use some online database management tool (go for Adminer and forget about anything called php[pg|my]admin). Or you can set it up to accept connections from other locations.

You need to:

  1. Set listen_addresses to * in your postgresql.conf (this change needs a server restart). That does not mean anyone can connect to your database; it means the server will listen for connections on every available IP interface.
  2. Insert a new entry into pg_hba.conf looking like this: host database user xxx.xxx.xxx.xxx/32 md5. Now we’re saying we only want connections coming from IP xxx.xxx.xxx.xxx accepted.
  3. Add a rule to iptables allowing access to the database server. The number 5 says it will be the fifth rule in the chain. It must come before the final REJECT ALL rule if present.

    iptables -I INPUT 5 -p tcp --dport 5432 -s xxx.xxx.xxx.xxx -j ACCEPT

  4. Just to be sure no one else is able to connect, reject everything else on port 5432.

    iptables -I INPUT 6 -p tcp --dport 5432 -j REJECT

You’re set to remotely connect to your database server.
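A quick way to check the result from the client machine is to try opening a TCP connection to port 5432. This is only a reachability sketch with a hypothetical helper name: it tells you whether the listen_addresses and iptables settings let the connection through, not whether pg_hba.conf will accept your user.

```python
import socket


def can_connect(host, port=5432, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run it with your server’s address from the allowed IP and again from any other machine; the first call should return True and the second False if the rules above are in place.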