Exploiting Prague Open Data without an API
Within the Czech Republic, Prague is the undisputed leader in open data publishing. However, there is no public API to explore or search the existing datasets.
I wanted to download the ESRI Shapefile of the city urban plan, which is divided into more than a hundred files (one file per cadastral area).
This becomes a piece of cake with the Opera developer tools and a bit of JavaScript code:
let links = document.getElementsByClassName('open-data-icon-rastr open-data-link tooltipstered')
for (let link of links) {
    if (link.href.indexOf('SHP') === -1) { continue; }
    console.log(link.href)
}
With the list saved to a file called list.txt, the following command downloads the data:
wget --input-file=list.txt
Each archive can then be extracted into a directory named after it:
for f in *.zip; do unzip "$f" -d "${f%.zip}"; done
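The download and extraction steps can also be wrapped in one small shell function. This is only a sketch: fetch_and_extract is a hypothetical helper name, and the -q (quiet) and -n (never overwrite existing files) flags of unzip are additions that keep a second run from stopping at an overwrite prompt.

```shell
# fetch_and_extract LIST
# Download every URL listed in LIST (one per line), then unpack each
# archive into a directory named after it (foo.zip -> foo/).
# Hypothetical helper, not part of the original workflow.
fetch_and_extract() {
  local f
  # Stop early if wget reports a failure, so nothing is unpacked from a
  # possibly incomplete set of downloads.
  wget --input-file="$1" || return 1
  for f in *.zip; do
    # If no archive is present the glob stays literal, so skip it.
    [ -e "$f" ] || continue
    unzip -q -n "$f" -d "${f%.zip}"
  done
}

# Usage (run in an empty working directory):
#   fetch_and_extract list.txt
```

Quoting "$f" keeps the loop safe should an archive name ever contain spaces.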
Once that is done, and assuming the files are named consistently across the folders, ogr2ogr will merge all of them into one GeoPackage file per layer, resulting in just four files. Not bad, considering I began with more than a hundred × 4.
ogr2ogr -f "GPKG" pvp_fvu_p.gpkg ./PVP_fvu_p_Bechovice_SHP/PVP_fvu_p.shp
find -type f -not -path './PVP_fvu_p_Bechovice_SHP*' -iname '*fvu_p.shp' -exec ogr2ogr -update -append -f "GPKG" pvp_fvu_p.gpkg '{}' \;
ogr2ogr -f "GPKG" pvp_fvu_popis_z_a.gpkg ./PVP_fvu_p_Bechovice_SHP/PVP_fvu_popis_z_a.shp
find -type f -not -path './PVP_fvu_p_Bechovice_SHP*' -iname '*fvu_popis_z_a.shp' -exec ogr2ogr -update -append -f "GPKG" pvp_fvu_popis_z_a.gpkg '{}' \;
ogr2ogr -f "GPKG" pvp_pp_pl_a.gpkg ./PVP_fvu_p_Bechovice_SHP/PVP_pp_pl_a.shp
find -type f -not -path './PVP_fvu_p_Bechovice_SHP*' -iname '*pp_pl_a.shp' -exec ogr2ogr -update -append -f "GPKG" pvp_pp_pl_a.gpkg '{}' \;
ogr2ogr -f "GPKG" pvp_pp_s_a.gpkg ./PVP_fvu_p_Bechovice_SHP/PVP_pp_s_a.shp
find -type f -not -path './PVP_fvu_p_Bechovice_SHP*' -iname '*pp_s_a.shp' -exec ogr2ogr -update -append -f "GPKG" pvp_pp_s_a.gpkg '{}' \;
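The four pairs of commands above differ only in the layer name, so they could be folded into one shell function. The following is a minimal sketch rather than the commands that were actually run: merge_layer is a hypothetical helper, the first shapefile found (in sorted order) creates the GeoPackage instead of the hard-coded Bechovice one, and -nln is added to pin the target layer name in case the file names differ in case or prefix.

```shell
# merge_layer LAYER
# Merge every *LAYER.shp found below the current directory into
# pvp_LAYER.gpkg. Hypothetical helper, not part of the original workflow.
merge_layer() {
  local layer="$1"
  local out="pvp_${layer}.gpkg"
  local first=yes
  # Sort the paths so the order of merging is stable between runs.
  find . -type f -iname "*${layer}.shp" | sort | while IFS= read -r shp; do
    if [ "$first" = yes ]; then
      # The first shapefile creates the GeoPackage.
      ogr2ogr -f GPKG -nln "pvp_${layer}" "$out" "$shp"
      first=no
    else
      # Every further shapefile is appended to the same layer.
      ogr2ogr -update -append -f GPKG -nln "pvp_${layer}" "$out" "$shp"
    fi
  done
}

# Usage, one call per layer:
#   for layer in fvu_p fvu_popis_z_a pp_pl_a pp_s_a; do
#     merge_layer "$layer"
#   done
```

As with the original commands, this assumes the output GeoPackage files do not exist yet when the function is called.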
A boring task that would have taken me hours five years ago turned into a simple yet fun piece of work, done in no more than half an hour.