OpenStreetMap

pnorman's Diary

Recent diary entries

I’ve been doing some log analysis on requests to the OSMF-hosted standard tile layer on tile.openstreetmap.org. To do this I downloaded two hours’ worth of logs and loaded them into PostgreSQL to run some queries. The logs start at 1600 UTC on 2021-02-25 and total 11GB compressed.

My main concern has been backend server load, so to analyze that I looked at cache misses - requests where the cache has to request a tile from the OSMF-operated backend servers. Typically the number of tiles requested is going to be five to ten times higher than the number of misses, but it will vary by zoom.

Total cache misses were 3437.4 per second.
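As a rough sketch of the per-second arithmetic: the log format and the actual queries are not described here, so the tab-separated layout, the MISS marker, and the misses_per_second helper below are all assumptions, and awk stands in for the PostgreSQL queries that were really used.

```shell
#!/usr/bin/env bash
set -euf -o pipefail

# Hypothetical input: one request per line, tab-separated as
#   <cache status: HIT or MISS> <TAB> <referer URL, or "-" if absent>
# The argument is the length of the log sample in seconds (two hours = 7200).
misses_per_second() {
  local window_seconds="$1"
  awk -F '\t' -v window="${window_seconds}" '
    $1 == "MISS" {
      host = $2
      if (host == "-" || host == "") {
        host = "None"
      } else {
        sub("^[a-z]+://", "", host)  # drop the scheme
        sub("[/:].*$", "", host)     # drop any port and path
      }
      misses[host]++
    }
    END {
      for (host in misses) {
        printf "%s\t%.1f\n", host, misses[host] / window
      }
    }
  ' | LC_ALL=C sort -t $'\t' -k2,2nr -k1,1
}

# Tiny made-up sample covering a 2 second window.
printf '%s\t%s\n' \
  'MISS' '-' \
  'MISS' '-' \
  'MISS' '-' \
  'MISS' 'https://www.openstreetmap.org/' \
  'MISS' 'https://www.openstreetmap.org/edit' \
  'HIT' 'https://www.openstreetmap.org/' \
  'MISS' 'http://localhost:8080/index.html' \
  | misses_per_second 2
```

Grouping by host rather than by full referer URL is what collapses entries such as localhost on various ports into a single row.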

The top five referers, as well as some interesting ones, are:

Referer domain                 Cache misses per second
None                           1254.3
www.openstreetmap.org          418.7
www.openrailwaymap.org         16.0
apps.sentinel-hub.com          15.0
m.turkiye.gov.tr               13.1
localhost, on various ports    30.7
10.* IPs                       14.2
Other                          1675.4

The top sites vary by time and what parts of the world are awake, but most of the traffic is from the long tail of small sites, from OpenStreetMap itself, or from apps, which send no referer and should be identifying themselves with a custom user-agent.

For user-agents, I grouped different versions of some apps together. Like before, I’ve got some of the top ones, then a few interesting ones.

User-Agent                                  Cache misses per second
MapProxy, all versions                      281.8
QGIS, all versions                          69.9
Fake FF 84                                  46.8
Marble, all versions                        34.3
ArcGIS Client Using WinInet                 32.8
StreetView, all versions                    29.6
com.caynax.sportstracker, all versions      24.7
Maperitive, all versions                    24.2
JOSM, all versions                          22.9
Fake Chrome 25                              22.2
Amazon CloudFront                           13.5
OruxMaps, all versions                      12.4
Fake FF 77                                  11.7
173A220003203F293A2E3C2A                    10.7
cgeo                                        5.9
Other                                       597.1

The fake user-agents are in the process of being blocked now.

Like with sites, the long tail is a significant portion of the load. Substantial chunks come from OSM-related apps and FOSS geo-related apps (QGIS, Marble, cgeo), while the biggest source is caching proxies like MapProxy and Amazon CloudFront.

Overall, the usage is:

Source                      Cache misses per second
OpenStreetMap website       418.7
Caching proxies             295.3
Other geospatial apps       91.3
Fakes                       91.4
QGIS                        69.9
OSM editing apps            52.5
Internal and testing IPs    44.9
Other websites              1719.5
Other apps                  653.9

OpenStreetMap Survey by visits

Posted by pnorman on 21 February 2021 in English.

In my last post I looked at survey responses by country and their correlation with mappers eligible for a fee waiver as an active contributor.

I wanted to look at the correlation with OSM.org views. I already had a full day’s worth of logs on tile.openstreetmap.org accesses, so I filtered them for requests from www.openstreetmap.org and got a per-country count. This is from December 29th, 2020. Ideally it would be from a complete week, and not a holiday, but this is the data I had downloaded.

Preview image

The big outlier is Italy. It has more visits than I would expect, so I wonder if the holiday had an influence. Like before, the US is overrepresented in the results, Russia and Poland are underrepresented, and Germany is about average.

Like before, I made a graph of the smaller countries.

Preview image

More small countries are above the average line - probably an influence of Italy being so low.

OSMF survey country results

Posted by pnorman on 17 February 2021 in English.

The board has started releasing results from their 2021 survey. I’ve done some analysis on the response rates by country.

There’s lots of data for activity on OSM by country, but for this I took the numbers from joost for how many “active contributors” there are according to the contributor fee waiver criteria.

Preview image

For the larger countries, Russia is the most underrepresented country. This is not surprising, as they are underrepresented in other venues like the OSMF membership.

The US and UK are both slightly overrepresented in the survey, but less so than I would have expected based on other surveys and OSMF membership.

The smaller countries are all crowded, so I did a graph of just them.

Preview image

As with other surveys, Japan is underrepresented. Indonesia, although underrepresented, is less underrepresented than I would have expected.

OpenStreetMap Carto v5.3.1

Posted by pnorman on 5 February 2021 in English.

Dear all,

Today, v5.3.1 of the OpenStreetMap Carto stylesheet (the default stylesheet on the OSM website) has been released. There are no visual changes in this release.

Changes include

  • Natural Earth URL changed to directly point at the NACIS CDN

  • Added an option to the external data loader to grant SELECT permissions on the tables

For a full list of commits, see https://github.com/gravitystorm/openstreetmap-carto/compare/v5.3.0...v5.3.1

As always, we welcome any bug reports at https://github.com/gravitystorm/openstreetmap-carto/issues

OpenStreetMap Carto release v5.3.0

Posted by pnorman on 29 January 2021 in English.

Dear all,

Today, v5.3.0 of the OpenStreetMap Carto stylesheet (the default stylesheet on the OSM website) has been released. Once changes are deployed on openstreetmap.org it will take a few days before all tiles show the new rendering. It may take longer than normal because there are significant deployment-related changes.

  • External shapefiles for coastline and other data are now loaded into the database with a provided script.

  • The recommended indexes are now required. Attempting to render without them will result in abysmal performance.

  • amenity=embassy is no longer rendered, and office=diplomatic with diplomatic=embassy or diplomatic=consulate is rendered instead.

  • Mini-roundabouts are rendered like a turning circle.

  • There is a new partial index for waterways

Anyone running their own install must run scripts/get-external-data.py and create the new indexes. People who are running with minutely diffs may be interested in https://github.com/openstreetmap/chef/issues/386.

Thanks to all the contributors for this release, including hiddewie, crimsondusk, pitdicker, and terminaldweller, new contributors.

For a full list of commits, see https://github.com/gravitystorm/openstreetmap-carto/compare/v5.2.0...v5.3.0

As always, we welcome any bug reports at https://github.com/gravitystorm/openstreetmap-carto/issues

Calculating label points with PostGIS

Posted by pnorman on 3 November 2020 in English.

A common task with OpenStreetMap data in PostGIS is to convert polygons to points to place labels. For simple polygons, the centroid can be used, but for some shapes, like C-shaped polygons, the centroid can lie outside the polygon, so ST_PointOnSurface is used. This function guarantees the point returned is within the polygon.

The only issue with ST_PointOnSurface is that it throws an exception on some invalid geometries. This isn’t a problem with a database created by a recent version of osm2pgsql, which only creates valid geometries, but for older versions or other data loaders it’s unacceptable. This has led people to write wrapper functions that check the validity or catch the exceptions, but I’ve seen no benchmarking of the various options.

To benchmark the options, I loaded the planet data from 2020-10-12 and looked at named water polygons - those that matched ("natural" = 'water' OR waterway = 'riverbank') AND name IS NOT NULL. To make the system better reflect a tile server under load, I set max_parallel_workers_per_gather to 0 and jit to off. I then ran the query EXPLAIN ANALYZE SELECT function(way) FROM planet_osm_polygon WHERE ("natural" = 'water' OR waterway = 'riverbank') AND name IS NOT NULL;.

I tested with ST_Centroid, ST_PointOnSurface, ToPoint from postgis-vt-util, a function that checked validity before calling ST_PointOnSurface, a function that caught the exception from invalid geometries, and a function that used ST_Centroid for polygons with 4 corners and ST_PointOnSurface otherwise. The definitions are at the end of this post.

Function             Time
ST_Centroid          277s
ST_PointOnSurface    408s
ToPoint              575s
point1               568s
point2               409s
point3               409s
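The benchmark runs could be scripted along these lines. This is only a sketch: the benchmark_sql helper is a name made up for illustration, it just prints the SQL for each function, and the psql invocation in the comment assumes a database called gis.

```shell
#!/usr/bin/env bash
set -euf -o pipefail

# Print the settings and the timed query for one function under test.
benchmark_sql() {
  local func="$1"
  cat << EOF
SET max_parallel_workers_per_gather = 0;
SET jit = off;
EXPLAIN ANALYZE SELECT ${func}(way) FROM planet_osm_polygon
  WHERE ("natural" = 'water' OR waterway = 'riverbank') AND name IS NOT NULL;
EOF
}

for func in ST_Centroid ST_PointOnSurface ToPoint point1 point2 point3; do
  echo "-- ${func}"
  benchmark_sql "${func}"
  # To actually run it, pipe the SQL into psql ("gis" is an assumed name):
  #   benchmark_sql "${func}" | psql -X -d gis
done
```

Setting the two parameters in the same session as the query keeps them from leaking into other connections.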

Parallelism

I set max_parallel_workers_per_gather to 0 for these tests, but my test server has a lot of CPU cores. When I increased this value I was easily able to saturate my SSDs, and all queries took the same time. Still, even if you’re IO limited it’s a good idea to minimize CPU usage.

Conclusions

If you have a database with potentially invalid polygons, you should use a wrapper function that catches the exception rather than checks validity first. Although ST_Centroid is faster than ST_PointOnSurface, it’s not worth trying to use it in simple cases.

Function definitions

CREATE OR REPLACE FUNCTION public.topoint(g geometry)
RETURNS geometry
LANGUAGE plpgsql
IMMUTABLE PARALLEL SAFE
AS $function$
begin
    g := ST_MakeValid(g);
    if GeometryType(g) = 'POINT' then
        return g;
    elsif ST_IsEmpty(g) then
        -- This should not be necessary with Geos >= 3.3.7, but we're getting
        -- mystery MultiPoint objects from ST_MakeValid (or somewhere) when
        -- empty objects are input.
        return null;
    elsif (GeometryType(g) = 'POLYGON' OR GeometryType(g) = 'MULTIPOLYGON') and ST_NPoints(g) <= 5 then
        -- For simple polygons the centroid is good enough for label placement
        return ST_Centroid(g);
    else
        return ST_PointOnSurface(g);
    end if;
end;
$function$;


-- point1: check validity first; invalid geometries yield NULL.
CREATE OR REPLACE FUNCTION public.point1(g geometry)
RETURNS geometry
LANGUAGE sql
IMMUTABLE PARALLEL SAFE
AS $function$
SELECT CASE WHEN ST_IsValid(g) THEN ST_PointOnSurface(g) END;
$function$;


-- point2: call ST_PointOnSurface directly and return NULL if it raises.
CREATE OR REPLACE FUNCTION public.point2(g geometry)
RETURNS geometry
LANGUAGE plpgsql
IMMUTABLE PARALLEL SAFE
AS $function$
BEGIN
    RETURN ST_PointOnSurface(g);
EXCEPTION WHEN OTHERS THEN
    RETURN NULL;
END
$function$;


-- point3: centroid for polygons with at most 5 points, ST_PointOnSurface
-- otherwise; NULL if either raises.
CREATE OR REPLACE FUNCTION public.point3(g geometry)
RETURNS geometry
LANGUAGE plpgsql
IMMUTABLE PARALLEL SAFE
AS $function$
BEGIN
    RETURN CASE WHEN ST_NPoints(g) <= 5 THEN ST_Centroid(g) ELSE ST_PointOnSurface(g) END;
EXCEPTION WHEN OTHERS THEN
    RETURN NULL;
END
$function$;

Cross-posted from my blog

I’ve been working on a new project, OpenStreetMap Cartographic. This is a client-side rendering based on OpenStreetMap Carto. This is an ambitious project, as OpenStreetMap Carto is an extremely complex style which shows a large number of features. The technical choices I’m making are designed so the style is capable of handling the load of osm.org with minutely updates.

I’ve put up a world-wide demo at https://pnorman.dev.openstreetmap.org/cartographic/mapbox-gl.html, using data from 2020-03-16, and you can view the code at https://github.com/pnorman/openstreetmap-cartographic.

Preview image

Incomplete parts

Only zoom 0 to 8 has been implemented so far. I started at zoom 0 and am working my way down.

Admin boundaries are not implemented. OpenStreetMap Carto uses Mapnik-specific tricks to deduplicate the rendering of these. I know how I can do this, but it requires the changes I intend to make with the flex backend.

Landuse, vegetation, and other natural features are not rendered until zoom 7. This is the same scale as OpenStreetMap Carto zoom 8, while in OpenStreetMap Carto these features first appear at zoom 5. There are numerous problems with unprocessed OpenStreetMap data at these scales. OpenStreetMap Carto gets a result that looks acceptable but is poor at conveying information by tweaking Mapnik image rasterizing options. I’m looking for better options here involving preprocessed data, but haven’t found any.

I’m still investigating how to best distribute sprites.

Technology

The technology choices are designed to be suitable for a replacement for tile.osm.org. This means minutely updates, high traffic, high reliability, and multiple servers. Tilekiln, the vector tile generator, supports all of these. It’s designed to better share the rendering results among multiple servers, a significant flaw with renderd + mod_tile and the standard filesystem storage. It uses PostGIS’ ST_AsMVT, which is very fast with PostGIS 3.0. On my home system it generates z0-z8 in under 40 minutes.

Often forgotten are the development requirements. The style needs to support multiple developers working on similar areas and the git merge conflicts that result, while maintaining an easy development workflow. I’m still figuring this out. Mapbox GL styles are written in JSON and most of the tools overwrite any formatting. This means there’s no way to add comments to lines of code. Comments are a requirement for a style like this, so I’m investigating minimal pre-processing options. The downside is that this will make it harder to use the style with existing GUI editors like Fresco or Maputnik.

Cartography

The goal of this project isn’t to do big cartography changes yet, but client-side rendering opens up new tools. The biggest immediate change is that zoom is continuous, no longer an integer or fixed value. This means parameters like sizes can smoothly change as you zoom in and out, specified by their start and end size instead of having to specify each zoom.
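To make the start and end size idea concrete, here is a small sketch of linear interpolation over a continuous zoom. The interpolate_size helper and the example numbers are made up for illustration; a real style would express this in the renderer’s own style language rather than in a script.

```shell
#!/usr/bin/env bash
set -euf -o pipefail

# interpolate_size ZOOM START_ZOOM START_SIZE END_ZOOM END_SIZE
# Linearly interpolates between the two sizes, clamping outside the range.
interpolate_size() {
  awk -v z="$1" -v z0="$2" -v s0="$3" -v z1="$4" -v s1="$5" 'BEGIN {
    if (z <= z0) {
      t = 0
    } else if (z >= z1) {
      t = 1
    } else {
      t = (z - z0) / (z1 - z0)
    }
    printf "%.2f\n", s0 + t * (s1 - s0)
  }'
}

# A size of 10 at zoom 6 growing to 14 at zoom 8, sampled at fractional zooms.
for zoom in 6 6.5 7.25 8; do
  echo "zoom ${zoom}: $(interpolate_size "${zoom}" 6 10 8 14)"
done
```

Only the two end points are specified; every zoom in between, fractional or not, gets its size from the same rule.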

Want to help?

Have a look at https://github.com/pnorman/openstreetmap-cartographic and have a go at setting it up and generating your own map. If you have issues, open an issue or pull request. Or, because OpenStreetMap Cartographic uses Tilekiln, have a look at its issue list.

OpenStreetMap Carto release v5.0.0

Posted by pnorman on 19 March 2020 in English.

Dear all,

Today, v5.0.0 of the OpenStreetMap Carto stylesheet (the default stylesheet on the OSM website) has been released. Once changes are deployed on openstreetmap.org it will take a couple of days before all tiles show the new rendering.

Changes include

  • An update to Lua tag transforms, setting line vs polygon decisions for new tags

  • Added upper way_area limits to most features using ST_PointOnSurface to avoid performance problems from large polygons

  • Moved MSS files into their own directory

  • Removed rendering of power=cable features

  • Removed overlay pattern for natural=sand

  • Reduced landcover fading at mid-low zoom levels

  • Removed rendering of barrier=kerb

Thanks to all the contributors for this release.

For a full list of commits, see https://github.com/gravitystorm/openstreetmap-carto/compare/v4.25.0...v5.0.0

As always, we welcome any bug reports at https://github.com/gravitystorm/openstreetmap-carto/issues

OpenStreetMap Carto release v4.22.0

Posted by pnorman on 28 August 2019 in English.

Dear all,

Today, v4.22.0 of the OpenStreetMap Carto stylesheet (the default stylesheet on the OSM website) has been released. Once changes are deployed on openstreetmap.org it will take a couple of days before all tiles show the new rendering.

Changes include

  • Shop label fixes and use ST_PointOnSurface for building label placement

    This fixes some bugs and makes building label placement consistent with shop label placement.

  • Use cache-feature: true to improve performance of layers with attachments

  • Use retail colour fill on malls

  • Drop highway=steps from zoom 13

    This makes step rendering consistent with footways

  • Render place=locality from zoom 16

    This fits current usage of the tag and what it is normally tagged on.

  • Render natural=bay from linear ways

  • Render administrative boundary labels from relations only

  • Stop rendering natural=marsh

    It is recommended marshes are tagged with natural=wetland + wetland=marsh

  • Use a whitelist for barrier rendering, and render historic=citywalls like barrier=city_wall.

  • Support new Tibetan font name

    Noto has renamed Noto Sans Tibetan to Noto Serif Tibetan. The old name is still supported.

  • Code cleanups to increase reuse and improve consistency

Thanks to all the contributors for this release, including daveol and btwhite92, new contributors, and jeisenbe, a new maintainer.

For a full list of commits, see https://github.com/gravitystorm/openstreetmap-carto/compare/v4.21.0...v4.22.0

As always, we welcome any bug reports at https://github.com/gravitystorm/openstreetmap-carto/issues

Dear all,

Today, v4.21.0 of the OpenStreetMap Carto stylesheet (the default stylesheet on the OSM website) has been released. Once changes are deployed on openstreetmap.org it will take a couple of days before all tiles show the new rendering.

Changes include

  • Removed unused world_boundaries-spherical.tgz file from scripts

  • Switch to osmdata.openstreetmap.de for water & icesheet shapefiles

  • Started using ST_PointOnSurface for some label placements

  • Adjusted index for military areas

  • Adjusted starting zooms for labeling of administrative areas.

  • Revert rendering of healthcare key

  • Stop placing some place labels when the objects become too big or at high zooms.

  • Only render capes as points and render them like other points.

  • Only render ferry lines from ways, not relations

  • Improved developer internal documentation

Thanks to all the contributors for this release including Nakaner, a new contributor.

For a full list of commits, see https://github.com/gravitystorm/openstreetmap-carto/compare/v4.20.0...v4.21.0

As always, we welcome any bug reports at https://github.com/gravitystorm/openstreetmap-carto/issues

Matrix of OSMF candidates and questions

Posted by pnorman on 1 December 2018 in English.

There’s a healthy number of candidates for OSMF board this year, so I made a matrix of candidates and the questions they were asked, and thought it might be useful to others too.

I’ll be using this and putting + or - in the squares so I can keep rough track of whether I believe their positions make them good board candidates. I won’t just be adding them up in the end - not all questions are equally important to me - but something to organize my thoughts is useful.

Membership Working Group Updates

Posted by pnorman on 28 November 2018 in English.

The Membership Working Group (MWG) has been very busy lately with the roll-out of the fee waiver for lack of money transfer program. Rather than a quarterly update, here’s some news.

New countries

We’ve gotten more information about countries that PayPal doesn’t support and have added Central African Republic, Equatorial Guinea, Haiti, Kosovo, Libya, South Sudan, Timor-Leste, and Uzbekistan to the list of countries which qualify because there’s a lack of suitable money transfer facilities.

If you have information about countries not on the list at https://join.osmfoundation.org/fee-waiver-program/ where PayPal is not supported or the cost of transferring money is more than the cost of membership, please contact the MWG at membership@osmfoundation.org.

Application statistics

We welcome and congratulate our 19 new members from Iran, Kosovo and Liberia who have already joined the OSMF through the fee waiver program.

Over the first seven days, we had 45 applications. All the percentages below are based on these applications, and are rounded. The pace is still continuing, and as we work on our processes, it will be possible to automatically generate some statistics. Right now it’s manual, so I don’t have more recent numbers.

The money transfer applications are easy to deal with, as we have a well defined process. They only take a couple of minutes when all the information is there.

Of the first 45, 60% of the applications are from Iran, and 5% from Liberia, both countries on the list where money transfer is not practical. The remainder are from countries where we’re seeking more information from the applicants. Since the first seven days, we’ve received applications from other countries.

25% of applications filled out the form incorrectly and were asked to fill it out again with a corrected username. We might need to look at providing more instructions on the form.

How can I help?

Contact membership@osmfoundation.org about joining the MWG. If you’re not interested in joining but have experience at writing instructions, documentation, and procedures, the stock responses to the form could use rewriting. If this is you, also contact membership@osmfoundation.org.

This is a cross-post of https://lists.openstreetmap.org/pipermail/osmf-talk/2018-November/005530.html

More work on Bolder

Posted by pnorman on 8 August 2018 in English.

This is a mirror of a post on my blog.

After the birds of a feather session Richard Fairhurst led at State of the Map, I was motivated to continue some work on bolder, a client-side style I’ve been working on.

While I was working at the Wikimedia Foundation, I developed brighmed, a CartoCSS style using vector tiles. Wikimedia decided not to flip the switch to deploy the style, but the style is open source, so I can use it elsewhere. Having decided to do that, I spent a day implementing most of it in Tangram.

Bolder example image

What’s next?

I’ve got some missing features like service roads and some railway values to add, then I can look at new stuff like POIs. For that I’ll need to look at icons and where to fit them into colourspace.

There’s a bunch of label work that needs to be done. What I have is just a first pass: some things like motorway names have big issues, and ref tags still need rendering. Label quality is of course an unending quest, but I should be able to get some big gains without much work.

Richard is planning to do some work on writing a schema, and if it works, I’d like to adopt it. At the same time, I don’t want to tie myself to an external schema which may have different cartographic aims, so I’ll have to see how that works out. Looking at past OpenStreetMap Carto changes to project.mml, I found that what would be breaking schema changes on a vector tile project are less common than I thought, happening about once every 4-6 months. Most of the schema changes that would have happened were compatible and could be handled by regenerating tiles in the background.

Bolder - Starting a new client-side OpenStreetMap style

Posted by pnorman on 30 April 2018 in English. Last updated on 1 May 2018.

I’ve started work on a new client-side style for OpenStreetMap data, and feel it’s reached the point where I can release it to the public. My goal is to make a style that shows a rich selection of the data OSM has, and to make use of most of the colour space, rather than a style designed for overlaying other data on top of.

As a new style, I’ve been able to approach a lot from scratch, looking at avoiding mistakes of previous projects, and using best practices while building on existing work. All the components are open-source, and no assumptions are made about using closed-source software or particular commercial solutions.

You can get the code on GitHub.

Bolder example image

Technical overview

The style is rendered with Tangram, which allows for client-side rendering. Server-side rendering is possible but is a secondary target. Closely coupled with the client-side style is a set of vector tile definitions, handled by Tegola, a vector tile server. It pulls from an osm2pgsql database in the OpenStreetMap Carto schema, with additional data like ocean polygons loaded in by a script.

Cartographic target

The goal of Bolder is to be a general-purpose style, filling a target similar to OpenStreetMap Carto, while also being a better “default” for people wanting an OSM map. Being a client-side style, it’s easier to turn off classes of features like some POIs if a map with fewer features is needed.

The style should still be useful for mapper feedback, and some ways will become more useful. Vector tiles can associate OSM feature IDs with objects in many cases, helping debugging “where did that label come from”.

Setup

The style has two parts that are installed, one for the vector tiles, and the other for displaying the client-side style. The documentation for both of them has been tested by users who hadn’t seen it before, so it should be possible to set up for anyone reasonably experienced in style authoring.

Limitations

As a new project, Bolder has limitations. The biggest limitation is that only a small number of features are rendered, and many things have to be added. I’ve also been doing lots of new stuff with Tegola, and have uncovered a number of critical bugs, most of which should be fixed in the next Tegola release.

Location: Quayside, New Westminster, Metro Vancouver Regional District, British Columbia, V3M 6H5, Canada

This is a repost of an entry on my blog.

Last post ended with downloading OpenStreetMap data. This post will leave the data aside and switch to downloading and building a style. There’s lots of styles available, but we’re going to use OpenStreetMap Carto, the current default on OpenStreetMap.org. Also, because we need software not packaged in Debian, that needs to be installed.

For the script, we’re going to assume that the carto binary is in the PATH. Unfortunately, this requires installation, which requires npm, which itself needs to be installed.

Given that nodejs and npm are a huge headache of versions, the easiest route I’ve found is to install nvm, then install nodejs 6 with nvm install 6. CartoCSS is then installed with npm install -g carto.
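Since the script assumes carto is in the PATH, it can be worth failing early with a clear message when it is not. A minimal sketch, where require_command is a helper name made up for this example:

```shell
#!/usr/bin/env bash
set -euf -o pipefail

# require_command NAME: succeed if NAME can be found in the PATH,
# otherwise print a message to stderr and return 1.
require_command() {
  if ! command -v "$1" > /dev/null 2>&1; then
    echo "Required command not found in PATH: $1" >&2
    return 1
  fi
}

# bash is certainly available here, so this check passes.
require_command bash && echo 'bash found'

# In the real script the checks would be for the tools it depends on, e.g.
#   require_command carto
#   require_command git
```

With set -e in effect, a failed require_command call on its own line ends the script before any work is done.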

The shell script starts off with some variables from last time.

#!/usr/bin/env bash

set -euf -o pipefail

OpenStreetMap Carto is hosted on GitHub, which offers the ability to download a project as a zip file. This is the logical way to get it, but isn’t usable from a script because the internal structure of the zip file isn’t easily predicted. Instead, we’ll clone it with git, only getting the specific revision needed.

OSMCARTO_VERSION="v4.6.0"
OSMCARTO_LOCATION='https://github.com/gravitystorm/openstreetmap-carto.git'
rm -rf -- 'openstreetmap-carto'
git -c advice.detachedHead=false clone --quiet --depth 1 \
  --branch "${OSMCARTO_VERSION}" -- "${OSMCARTO_LOCATION}" 'openstreetmap-carto'

Setting advice.detachedHead=false for this command avoids a warning about a detached HEAD, which is expected.
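To see what this kind of clone produces without touching the network, the same options can be tried against a throwaway local repository. Everything below (the temporary directories, the commit helper, the v1.0.0 tag) is made up for the demonstration.

```shell
#!/usr/bin/env bash
set -euf -o pipefail

workdir="$(mktemp -d)"
src="${workdir}/source"
dest="${workdir}/clone"

# Build a throwaway repository with two commits and a tag on the second one.
git init --quiet "${src}"
commit() {
  git -C "${src}" -c user.name='Example' -c user.email='example@example.invalid' \
    commit --quiet --allow-empty -m "$1"
}
commit 'first'
commit 'second'
git -C "${src}" tag 'v1.0.0'

# Same clone options as above, pointed at the local repository.
# A file:// URL is used because --depth is ignored for plain local paths.
git -c advice.detachedHead=false clone --quiet --depth 1 \
  --branch 'v1.0.0' -- "file://${src}" "${dest}"

echo "commits in clone: $(git -C "${dest}" rev-list --count HEAD)"
```

The clone holds only the tagged commit, which keeps the download small and means the directory layout is fixed by the script rather than by an archive.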

OpenStreetMap Carto sets the database name to be “gis”. There are various ways to override this for development, but in this case we want to override it for the generated XML file. Fortunately, the database name only appears once, as dbname: "gis" in project.mml. One way to override it would be to remove the line and rely on the libpq environment variables like PGDATABASE. Another is replacing “gis” with a different name. It’s not clear which is better, but I decided to go with replacing the name, using a patch which git applies.

export PGDATABASE='osmcarto_prerender'

git -C 'openstreetmap-carto' apply << EOF
diff --git a/project.mml b/project.mml
index b8c3217..a41e550 100644
--- a/project.mml
+++ b/project.mml
@@ -30,7 +30,7 @@ _parts:
     srs: "+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs"
   osm2pgsql: &osm2pgsql
     type: "postgis"
-    dbname: "gis"
+    dbname: "${PGDATABASE}"
     key_field: ""
     geometry_field: "way"
     extent: "-20037508,-20037508,20037508,20037508"
EOF
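Purely as an illustration of the “replace the name” idea, the same substitution can be sketched with sed instead of a patch. This is not the approach chosen above: set_dbname is a made-up helper, it relies on the dbname line appearing exactly once, and it assumes the new name contains no characters that are special to sed.

```shell
#!/usr/bin/env bash
set -euf -o pipefail

# set_dbname FILE NAME: rewrite the single dbname line of FILE in place.
set_dbname() {
  local file="$1" name="$2" count
  # Refuse to guess if there is not exactly one dbname line.
  count="$(grep -c '^[[:space:]]*dbname:' "${file}" || true)"
  if [ "${count}" -ne 1 ]; then
    echo "expected exactly one dbname line in ${file}, found ${count}" >&2
    return 1
  fi
  sed -i.bak "s/^\([[:space:]]*dbname:[[:space:]]*\).*\$/\1\"${name}\"/" "${file}"
  rm -f -- "${file}.bak"
}

# Demonstrate on a small fragment shaped like the relevant part of project.mml.
demo_file="$(mktemp)"
cat > "${demo_file}" << 'EOF'
  osm2pgsql: &osm2pgsql
    type: "postgis"
    dbname: "gis"
    key_field: ""
EOF

set_dbname "${demo_file}" 'osmcarto_prerender'
cat "${demo_file}"
```

A patch fails loudly when the surrounding lines change between versions, while a substitution like this keeps working as long as the one-line assumption holds, which is part of why it is not obvious which approach is better.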

With project.mml patched, it’s easy to generate the Mapnik XML, because CartoCSS was installed earlier.

carto -a 3.0.12 'openstreetmap-carto/project.mml' > 'openstreetmap-carto/project.xml'

Lastly, OpenStreetMap Carto needs some data files like coastlines. It comes with a script to download them, so we run it.

openstreetmap-carto/scripts/get-shapefiles.py

Taking all of this and rearranging it, we end up with the following script.

#!/usr/bin/env bash

set -euf -o pipefail

OSMCARTO_VERSION="v4.6.0"
OSMCARTO_LOCATION='https://github.com/gravitystorm/openstreetmap-carto.git'

rm -rf -- 'openstreetmap-carto'
git -c advice.detachedHead=false clone --quiet --depth 1 \
  --branch "${OSMCARTO_VERSION}" -- "${OSMCARTO_LOCATION}" 'openstreetmap-carto'
carto -a 3.0.12 'openstreetmap-carto/project.mml' > 'openstreetmap-carto/project.xml'

openstreetmap-carto/scripts/get-shapefiles.py

This is a repost of an entry on my blog.

To do something with OpenStreetMap data, we have to download it first. This can be the entire data from planet.openstreetmap.org or a smaller extract from a provider like Geofabrik. If you’re doing this manually, it’s easy. Just a single command will call curl or wget, or you can download it from the browser. If you want to script it, it’s a bit harder. You have to worry about error conditions, what can go wrong, and make sure everything can happen unattended. So, to make sure we can do this, we write a simple bash script.

The goal of the script is to download the OSM data to a known file name, and return 0 if successful, or 1 if an error occurred. Also, to keep track of what was downloaded, we’ll make two files with information on what was downloaded, and what state it’s in: state.txt and configuration.txt. These will be compatible with osmosis, the standard tool for updating OpenStreetMap data.

Before doing anything else, we specify that this is a bash script, and that if anything goes wrong, the script is supposed to exit.

#!/usr/bin/env bash

set -euf -o pipefail
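For reference, -e exits on a failing command, -u treats unset variables as errors, -f turns off filename globbing, and pipefail makes a pipeline fail if any command in it fails. The sketch below shows three of these; status_of is a helper made up for the demonstration, and each snippet runs in a child bash so a failure there does not end the demo itself.

```shell
#!/usr/bin/env bash

# status_of SNIPPET: run SNIPPET in a child bash and print its exit status.
status_of() {
  local rc=0
  bash -c "$1" > /dev/null 2>&1 || rc=$?
  echo "${rc}"
}

echo "pipeline without pipefail: $(status_of 'false | true')"
echo "pipeline with pipefail:    $(status_of 'set -o pipefail; false | true')"
echo "unset variable with -u:    $(status_of 'set -u; echo "${UNDEFINED_DEMO_VAR}"')"
echo "failure with -e:           $(status_of 'set -e; false; exit 42')"
```

The first line reports 0, because without pipefail only the last command of a pipeline counts; the others report a non-zero status.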

Next, we put the information about what’s being downloaded, and where, into variables. It’s traditional to use the Geofabrik Liechtenstein extract for testing, but the same scripts will work with the planet.

PLANET_FILE='data.osm.pbf'

PLANET_URL='http://download.geofabrik.de/europe/liechtenstein-latest.osm.pbf'
PLANET_MD5_URL="${PLANET_URL}.md5"

We’ll be using curl to download the data, and every time we call it, we want to add the options -s and -L. Respectively, these make curl silent and cause it to follow redirects. Two files are needed: the data, and its md5 sum. The md5 file looks something like 27f7... liechtenstein-latest.osm.pbf. The problem with this is we’re saving the file as $PLANET_FILE, not liechtenstein-latest.osm.pbf. A bit of manipulation with cut fixes this.

CURL='curl -s -L'
MD5="$($CURL "${PLANET_MD5_URL}" | cut -f1 -d' ')"
echo "${MD5}  ${PLANET_FILE}" > "${PLANET_FILE}.md5"

The reason for downloading the md5 first is that it reduces the time between when the two downloads are initiated, making it less likely that the server will have a new version uploaded in that time.
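The renaming step can be tried locally without downloading anything. In this sketch, retarget_md5 is a made-up helper wrapping the same cut manipulation, and the files are small stand-ins created in a temporary directory.

```shell
#!/usr/bin/env bash
set -euf -o pipefail

# retarget_md5 NEW_NAME: read an md5 file on stdin and print the same hash
# paired with NEW_NAME, in the "hash  name" format that md5sum -c expects.
retarget_md5() {
  local new_name="$1" hash
  hash="$(cut -f1 -d' ')"
  printf '%s  %s\n' "${hash}" "${new_name}"
}

# Stand-in for the upstream file and the md5 file published next to it.
workdir="$(mktemp -d)"
printf 'example data\n' > "${workdir}/upstream-name.osm.pbf"
(cd "${workdir}" && md5sum 'upstream-name.osm.pbf' > 'upstream-name.osm.pbf.md5')

# Save the data under our own name and retarget the checksum to match.
cp "${workdir}/upstream-name.osm.pbf" "${workdir}/data.osm.pbf"
retarget_md5 'data.osm.pbf' \
  < "${workdir}/upstream-name.osm.pbf.md5" > "${workdir}/data.osm.pbf.md5"

(cd "${workdir}" && md5sum --quiet --status --strict -c 'data.osm.pbf.md5') \
  && echo 'md5 check passed'
```

Without the rewrite, md5sum -c would look for a file with the upstream name and fail even though the data itself is intact.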

The next step is easy, downloading the planet, and checking the download wasn’t corrupted. It helps to have a good connection here.

$CURL -o "${PLANET_FILE}" "${PLANET_URL}" || { echo "Planet file failed to download"; exit 1; }

md5sum --quiet --status --strict -c "${PLANET_FILE}.md5" || { echo "md5 check failed"; exit 1; }

Libosmium is a popular library for manipulating OpenStreetMap data, and the osmium command can show metadata from the header of the file. The command osmium fileinfo data.osm.pbf tells us

Header:
  Bounding boxes:
    (9.47108,47.0477,9.63622,47.2713)
  With history: no
  Options:
    generator=osmium/1.5.1
    osmosis_replication_base_url=http://download.geofabrik.de/europe/liechtenstein-updates
    osmosis_replication_sequence_number=1764
    osmosis_replication_timestamp=2018-01-15T21:43:03Z
    pbf_dense_nodes=true
    timestamp=2018-01-15T21:43:03Z

The osmosis properties tell us where to go for the updates to the data we downloaded. Despite not needing the updates for this task, it’s useful to store this in the state.txt and configuration.txt files mentioned above.

Rather than try to parse osmium’s output, it has an option to just extract one field. We use this to get the base URL, and save that to configuration.txt

REPLICATION_BASE_URL="$(osmium fileinfo -g 'header.option.osmosis_replication_base_url' "${PLANET_FILE}")"
echo "baseUrl=${REPLICATION_BASE_URL}" > 'configuration.txt'

Replication sequence numbers need to be represented as a three-tiered directory structure, for example 123/456/789. By taking the number, padding it to 9 characters with 0s, and doing some sed magic, we get this format. From there, it’s easy to download the state.txt file representing the state of the data that was downloaded.

REPLICATION_SEQUENCE_NUMBER="$( printf "%09d" "$(osmium fileinfo -g 'header.option.osmosis_replication_sequence_number' "${PLANET_FILE}")" | sed ':a;s@\B[0-9]\{3\}\>@/&@;ta' )"

$CURL -o 'state.txt' "${REPLICATION_BASE_URL}/${REPLICATION_SEQUENCE_NUMBER}.state.txt"
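
For readers who find the sed expression opaque, the same padding and splitting can be sketched with bash substring expansion instead. This is an alternative illustration, not what the script uses, and seq_to_path is a hypothetical name.

```bash
#!/usr/bin/env bash

# Hypothetical helper: format a replication sequence number as a
# three-tiered path, for example 123456789 -> 123/456/789.
seq_to_path() {
  local padded
  # 10# forces base 10, so input with leading zeros is not read as octal.
  padded="$(printf '%09d' "$((10#$1))")"
  # Slice the 9 digits into three groups of three.
  printf '%s/%s/%s\n' "${padded:0:3}" "${padded:3:3}" "${padded:6:3}"
}

seq_to_path 1764        # prints 000/001/764
seq_to_path 123456789   # prints 123/456/789
```

The sed version avoids bash-specific syntax, but it relies on the GNU \B and \> extensions, so neither form is fully portable.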

After all this has been run, we’ve got the planet, its md5 file, and the state and configuration that correspond to the download.

Combining the code fragments, adding some comments, and cleaning up the files results in this shell script

#!/usr/bin/env bash

set -euf -o pipefail

PLANET_FILE='data.osm.pbf'

PLANET_URL='http://download.geofabrik.de/europe/liechtenstein-latest.osm.pbf'
PLANET_MD5_URL="${PLANET_URL}.md5"
CURL='curl -s -L'

# Clean up any remaining files
rm -f -- "${PLANET_FILE}" "${PLANET_FILE}.md5" 'state.txt' 'configuration.txt'

# Because the planet file name is set above, the provided md5 file needs altering
MD5="$($CURL "${PLANET_MD5_URL}" | cut -f1 -d' ')"
echo "${MD5}  ${PLANET_FILE}" > "${PLANET_FILE}.md5"

# Download the planet
$CURL -o "${PLANET_FILE}" "${PLANET_URL}" || { echo "Planet file failed to download"; exit 1; }

md5sum --quiet --status --strict -c "${PLANET_FILE}.md5" || { echo "md5 check failed"; exit 1; }

REPLICATION_BASE_URL="$(osmium fileinfo -g 'header.option.osmosis_replication_base_url' "${PLANET_FILE}")"
echo "baseUrl=${REPLICATION_BASE_URL}" > 'configuration.txt'

# sed to turn into / formatted, see https://unix.stackexchange.com/a/113798/149591
REPLICATION_SEQUENCE_NUMBER="$( printf "%09d" "$(osmium fileinfo -g 'header.option.osmosis_replication_sequence_number' "${PLANET_FILE}")" | sed ':a;s@\B[0-9]\{3\}\>@/&@;ta' )"

$CURL -o 'state.txt' "${REPLICATION_BASE_URL}/${REPLICATION_SEQUENCE_NUMBER}.state.txt"

OSMF Board election manifesto

Posted by pnorman on 25 November 2017 in English.

I’m Paul Norman, OSM user pnorman. I’ve been mapping since 2010, and involved in other facets of OpenStreetMap since 2011. For the last three years, I’ve been on the OSMF board, and am running for re-election. During my time I’ve seen the board grow in productivity, the finances become more stable, and good strides made in transparency.

Outside the board, I’m also involved with the OSMF on the Data Working Group, License Working Group, and Membership Working Group. As a software developer, I’m a maintainer of OpenStreetMap Carto and osm2pgsql, as well as being involved in many parts of the rendering toolchain.

In my work life I’m an independent software developer, working on map rendering, cartography, and PostGIS for clients. My main contract right now is with Wikimedia Foundation, as the developer on their maps team. In the past I’ve worked for CartoDB, Mapquest, and other companies.

Looking back at what I put in my 2014 manifesto, I’m moderately pleased with the progress we’ve made in both transparency and productive board meetings. Neither is perfect, but they’re a vast improvement over three years. Overall, I’m satisfied with my time on the board. I accomplished some of what I wanted to, and think my manifesto desires were realistic.

My concerns are now

Conflicts of interest

6/7 board members work with OSM somehow in their jobs. This includes four with employers who sell services based on OSM data and can easily run into conflicts of interest. We are not managing this, which might have worked in the past, but is not a good practice. There’s stuff we need to set up like having an email discussion out of sight of the people with conflicts. Right now it’s considered acceptable for a board member to take part in discussions where they have a conflict of interest. Clear rules would also protect board members from pressure from their employer.

On a working group, there’s occasionally been an intersection between my work and the WG. In these cases I’ve removed myself from the discussion. This is what we should all be doing on the board.

Unfortunately, as someone who is paid to work with OSM data, I run into conflicts of interest myself, but in practice, I have fewer than most, given the nature of who I work for.

Support, but not control

The job of the OSMF board is to support the mappers building the map, but not control them. I worry we are losing sight of that, and people increasingly want to exert control and consider the mappers secondary. We need to protect the ability for people to independently do activities, even if it’s not something the board agrees with.

Volunteer capacity

A lack of volunteers was an issue when I ran three years ago. It’s a bit better, but still one of the biggest issues facing the OSMF. Working groups need more people. A growing number of members have been attending board meetings, but I’d like to see multiple members at every meeting. We need good people on the board, but we also need an active membership who are interested in what we do, watch us, track that we deliver, and offer appreciation in return.

Location: Glenbrook, Glenbrooke North, New Westminster, Metro Vancouver Regional District, British Columbia, V3L 1V3, Canada

OpenStreetMap Carto release v4.3.0

Posted by pnorman on 17 September 2017 in English.

Dear all,

Today, v4.3.0 of the openstreetmap-carto stylesheet (the default stylesheet on openstreetmap.org) has been released.

Changes include

  • Moving ford and emergency phone to a new tagging scheme
  • Moving natural=tree to higher zoom level (z18+)
  • Changing embassy color to brown
  • Rendering name for waterway=dock
  • The same line wrap of amenities for all zoom levels
  • Fixing combined railway/highway ordering regression
  • Fixing line wrapping bug in Docker
  • Some documentation and code cleaning
  • Improve ferry line text legibility
  • Hide small theme parks and zoos
  • Use solid lines for admin borders at low zooms

Thanks to all the contributors for this release, including stevenLAD, a new contributor.

For a full list of commits, see https://github.com/gravitystorm/openstreetmap-carto/compare/v4.2.0...v4.3.0

As always, we welcome any bug reports at https://github.com/gravitystorm/openstreetmap-carto/issues

OpenStreetMap Carto release v3.2.0

Posted by pnorman on 17 April 2017 in English.

Dear all,

Today, v3.2.0 of the openstreetmap-carto stylesheet (the default stylesheet on openstreetmap.org) has been released.

Changes include

  • Render aeroway terminal buildings like other buildings
  • Removed rendering of landuse=farm
  • Added rendering for arts centre, fitness centre, plant nursery, mixed lift aerialways
  • Rendering for fens changed
  • Typography for point road-related features, addresses, and water features changed
  • Removed rendering of waterway=canal as an area
  • Take text properties of roads under construction from the type of road they will be

Thanks to all the contributors for this release including Richard Fairhurst and jnachtigall, new contributors.

For a full list of commits, see https://github.com/gravitystorm/openstreetmap-carto/compare/v3.1.0...v3.2.0

As always, we welcome any bug reports at https://github.com/gravitystorm/openstreetmap-carto/issues.

OpenStreetMap Carto release v3.1.0

Posted by pnorman on 29 January 2017 in English.

Dear all,

Today, v3.1.0 of the openstreetmap-carto stylesheet (the default stylesheet on openstreetmap.org) has been released.

Changes include

  • Added coffee shop rendering
  • Added health clinic rendering
  • Adjusted place label typography
  • Road shield rendering improvements
  • Internal code cleanups

Thanks to all the contributors for this release.

For a full list of commits, see https://github.com/gravitystorm/openstreetmap-carto/compare/v3.0.1...v3.1.0

As always, we welcome any bug reports at https://github.com/gravitystorm/openstreetmap-carto/issues.