OpenStreetMap

pnorman's Diary

Recent diary entries

It’s difficult to write in all map style languages. A style written in JSON, like a MapLibre GL style, has a few extra pain points because JSON is not designed for editing by humans.

Some “common” style languages are

  • CartoCSS
  • Mapnik XML
  • MapCSS
  • MapServer
  • MapLibre GL/Mapbox GL

Some, like CartoCSS, are designed for human editing, while others, like Mapnik XML, serve as a lower-level language. MapLibre GL falls into the latter category of not being designed for editing by humans. MapLibre GL preprocessors like glug were designed to help with this, but none of them have taken off. Other style projects like openstreetmap-americana have taken a different route: their developers have written a program in JavaScript that generates the style.

I’m taking yet another route. I’m creating a language that uses minimal pre-processing of its input to produce MapLibre GL. I don’t aim to solve every difficulty with MapLibre GL, only the ones that impact me the most. The end result will be a pre-processing language.
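As a rough illustration of what minimal pre-processing could look like, the sketch below substitutes variables into a nested style structure and emits plain JSON. This is not the language described in this post: the `$name` variable syntax, the `resolve_variables` function, and the sample layers are all hypothetical, and the output is only a fragment, not a complete valid style.

```python
import json


def resolve_variables(node, variables):
    """Recursively replace "$name" strings with values from `variables`."""
    if isinstance(node, dict):
        return {key: resolve_variables(value, variables) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_variables(item, variables) for item in node]
    if isinstance(node, str) and node.startswith("$"):
        return variables[node[1:]]  # a KeyError flags an undefined variable
    return node


# Hypothetical input; in practice it might be parsed from several commented files.
variables = {"water": "#aad3df"}
layers = [
    {"id": "ocean", "type": "fill", "paint": {"fill-color": "$water"}},
    {"id": "lake", "type": "fill", "paint": {"fill-color": "$water"}},
]

# Style fragment only: a real MapLibre GL style also needs sources and more.
style = {"version": 8, "layers": resolve_variables(layers, variables)}
style_json = json.dumps(style, indent=2)
```

Changing the colour of all water then means editing one entry in `variables` rather than hunting for every occurrence.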

The biggest problems I encounter when writing MapLibre GL are

  1. No comments

    Comments are essential so other readers understand what’s written.

  2. Everything has to be in one file.

    With large styles this is a burden. More than one file makes it easier to edit.

  3. Having to repeat definitions instead of using a variable.

    Something like a color or symbol definition might appear a dozen times in the style. If you want to change it, you need to make sure you got all the occurrences.

  4. Inability to make versions of the style in different colors.

    When you only want to change a few superficial elements of the style, you want to contain those changes to one file.

  5. Not having support for more colorspaces

    I work in perceptual colorspaces like Lch. It’s a lot of converting that the computer should automate.
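To give a sense of the conversion work involved in the last point, here is a minimal sketch of turning a colour into the hex string a style needs. It assumes that "Lch" means CIE LCh(ab) with a D65 white point, uses the commonly quoted rounded XYZ-to-sRGB matrix, and simply clips out-of-gamut colours; the function name `lch_to_hex` is made up for this example.

```python
import math

# D65 reference white, the white point used by sRGB (assumed here).
XN, YN, ZN = 0.95047, 1.0, 1.08883


def _f_inv(t):
    """Inverse of the CIELAB nonlinearity."""
    delta = 6 / 29
    return t ** 3 if t > delta else 3 * delta ** 2 * (t - 4 / 29)


def _gamma(c):
    """Linear-light value to gamma-encoded sRGB, clamped to [0, 1]."""
    c = min(max(c, 0.0), 1.0)
    return 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055


def lch_to_hex(lightness, chroma, hue_degrees):
    """Convert CIE LCh(ab) to an sRGB hex string; out-of-gamut colours are clipped."""
    # LCh is the polar form of Lab.
    a = chroma * math.cos(math.radians(hue_degrees))
    b = chroma * math.sin(math.radians(hue_degrees))
    # Lab -> XYZ
    fy = (lightness + 16) / 116
    x = XN * _f_inv(fy + a / 500)
    y = YN * _f_inv(fy)
    z = ZN * _f_inv(fy - b / 200)
    # XYZ -> linear sRGB
    red = 3.2406 * x - 1.5372 * y - 0.4986 * z
    green = -0.9689 * x + 1.8758 * y + 0.0415 * z
    blue = 0.0557 * x - 0.2040 * y + 1.0570 * z
    return "#" + "".join(f"{round(_gamma(v) * 255):02x}" for v in (red, green, blue))
```

For example, `lch_to_hex(100, 0, 0)` gives white and `lch_to_hex(0, 0, 0)` gives black. A preprocessor could call something like this wherever a colour is written in Lch.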

What issues have you found when writing MapLibre GL styles?

This blog post explains how I handle a typical bug report for the new OSMF Shortbread tiles. Here, I focus on a report from SomeoneElse, “island” seems to be missing from “place_labels”.

After verifying that the report is correct, I set up my editor environment. It’s useful to have an environment that syntax highlights Jinja SQL files, as well as other files. I use a Visual Studio Code-based editor with the Better Jinja plugin.

The issue is in the place_labels layer. After checking Shortbread, I see that place=island should show at zoom 10 or higher, so there is a bug. Tilekiln creates tiles by reading definitions from shortbread.yaml, so I check there for the place_labels definition.

place_labels:
    description: Holds label points for populated places.
    fields:
        kind: Value of OSM place tag
        name: *name
        name_en: *name_en
        name_de: *name_de
        population: Value of OSM population tag
    sql:
    - minzoom: 4
      maxzoom: 14
      file: shortbread_original/place_labels.04-14.sql.jinja2

This file shows that for zooms 4 to 14, the SQL for the layer is in shortbread_original/place_labels.04-14.sql.jinja2. Since this file is in shortbread_original, osm2pgsql-themepark created it, and it remains unchanged.
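The lookup described above can be sketched as follows. This is not Tilekiln's actual code: the function `sql_file_for_zoom` and the in-memory list are assumptions made for illustration, mirroring the `sql` entries of the layer definition.

```python
# Hypothetical in-memory form of the `sql` list from the layer definition.
sql_definitions = [
    {
        "minzoom": 4,
        "maxzoom": 14,
        "file": "shortbread_original/place_labels.04-14.sql.jinja2",
    },
]


def sql_file_for_zoom(definitions, zoom):
    """Return the SQL template covering `zoom`, or None if no entry applies."""
    for definition in definitions:
        if definition["minzoom"] <= zoom <= definition["maxzoom"]:
            return definition["file"]
    return None
```

With this definition, a zoom 10 tile uses the `04-14` template, while a zoom 3 tile has no SQL for this layer.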

SELECT
        ST_AsMVTGeom(geom, {{unbuffered_bbox}}, {{extent}}, {{buffer}}) AS way,
        name,
        name_de,
        name_en,
        kind,
        population
    FROM place_labels
    WHERE geom && {{bbox}}
        AND {{zoom}} >= minzoom
    ORDER BY population desc

There aren’t any obvious bugs in the SQL. There’s no filtering out of islands, so either the data isn’t making it into the place_labels table or it has the wrong zoom. The data is loaded by osm2pgsql, and shortbread.lua tells osm2pgsql how to do that.

themepark:add_topic('shortbread/places')


Minutely Shortbread tiles

Posted by pnorman on 29 February 2024 in English. Last updated on 5 March 2024.

I’ve put up a demo page showing my work on minutely updated vector tiles. This demo is using my work for the tiles and the Versatiles Colorful stylesheet.

With this year being the year of OpenStreetMap vector maps, I’ve been working on making vector tile maps that update minutely. Most maps don’t need minutely updates and are fine with daily or even weekly updates. Minutely updates on OpenStreetMap.org are a crucial part of the feedback cycle where mappers can see their edits right away and get inspired to map more often. Typically a mapper can make an edit and see their edit when reloading after 90-180 seconds, compared to the days or weeks of most OSM-based services, or the months or years of proprietary data sources.

Updating maps once a week can be done with a simple architecture that takes the OSM file for the planet and turns it into a single file containing all the tiles for the world. This can scale to daily updates, but not much faster. To do minutely updates we need to generate tiles one-by-one, since they change one-by-one. When combined with the caching requirements for osm.org, this is something no existing software solved.

For some time I’ve been working on Tilekiln, a small piece of software which leverages the existing vector tile generation of PostGIS, the standard geospatial database. Tilekiln is written specifically to meet the unique requirements of a default layer on osm.org. Recently, I’ve been working for the OSMF on setting up minutely updated vector tiles using the Shortbread schema. A schema is a set of definitions for what goes in the vector tiles, and Shortbread is a CC0 licensed schema that anyone can use and for which styles already exist.


I’ve been looking at how many tiles are changed when updating OSM data in order to better guide resource estimations, and have completed some benchmarks. This is the technical post with details; I’ll be doing a high-level post later.

Software like Tilemaker and Planetiler is great for generating a complete set of tiles, updated about once a day, but they can’t handle minutely updates. Most users are fine with daily or slower updates, but OSM.org users are different, and minutely updates are critical for them. All the current minutely ways to generate map tiles involve loading the changes and regenerating tiles when data in them may have changed. I used osm2pgsql, the standard way to load OSM data for rendering, but the results should be applicable to other ways including different schemas.

Using the Shortbread schema from osm2pgsql-themepark, I loaded the data with osm2pgsql and ran updates. osm2pgsql can output a list of changed tiles (“expired tiles”) and I did this for zoom 1 to 14 for each update. Because I was running this on real data, sometimes an update took longer than 60 seconds to process if it was particularly large, and in this case the next run would combine multiple updates from OSM. Combining multiple updates reduces how much work the server has to do at the cost of less frequent updates, and this has been well documented since 2012, but no one has looked at the impact from combining tiles.
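To make the effect of combining updates concrete, here is a small sketch that merges several expired-tile lists and counts distinct tiles per zoom. It assumes one `z/x/y` tile per line; the tile numbers are made up, and `unique_tiles_by_zoom` is a hypothetical helper rather than anything used for the benchmarks.

```python
from collections import Counter


def unique_tiles_by_zoom(expire_lists):
    """Merge several expired-tile lists and count distinct tiles per zoom."""
    tiles = set()
    for lines in expire_lists:
        for line in lines:
            line = line.strip()
            if line:
                z, x, y = (int(part) for part in line.split("/"))
                tiles.add((z, x, y))
    return Counter(z for z, _, _ in tiles)


# Made-up data: two consecutive updates that touch an overlapping area.
update_1 = ["14/8190/5450", "14/8191/5450", "13/4095/2725"]
update_2 = ["14/8191/5450", "14/8192/5450", "13/4095/2725", "13/4096/2725"]

# Tiles to regenerate if each update is processed on its own.
separate = sum(len(update) for update in (update_1, update_2))
# Tiles to regenerate if both updates are processed together.
combined = unique_tiles_by_zoom([update_1, update_2])
```

In this toy case, processing the updates separately means regenerating 7 tiles, while combining them means regenerating 5, because tiles touched by both updates are only rendered once.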

To do this testing I was using a Hetzner server with 2x1TB NVMe drives in RAID0, 64GB of RAM, and an Intel i7-8700 @ 3.2 GHz. osm2pgsql 1.10 was used, the latest version at the time. The version of themepark was equivalent to the latest version.

The updates were run for a week from 2023-12-30T08:24:00Z to 2024-01-06T20:31:45Z. There were some interruptions in the updates, but I did an update without expiring tiles after the interruptions so they wouldn’t impact the results.

To run the updates I used a simple shell script


Aggregating Fastly logs

Posted by pnorman on 1 September 2023 in English.

The Standard Tile Layer has a lot of traffic. On August 1st, a typical day, it had 2.8 billion requests served by Fastly, about 32 thousand a second. The challenges of scaling to this size are documented elsewhere, and we handle the traffic reliably, but something we don’t often talk about is the logging. In some cases, you could log a random sample of requests but that comes with downsides like obscuring low frequency events, and preventing some kinds of log analysis. Critically, we publish data that depends on logging all requests.
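The per-second figure follows directly from the daily total, as this quick check shows.

```python
requests_per_day = 2.8e9
seconds_per_day = 24 * 60 * 60  # 86,400

# Average rate over the day; peak traffic will be higher than this.
requests_per_second = requests_per_day / seconds_per_day
print(f"{requests_per_second:,.0f} requests/second")
```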

We query our logs with Athena, a hosted version of Presto, a SQL engine that, among other features, can query files on an object store like S3. Automated queries are run with tilelog, which runs daily to generate published files on usage of the standard tile layer.

As you might imagine, 2.8 billion requests is a lot of log data. Fastly offers a number of logging options, and we publish compressed CSV logs to Amazon S3. These logs are large, and suffer a few problems for long-term use because they:

  1. contain personal information like request details and IPs, that, although essential for running the service, cannot be retained forever;
  2. contain invalid requests, making analysis more difficult;
  3. are large, being 136 GB/day; and
  4. become slow to query, being compressed gzip files with the only indexing being the date and hour of the request, which is part of the file path.

To solve these problems we reformat, filter, and aggregate logs, which lets us delete old logs. We’ve done the first two for some time, and are now doing the third.
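The filter and aggregate steps can be sketched in a few lines. This is only an illustration of the idea: the real pipeline runs as SQL queries over the logs, and the field names, the tile path pattern, and the choice of grouping by hour and zoom are all assumptions made for this example.

```python
import re
from collections import Counter

# Assumed shape of a valid tile request path, e.g. /12/2048/1361.png
TILE_PATH = re.compile(r"^/(\d+)/(\d+)/(\d+)\.png$")


def aggregate(rows):
    """Count valid tile requests per (hour, zoom), discarding IPs and invalid paths."""
    counts = Counter()
    for row in rows:
        match = TILE_PATH.match(row["path"])
        if match is None:
            continue  # filter: not a valid tile request
        zoom = int(match.group(1))
        counts[(row["hour"], zoom)] += 1  # aggregate: the IP is never stored
    return counts


# Made-up log rows using documentation-only IP addresses.
rows = [
    {"hour": "2023-08-01T00", "ip": "192.0.2.1", "path": "/12/2048/1361.png"},
    {"hour": "2023-08-01T00", "ip": "192.0.2.2", "path": "/12/2049/1361.png"},
    {"hour": "2023-08-01T00", "ip": "192.0.2.3", "path": "/wp-login.php"},
    {"hour": "2023-08-01T01", "ip": "192.0.2.1", "path": "/3/4/2.png"},
]
summary = aggregate(rows)
```

The aggregated result has no personal information and is far smaller than the raw rows, which is what makes it safe to keep after the original logs are deleted.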


Maxar usage over the last year

Posted by pnorman on 8 July 2023 in English. Last updated on 9 July 2023.

I was curious about the usage of Maxar over the last year, so did some quick work to see where it was used. To start, I used a Python 3 version of ChangesetMD to load the latest changesets into PostgreSQL, using the -g option to create geometries.

I then, with a bit of manual work, identified that the changesets of the last year are those with IDs between 122852000 and 137769483. Using this, and knowledge of tags normally used with Maxar, I created a materialized view with just the Maxar changesets

CREATE MATERIALIZED VIEW maxar AS
SELECT id,
    num_changes,
    st_centroid(geom)
FROM osm_changeset
WHERE id BETWEEN 122852000 AND 137769483
    AND (tags->'source' ILIKE '%maxar%' OR tags->'imagery_used' ILIKE '%maxar%');

This created a table of 2713316 changesets, which is too many to directly view, so I needed to get it by country.

I did this with the border data from country-coder

curl -OL 'https://raw.githubusercontent.com/rapideditor/country-coder/main/src/data/borders.json'
ogr2ogr -f PostgreSQL PG:dbname=changesetmd borders.json

This loaded a quick and dirty method of determining the country a point is in into the DB, allowing me to join the tables together

SELECT COALESCE(iso1a2, country), COUNT(*)
FROM maxar JOIN borders ON ST_Within(maxar.st_centroid, borders.wkb_geometry)
GROUP BY COALESCE(iso1a2, country)
ORDER BY COUNT(*) DESC;


Tilelog country data

Posted by pnorman on 22 May 2023 in English.

I added functionality to tilelog to generate per-country usage information for the OSMF Standard Map Layer. The output of this is a CSV file, generated every day, which contains country code, number of unique IPs that day, tiles per second, and tiles per second that were a cache miss, all for each country code.

With a bit of work, I manipulated the files to give me the usage from the 10 countries with the most usage, for the first four months of 2023.

Tile usage per country by date

Perhaps more interesting is looking at the usage for each country by the day of week.


The Operations Working Group is looking at what it would take to deprecate HTTP Basic Auth and OAuth 1.0a in favour of OAuth 2.0 on the main API in order to improve security and reduce code maintenance.

Some of the libraries that the software powering the API relies on for OAuth 1.0a are unmaintained, there is currently a need to maintain two parallel OAuth interfaces, and HTTP Basic Auth requires bad password management practices. OAuth 2.0 libraries should be available for every major language.

We do not yet have a timeline for this, but do not expect to shut off either this year. Before action is taken, we will send out more notifications. Deprecation may be incremental, e.g., we may shut off creation of new applications as an earlier step.

What can you do to help?

If you are developing new software that interacts with the OSM API, use OAuth 2.0 from the start. Non-editing software can require authentication support, e.g. software that checks if you have an OSM login.

If you maintain existing software, then look into OAuth 2.0 libraries that can replace your OAuth 1.0a ones. We do not recommend implementing support for either protocol version “by hand”, as libraries are readily available and history has shown that implementing your own support is prone to errors.

If you do not develop software that interacts with the OSM API, this change will not directly impact you. You may need to update software you use at some point.

I have been developing Street Spirit, a new style using OpenStreetMap data. It uses MapLibre GL for client-side rendering of MVTs generated by Tilekiln, which supports minutely updates using the standard osm2pgsql toolchain.

To focus style development, I have set its aims as being suitable for

  • use as a locator map,
  • showing off what can be done with OpenStreetMap data,
  • being up-to-date with the latest OpenStreetMap data, and
  • orienting a viewer to a location they are at.

Although not complete - if a style can ever be said to be complete - it is at the point where there are enough features to give the overall feel of the map, at least for zooms 12 and higher. Lower zooms are missing many features still, particularly roads and rail and some landcover and other fills.

Because the style has a more clearly defined purpose, I’ve been able to use more of the colour palette than many other styles, particularly compared to styles designed for overlaying other data on top of.

I’ve set up a dev instance on one of my servers, using OSM data from 2023-02-27. Have an explore around.

Some of the bigger areas that need work are

  • Missing mid- and low-zoom features
  • Missing fills
  • A consistent set of POI icons
  • More POIs

If you’re interested in contributing to the work, let me know. Contributing will require some technical knowledge in the following areas

  • MapLibre GL style specification, focusing on layers and expressions, including data-driven expressions;
  • YAML, in particular appropriate indentation for arrays. MapLibre GL styles tend to feature deeply nested arrays; and
  • SQL for writing read-only PostGIS queries if modifying vector tiles.


OpenStreetMap Carto release v5.7.0

Posted by pnorman on 11 January 2023 in English.

Dear all,

Today, v5.7.0 of the OpenStreetMap Carto stylesheet (the default stylesheet on the OSM website) has been released. Once changes are deployed on openstreetmap.org it will take a couple of days before all tiles show the new rendering.

Changes include

  • Unpaved roads are now indicated on the map (#3399)

  • Country label placement improved, particularly for countries in the north (#4616)

  • Added elevation to wilderness huts (#4648)

  • New index for low-zoom performance (#4617)

  • Added a script to switch between script variations for CJK languages (#4707)

  • Ordering fixes for piers (#4703)

  • Numerous CI improvements

Thanks to all the contributors for this release, including wyskoj, tjur0, depth221, SlowMo24, altilunium, and cklein05, all new contributors.

For a full list of commits, see https://github.com/gravitystorm/openstreetmap-carto/compare/v5.6.2…v5.7.0

As always, we welcome any bug reports at https://github.com/gravitystorm/openstreetmap-carto/issues

OSM usage by country

Posted by pnorman on 22 November 2022 in English.

I gathered some statistics about usage of the website and tiles in 2022Q3.

I looked at total tile.osm.org usage, tile.osm.org usage from osm.org itself, osm.org visits, and osm.org unique visitors.

Here’s the data for the top 20 countries.

Country  osm.org tile requests  Total tile requests  Website visits  Website unique visitors
DE       17.79%                 7.78%                8.27%           7.98%
RU       12.23%                 8.49%                2.47%           2.43%
US       8.72%                  9.22%                13.11%          13.56%
PL       7.69%                  4.99%                3.09%           2.80%
GB       4.85%                  3.68%                4.42%           4.42%
FR       4.79%                  7.00%                3.91%           3.94%
NL       3.62%                  3.31%                2.17%           2.09%
IT       3.49%                  3.46%                4.74%           4.86%
IN       2.64%                  2.66%                3.67%           3.16%
CN       2.62%                  0.79%                2.65%           2.72%
AT       2.03%                  0.89%                0.98%           0.91%
UA       1.78%                  1.98%                1.20%           1.21%
CH       1.41%                  0.71%                0.83%           0.82%
CA       1.29%                  1.59%                1.36%           1.39%
BE       1.29%                  1.06%                1.10%           1.03%
ES       1.27%                  2.41%                2.32%           2.39%
JP       1.10%                  1.54%                1.74%           1.71%
AU       1.09%                  0.92%                0.88%           0.82%
SE       0.91%                  0.95%                0.87%           0.88%
FI       0.89%                  0.74%                0.74%           0.71%

I’ve put the full data into a gist on GitHub.

OpenStreetMap Carto release v5.6.1

Posted by pnorman on 12 August 2022 in English.

Dear all,

Today, v5.6.1 of the OpenStreetMap Carto stylesheet (the default stylesheet on the OSM website) has been released. Once changes are deployed on openstreetmap.org it will take a couple of days before all tiles show the new rendering.

Changes include

  • Fixing rendering of water bodies on zooms 0 to 4

Thanks to all the contributors for this release.

For a full list of commits, see https://github.com/gravitystorm/openstreetmap-carto/compare/v5.6.0…v5.6.1

As always, we welcome any bug reports at https://github.com/gravitystorm/openstreetmap-carto/issues

OpenStreetMap Carto release v5.6.0

Posted by pnorman on 3 August 2022 in English.

Dear all,

Today, v5.6.0 of the OpenStreetMap Carto stylesheet (the default stylesheet on the OSM website) has been released. Once changes are deployed on openstreetmap.org it will take a couple of days before all tiles show the new rendering.

Changes include

  • using locally installed fonts instead of system fonts, for more up to date fonts;
  • changing tree and tree row colours to the same colour as areas with trees;
  • rendering parcel lockers; and
  • rendering name labels of bays and straits from z14 only, and lakes from z5

Thanks to all the contributors for this release including GoutamVerma, yvecai, ttomasz, and Indieberrie, new contributors.

For a full list of commits, see https://github.com/gravitystorm/openstreetmap-carto/compare/v5.5.1…v5.6.0

As always, we welcome any bug reports at https://github.com/gravitystorm/openstreetmap-carto/issues

OpenStreetMap Carto could use more help reviewing pull requests, so if you’re able to, please head over to GitHub and review some of the open PRs.

This is a bit less OpenStreetMap related than normal, but has to do with the Standard Tile Layer and an outage we had this month.

On July 18th, the Standard Tile Layer experienced degraded service, with 4% of traffic resulting in errors for 2.5 hours. A significant factor in the time to resolve the incident was a lack of visibility of the health status of the rendering servers. The architecture consists of a content delivery network (CDN) hosted by Fastly, backed by 7 rendering servers. Fastly, like most CDNs, offers automatic failover of backends by fetching a URL on the backend server and checking its response. If the response fails, it will shift traffic to a different backend.

A bug in Apache resulted in the servers being able to handle only a reduced number of connections, causing a server to fail the health check, diverting all load to another server. This repeated with multiple servers, sending the load between them until the first server responded to the health check again because it had zero load. Because the servers were responding to most of the manually issued health checks and we had no visibility into how each Fastly node was directing its traffic, it took longer to find the cause than it should have.

Our normal monitoring is provided by Statuscake, but this wasn’t enough here. Instead of increasing the monitoring, we wanted to make use of the existing Fastly health checks, which probe the servers from 90 different CDN points. Besides being a vastly higher volume of checks, this more directly monitors the health checks that matter for the service.

During the incident, Fastly support provided some details on how to monitor health check status. Based on this guide, the OWG has set up an API on the tile CDN to indicate backend health, and monitoring to track this across all POPs.
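As a rough sketch of what tracking backend health across POPs involves, the example below counts how many POPs currently consider each backend unhealthy. The POP codes, backend names, and report structure are invented for illustration; this does not reflect Fastly's API or the OWG's actual monitoring setup.

```python
from collections import defaultdict


def pops_marking_down(reports):
    """Map each backend to the number of POPs currently marking it unhealthy."""
    down = defaultdict(int)
    for _pop, backend, healthy in reports:
        down[backend] += 0 if healthy else 1
    return dict(down)


# Made-up reports: (POP code, backend name, healthy?)
reports = [
    ("AMS", "render-a", True), ("AMS", "render-b", False),
    ("FRA", "render-a", True), ("FRA", "render-b", False),
    ("LHR", "render-a", False), ("LHR", "render-b", True),
]
summary = pops_marking_down(reports)
```

A view like this would have shown during the incident that different POPs disagreed about which servers were healthy, which a handful of manual checks could not reveal.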


OpenStreetMap Carto Release v5.5.1

Posted by pnorman on 13 July 2022 in English.

Dear all,

Today, v5.5.1 of the OpenStreetMap Carto stylesheet (the default stylesheet on the OSM website) has been released. Once changes are deployed on openstreetmap.org it will take a couple of days before all tiles show the new rendering.

The one change is a bugfix to the colour of gates (#4600)

For a full list of commits, see https://github.com/gravitystorm/openstreetmap-carto/compare/v5.5.0…v5.5.1

As always, we welcome any bug reports at https://github.com/gravitystorm/openstreetmap-carto/issues

OpenStreetMap Carto release v5.5.0

Posted by pnorman on 10 July 2022 in English.

Dear all,

Today, v5.5.0 of the OpenStreetMap Carto stylesheet (the default stylesheet on the OSM website) has been released. Once changes are deployed on openstreetmap.org it will take a couple of days before all tiles show the new rendering.

Changes include

  • Fixed colour mismatch of car repair shop icon and text (#4535)

  • Cleaned up SVG files to better align with Mapnik requirements (#4457)

  • Allow Docker builds on ARM machines (e.g. new Apple laptops) (#4539)

  • Allow file:// URLs in external data config and caching of downloaded files (#4468, #4153, #4584)

  • Render mountain passes (#4121)

  • Don’t use a cross symbol for more Christian denominations that don’t use a cross (#4587)

Thanks to all the contributors for this release, including stephan2012, endim8, danieldegroot2, and jacekkow, new contributors.

For a full list of commits, see https://github.com/gravitystorm/openstreetmap-carto/compare/v5.4.0…v5.5.0

As always, we welcome any bug reports at https://github.com/gravitystorm/openstreetmap-carto/issues

I’m working on publishing a summary of sites using tile.osm.org and want to know what format would be most useful for people.

The information I’ll be publishing is requests/second, requests/second that were cache misses, and domain. The first two are guaranteed to be numbers, while the last one is a string that will typically be a domain name like www.openstreetmap.org, but could theoretically contain a poisoned value like a space.

The existing logs which have tiles and number of requests are formatted as z/x/y N where z/x/y are tile coordinates and N is the number of accesses.
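For reference, a line in the existing format can be parsed with a couple of splits. The function name and the sample line are made up for this sketch.

```python
def parse_tile_line(line):
    """Parse a 'z/x/y N' line into ((z, x, y), count)."""
    tile, count = line.split()
    z, x, y = (int(part) for part in tile.split("/"))
    return (z, x, y), int(count)


# Made-up sample line: tile 12/2048/1361 was accessed 57 times.
tile, accesses = parse_tile_line("12/2048/1361 57")
```

This simplicity is what makes the space-separated format attractive, as long as no field can itself contain a space.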

My first thought was TPS TPS_MISS DOMAIN, space-separated like the existing logs. This would work, with the downside that it’s not very future proof. Because the domain can theoretically have a space, it has to be last. This means that any future additions will require re-ordering the columns, breaking existing usage. Additionally, I’d really prefer to have the domain at the start of the line.

A couple of options are

  • CSV, with escaping
  • tab-delimited
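To show how the CSV option would cope with a poisoned domain, here is a small sketch using Python's standard `csv` module. The header names and the second row are made up; the point is only that quoting lets the domain come first and survive a round trip even when it contains commas, spaces, or quotes.

```python
import csv
import io

rows = [
    ("www.openstreetmap.org", 1453.99, 464.1),
    ('bad domain,with "quotes"', 0.01, 0.0),  # made-up poisoned value
]

# Write the rows with the domain as the first column.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(("domain", "tps", "tps_miss"))  # hypothetical header names
writer.writerows(rows)
output = buffer.getvalue()

# Read it back to confirm the poisoned value is recovered intact.
parsed = list(csv.reader(io.StringIO(output)))
```

Because fields are identified by position within a quoted format, new columns could later be appended without reordering the existing ones.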

Potential users, what would work well with the languages and libraries you prefer?

An example of the output right now is

1453.99 464.1 www.openstreetmap.org  
310.3 26.29 localhost
136.46 39.68 dro.routesmart.com
123.65 18.54 www.openrailwaymap.org
107.98 0.05 www.ad-production-stage.com
96.64 1.78 r.onliner.by
91.42 0.16 solagro.org
87.83 1.53 tvil.ru
84.88 12.98 eae.opekepe.gov.gr
74.0 2.32 www.mondialrelay.fr
63.44 1.93 www.lightningmaps.org
63.22 14.01 nakarte.me
55.1 0.74 qualp.com.br
52.77 11.25 apps.sentinel-hub.com
46.68 4.07 127.0.0.1
46.3 1.96 www.gites-de-france.com
43.47 1.15 www.anwb.nl
42.46 10.52 dacota.lyft.net
41.13 6.63 www.esri.com
40.84 0.69 busti.me

The OpenStreetMap Foundation runs several services subject to usage policies.

If you violate the policies, you might be automatically or manually blocked, so I decided to write a post to help community members answer questions from people who got blocked. If you’re a blocked user, the best place to ask is in the IRC channel #osm-dev on irc.oftc.net. Stick around a while to get an answer.

The most important question is which API is being used. For this, look at the URL you’re calling.

If the URL contains nominatim.openstreetmap.org, review the usage policy. The most common cause of being blocked is bulk geocoding exceeding 1 request per second. Going over this will trigger automatic IP blocks. These are automatically lifted after several hours, so stop your process, fix it, wait, and then you won’t be blocked.
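One way to stay under the limit is to pace requests on the client side. The sketch below is a generic throttle, not part of any Nominatim client library; the class name and its parameters are made up, and staying under 1 request per second does not by itself guarantee compliance with the rest of the usage policy.

```python
import time


class Throttle:
    """Block so that calls to wait() are at least `min_interval` seconds apart."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.clock = clock
        self.sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        now = self.clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self._last = now
```

Create one `Throttle()` for the whole process and call its `wait()` method immediately before each request, so that every request shares the same pacing.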

If you’re using nominatim but not exceeding 1 request per second, to get help you should provide the URL you’re calling, the HTTP User-Agent or Referer you’re sending, the IP you’re requesting from, and the HTTP response code.

If you’re calling tile.openstreetmap.org or displaying a map, review the tile usage policy. The most common causes of being blocked are tile scraping and apps that don’t follow the usage policy.

To get help you should provide where the map is being viewed (e.g. an app, website, or something else), the HTTP User-Agent or Referer you’re sending, the IP you’re requesting from, and the HTTP response code. For a website, you can generally get this information through the browser’s developer tools. The tile.openstreetmap.org debug page will also show you this information.

If you’re having problems with an app that you’re not the developer of, you’ll often need to contact them, as they are responsible for correctly calling the services.

OpenStreetMap Carto release v5.4.0

Posted by pnorman on 23 September 2021 in English.

Dear all,

Today, v5.4.0 of the OpenStreetMap Carto stylesheet (the default stylesheet on the OSM website) has been released. Once changes are deployed on openstreetmap.org it will take a couple of days before all tiles show the new rendering.

Changes include

  • Added a new planet_osm_line_label index (#4381)
  • Updated Docker development setup to use official PostGIS images (#4294)
  • Fixed endline conversion issues with python setup scripts on Windows (#4330)
  • Added detailed rendering of golf courses (#4381, #4467)
  • De-emphasized street-side parking (#4301)
  • Changed subway stations to start text rendering at z15 (#4392)
  • Updated road shield generation scripts to Python 3 (#4453)
  • Updated external data loading script to support psycopg2 2.9.1 (#4451)
  • Stopped displaying tourism=information with unknown information values
  • Switched the Natural Earth URL to point at its new location (#4466)
  • Added more logging to the external data loading script (#4472)

Thanks to all the contributors for this release including ZeLonewolf, kolgza, and map-per, new contributors.

For a full list of commits, see https://github.com/gravitystorm/openstreetmap-carto/compare/v5.3.1…v5.4.0

As always, we welcome any bug reports at https://github.com/gravitystorm/openstreetmap-carto/issues

OpenStreetMap Standard Layer: Requests

Posted by pnorman on 29 July 2021 in English. Last updated on 30 July 2021.

This blog post is a version of my recent SOTM 2021 presentation on the OpenStreetMap Standard Layer and who’s using it.

With the switch to a commercial CDN, we’ve improved our logging significantly and now have the tools to collect and analyze logs. We log information on both the incoming request and our response to it.

We log

  • user-agent, the program requesting the map tile;
  • referrer, the website containing a map;
  • some additional headers;
  • country and region;
  • network information;
  • HTTP protocol and TLS version;
  • response type;
  • duration;
  • size;
  • cache hit status;
  • datacenter; and
  • backend rendering server.

We log enough information to see what sites and programs are using the map, and additional debugging information. Our logs can easily be analyzed with a hosted Presto system, which allows querying large amounts of data in logfiles.

I couldn’t do this talk without the ability to easily query this data and dive into the logs. So, let’s take a look at what the logs tell us for two weeks in May.
