OpenStreetMap

I’ve been doing some log analysis on requests to the OSMF-hosted standard tile layer on tile.openstreetmap.org. To do this I downloaded two hours’ worth of logs and loaded them into PostgreSQL to run some queries. The logs start at 1600 UTC on 2021-02-25 and total 11GB compressed.

My main concern has been backend server load, so to analyze that I looked at cache misses - requests where the cache has to request a tile from the OSMF-operated backend servers. Typically the number of tiles requested is going to be five to ten times higher than the number of misses, but it will vary by zoom.

Total cache misses were 3437.4 per second.
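As a rough sketch of the arithmetic behind this figure: a per-second rate is a miss count divided by the length of the log window, and the five-to-ten-times rule of thumb gives a range for total tile requests. The function names are hypothetical, and the miss count is back-calculated from the published rate rather than taken from the logs.

```python
from typing import Tuple

# Two hours of logs, as described in the post.
WINDOW_SECONDS = 2 * 60 * 60


def misses_per_second(miss_count: int, window_seconds: int = WINDOW_SECONDS) -> float:
    """Average cache misses per second over the log window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return miss_count / window_seconds


def estimated_requests_per_second(miss_rate: float) -> Tuple[float, float]:
    """Rough range for total tile requests, assuming 5x to 10x the miss rate."""
    return (5 * miss_rate, 10 * miss_rate)


# Assumption: back-calculated from the published 3437.4/s, so only approximate.
approx_misses = round(3437.4 * WINDOW_SECONDS)

rate = misses_per_second(approx_misses)
low, high = estimated_requests_per_second(rate)
print(f"{rate:.1f} misses/s, roughly {low:.0f} to {high:.0f} tile requests/s")
```

Since the multiplier varies by zoom, the request range is only an order-of-magnitude guide.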

The top five referers, as well as some interesting ones, are:

| Referer domain | Cache misses per second |
|---|---|
| None | 1254.3 |
| www.openstreetmap.org | 418.7 |
| www.openrailwaymap.org | 16.0 |
| apps.sentinel-hub.com | 15.0 |
| m.turkiye.gov.tr | 13.1 |
| localhost, on various ports | 30.7 |
| 10.* IPs | 14.2 |
| Other | 1675.4 |

The top sites vary by time and what parts of the world are awake, but most of the traffic is from the long tail of small sites, OpenStreetMap itself, or apps, which send no referer and should instead identify themselves with a custom user-agent.
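For illustration, the referer grouping above could be sketched along these lines. This is not the query that was actually run: the function names, the "-" placeholder for a missing referer, and the "Unparseable" bucket are all assumptions.

```python
import ipaddress
from collections import Counter
from typing import Dict, Iterable, Optional
from urllib.parse import urlsplit

PRIVATE_10 = ipaddress.ip_network("10.0.0.0/8")


def referer_bucket(referer: Optional[str]) -> str:
    """Map a raw Referer header to a reporting bucket (hypothetical scheme)."""
    # Assumption: a missing referer is logged as empty or as "-".
    if not referer or referer == "-":
        return "None"
    try:
        host = urlsplit(referer).hostname
    except ValueError:  # e.g. a malformed IPv6 literal
        host = None
    if not host:
        return "Unparseable"  # assumption: kept apart from "None"
    if host == "localhost":
        return "localhost, on various ports"
    try:
        if ipaddress.ip_address(host) in PRIVATE_10:
            return "10.* IPs"
    except ValueError:
        pass  # host is a domain name, not an IP literal
    return host


def miss_rates(referers: Iterable[Optional[str]], window_seconds: float) -> Dict[str, float]:
    """Cache misses per second for each bucket over the log window."""
    counts = Counter(referer_bucket(r) for r in referers)
    return {bucket: n / window_seconds for bucket, n in counts.items()}


# Made-up sample rows, one per cache miss.
sample = [
    None,
    "https://www.openstreetmap.org/",
    "http://localhost:8080/map.html",
    "http://10.1.2.3/viewer",
]
print(miss_rates(sample, window_seconds=2.0))
```

Checking the parsed host against 10.0.0.0/8 rather than matching the string prefix avoids misfiling a domain name that happens to start with "10.".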

For user-agents, I grouped different versions of some apps together. Like before, I’ve got some of the top ones, then a few interesting ones.

| User-Agent | Cache misses per second |
|---|---|
| MapProxy, all versions | 281.8 |
| QGIS, all versions | 69.9 |
| Fake FF 84 | 46.8 |
| Marble, all versions | 34.3 |
| ArcGIS Client Using WinInet | 32.8 |
| StreetView, all versions | 29.6 |
| com.caynax.sportstracker, all versions | 24.7 |
| Maperitive, all versions | 24.2 |
| JOSM, all versions | 22.9 |
| Fake Chrome 25 | 22.2 |
| Amazon CloudFront | 13.5 |
| OruxMaps, all versions | 12.4 |
| Fake FF 77 | 11.7 |
| 173A220003203F293A2E3C2A | 10.7 |
| cgeo | 5.9 |
| Other | 597.1 |

The fake user-agents are in the process of being blocked now.
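Grouping different versions of an app together can be sketched with a small list of patterns. The patterns and sample user-agent strings below are illustrative guesses, not the rules used for the table above.

```python
import re
from collections import Counter

# Hypothetical patterns; the first match wins.
UA_GROUPS = [
    (re.compile(r"^MapProxy"), "MapProxy, all versions"),
    (re.compile(r"QGIS"), "QGIS, all versions"),
    (re.compile(r"^JOSM"), "JOSM, all versions"),
    (re.compile(r"^Amazon CloudFront$"), "Amazon CloudFront"),
]


def ua_group(user_agent: str) -> str:
    """Collapse a raw User-Agent string into a reporting group."""
    for pattern, label in UA_GROUPS:
        if pattern.search(user_agent):
            return label
    return "Other"


# Made-up sample rows, one per cache miss.
sample = [
    "MapProxy-1.12.0",
    "MapProxy-1.13.0",
    "Mozilla/5.0 QGIS/3.16.3-Hannover",
    "JOSM/1.5 (17428 en) Linux",
    "SomeApp/1.0",
]
counts = Counter(ua_group(ua) for ua in sample)
for label, n in counts.most_common():
    print(label, n)
```

Dividing each count by the length of the log window in seconds gives a per-second rate for each group.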

As with sites, the long tail is a significant portion of the load. Substantial chunks come from OSM-related apps and FOSS geo-related apps (QGIS, Marble, cgeo), and the biggest single source is caching proxies like MapProxy and Amazon CloudFront.

Overall, the usage is:

| Source | Cache misses per second |
|---|---|
| OpenStreetMap website | 418.7 |
| Caching proxies | 295.3 |
| Other geospatial apps | 91.3 |
| Fakes | 91.4 |
| QGIS | 69.9 |
| OSM editing apps | 52.5 |
| Internal and testing IPs | 44.9 |
| Other websites | 1719.5 |
| Other apps | 653.9 |
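As a quick sanity check, the rows of this breakdown can be summed and expressed as shares of the total. The numbers are copied from the table; the variable names are only for illustration.

```python
# Cache misses per second by source, copied from the breakdown above.
usage = {
    "OpenStreetMap website": 418.7,
    "Caching proxies": 295.3,
    "Other geospatial apps": 91.3,
    "Fakes": 91.4,
    "QGIS": 69.9,
    "OSM editing apps": 52.5,
    "Internal and testing IPs": 44.9,
    "Other websites": 1719.5,
    "Other apps": 653.9,
}

total = sum(usage.values())

# Print each source with its share of all cache misses, largest first.
for source, rate in sorted(usage.items(), key=lambda kv: kv[1], reverse=True):
    share = 100 * rate / total
    print(f"{source:<26} {rate:>7.1f}/s {share:5.1f}%")

print(f"{'Total':<26} {total:>7.1f}/s")
```

The rows add up to the 3437.4 cache misses per second reported at the top, with the two "Other" rows together accounting for more than two thirds of it.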