OpenStreetMap

Several people have written on the subject before: when you look at something like the evolution of road network length in OSM, the shape of the curve can tell you something about how complete the network is (on the condition that there are enough local mappers).

This graph shows this evolution for the main roads in Flanders.

[Figure: network growth]

You can clearly see that the larger roads were mapped faster than the smaller roads. (Note: there is a bug in the OSM-history-importer which prevents deleted objects from being removed from a snapshot. This could explain the continued slight growth of main roads: when people improve roads, they often delete small portions of them.)

Assuming they are all more or less complete now, you can show the evolution of length as a percentage of current length. This shows quite clearly that there are “mapping priorities”: the 60% completion mark comes much sooner for motorways than it does for tertiaries.

[Figure: network completeness]
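The “percentage of current length” idea can be sketched in a few lines of Python. This is only an illustration, not the processing behind the graphs: the function names and the snapshot numbers are made up, and it assumes the latest snapshot stands in for a complete network.

```python
from datetime import date


def completeness(series):
    """Express each snapshot length as a percentage of the latest length.

    series: list of (date, length_km) tuples, ordered by date.
    Assumption: the last snapshot represents the 'complete' network.
    """
    final_length = series[-1][1]
    return [(day, 100.0 * length / final_length) for day, length in series]


def first_reached(series, threshold_pct):
    """Return the first snapshot date at or above threshold_pct, or None."""
    for day, pct in completeness(series):
        if pct >= threshold_pct:
            return day
    return None


# Made-up snapshot lengths in km, for illustration only.
motorway = [
    (date(2008, 1, 1), 400.0),
    (date(2009, 1, 1), 800.0),
    (date(2017, 1, 1), 1000.0),
]
tertiary = [
    (date(2008, 1, 1), 500.0),
    (date(2011, 1, 1), 3000.0),
    (date(2017, 1, 1), 5000.0),
]

print(first_reached(motorway, 60))
print(first_reached(tertiary, 60))
```

With these invented numbers, the motorway series crosses the 60% mark at an earlier snapshot than the tertiary series, which is the pattern described above.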

While this all sounds quite obvious, it really isn’t if you look at the map of road evolution in Flanders. From the very beginning of mapping, contributors have been interested in small roads as well as main roads.

[Figure: road completion. Colors: black = main roads, yellow = minor roads, green = slow roads.]

If we extend our view to a wider range of roads, we can see that the main roads in general got mapped first, but minor roads soon came to dominate over them. Service roads, tracks and paths (footway, path, steps, bridleway, pedestrian) tell their own story.

[Figure: network length]

(Note: construction and proposed roads are removed from further graphs. I checked taginfo for alternative tagging styles, but those are also quite rare.)

Because these last types of roads haven’t reached their final form yet, we’ll show the yearly growth rate. As this growth was explosive in the first years, we’ll start in 2012.

[Figure: growth rate]
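For readers who want to reproduce a yearly growth rate on their own numbers, here is a minimal sketch. It assumes “growth rate” means the relative change compared to the previous year’s length; the function name and the figures are hypothetical.

```python
def yearly_growth_rates(length_by_year):
    """Return {year: percent growth relative to the previous year's length}.

    length_by_year: dict mapping a year to the network length at the end
    of that year. Years without a direct predecessor are skipped.
    """
    years = sorted(length_by_year)
    rates = {}
    for prev, cur in zip(years, years[1:]):
        previous_length = length_by_year[prev]
        if cur - prev != 1 or previous_length == 0:
            continue  # no meaningful year-over-year rate
        change = length_by_year[cur] - previous_length
        rates[cur] = 100.0 * change / previous_length
    return rates


# Made-up lengths in km, for illustration only.
paths = {2011: 1000.0, 2012: 1100.0, 2013: 1155.0}
print(yearly_growth_rates(paths))
```

A rate that stays well above zero year after year suggests the category is still being filled in, while a rate near zero suggests it is close to its final length.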

The graph clearly shows that main roads and minor roads aren’t really growing anymore. The growth rates for service roads, paths and tracks, however, seem to level off in 2014 instead of dropping further. In fact, paths and tracks go up in 2016. That means there is a lot of mapping left to do. It is surprising to me that this holds for tracks too, as they can be mapped more easily from aerial imagery alone. Open data sources of paths and high resolution aerial imagery (both provided by AGIV) could explain the uptick in the mapping of paths and tracks. Other explanations might be successful relations with the GR and Trage Wegen organisations, or increased contribution triggered by data use.

Network growth versus amount of work

One more thing I do want to share now is the amount of work that is being done. While network completeness was achieved quite fast for main roads, that does not mean that people stopped caring after it was finished. In the animated map of primaries, trunks and motorways below, gray means “existing” and black means “been worked on this month”.

[Figure: primary and bigger map]

These edits can be anything, but here are two examples: work on naming roads and on speed limits. From the beginning of the project, most residential roads were mapped with a name. The length of unnamed residentials started decreasing as early as 2012. It will likely never reach zero, as many small bits and pieces are hard to assign to any one street. Also, there are in fact roads that do not have a name.

[Figure: named residential roads]

For speed limits, the proportion that has a limit is much lower. Total length of untagged roads only started decreasing in 2014. This tagging is probably slower because it isn’t as important for routing and is sometimes seen as a consequence of road classification and location.

[Figure: speed limits]
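Both the name and the speed limit numbers come down to the same question: how much road length lacks a given tag? A minimal sketch, assuming each way is available as a dict with a length and its tags (this data layout and the sample ways are made up for illustration):

```python
def untagged_length(ways, key):
    """Sum the length of the ways that lack the given tag key."""
    return sum(w["length_km"] for w in ways if key not in w["tags"])


def tagged_share(ways, key):
    """Return the percentage of total length that carries the tag key."""
    total = sum(w["length_km"] for w in ways)
    if total == 0:
        return 0.0
    return 100.0 * (total - untagged_length(ways, key)) / total


# Hypothetical ways, for illustration only.
ways = [
    {"length_km": 3.0, "tags": {"highway": "residential", "name": "Kerkstraat"}},
    {"length_km": 1.0, "tags": {"highway": "residential", "maxspeed": "30"}},
]

print(tagged_share(ways, "name"))
print(untagged_length(ways, "maxspeed"))
```

Running this per snapshot for `name` and `maxspeed` would give curves comparable to the two graphs above.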

Measuring edits

These graphs compare the added length for main road types (right) and the number of edits by road type (left). It is quite clear that mapping new roads peaked as early as 2008, but the amount of work done on these roads kept going up until 2014.

[Figure: edits and growth]

(Note: here, the number of edits is the sum of the number of days a certain way has been edited. The category in which it shows is the last main tag for that day.)
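The counting rule in the note can be sketched like this. It is a reconstruction from the description, not the actual SPSS post processing: the function name and the input layout (one tuple per way version) are assumptions.

```python
from collections import Counter
from datetime import datetime


def count_edit_days(versions):
    """Count edit-days per road type.

    versions: iterable of (way_id, timestamp, highway_value) tuples.
    A way edited several times on one day counts once, under the
    highway value of its last version on that day.
    """
    last_tag = {}
    # Process versions in time order so later ones overwrite earlier ones.
    for way_id, timestamp, highway in sorted(versions, key=lambda v: v[1]):
        last_tag[(way_id, timestamp.date())] = highway
    return Counter(last_tag.values())


# Hypothetical version history, for illustration only.
versions = [
    (1, datetime(2014, 3, 1, 9, 0), "tertiary"),
    (1, datetime(2014, 3, 1, 17, 30), "secondary"),  # same day, retagged
    (1, datetime(2014, 3, 2, 8, 0), "secondary"),
    (2, datetime(2014, 3, 1, 12, 0), "residential"),
]

print(count_edit_days(versions))
```

In this example, way 1 contributes two edit-days to `secondary` and none to `tertiary`, because its last tag on the first day was already `secondary`.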

These two graphs show the type of changes for primary and tertiary roads. Initially, geometry changes are the most important. As time goes by, their share declines and editing tags becomes more important.

[Figure: edits mains]

In a more general sense, this holds true too. The number of edits peaks much later than the adding of new roads. In fact, for most road types, it doesn’t seem to go down at all.

[Figure: edits and growth]

What’s next?

As usual, I’m torn between answering more and more questions with the data, or scaling it up to more areas. Luckily, for your basic statistics needs, more and more options are finally popping up. See the road statistics provided by Mapbox, Steve Coast or the Missing Maps.

In the case of road network completeness, some efforts have been made to compare current OSM length to CIA stats to measure map completeness. This is problematic, because even if governments have decent stats, they are by their own local definition. Hence the comparison might be off. In the case of Flanders, we have a single, very good source for road lengths. One of the things I want to do next is to compare local lengths in OSM and official data. This could show us where OSM is probably not finished yet. But you can also calculate this based on the shape of the curves we’ve seen before. If both approaches give similar results, that would suggest that you do not need external data sources to evaluate OSM data completeness.
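As a rough idea of what such a local comparison could look like, here is a minimal sketch. The area names, the lengths and the 0.9 threshold are all invented; a real comparison would also need matching road definitions on both sides.

```python
def completeness_ratio(osm_km, official_km):
    """Return {area: OSM length / official length}.

    Areas with no official reference length are skipped.
    """
    return {
        area: osm_km.get(area, 0.0) / reference
        for area, reference in official_km.items()
        if reference > 0
    }


def likely_incomplete(osm_km, official_km, threshold=0.9):
    """List the areas where OSM has less than threshold times the official length.

    The default threshold is an arbitrary assumption, not a tested value.
    """
    ratios = completeness_ratio(osm_km, official_km)
    return sorted(area for area, ratio in ratios.items() if ratio < threshold)


# Made-up lengths in km per area, for illustration only.
osm_km = {"A": 95.0, "B": 60.0}
official_km = {"A": 100.0, "B": 100.0, "C": 0.0}

print(likely_incomplete(osm_km, official_km))
```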

Another thing is that we have noticed many new mappers first starting to map local paths. I’d like to see if this is a real evolution.

By focusing on road length, you measure both network completeness and level of detail, but neither very well. From a perspective of network completeness, you would have to discount things like cycleways that are mapped as separate ways, or only count dual carriageways once. An analysis detecting really new geometries would do that. I’m planning to do something like that “soon”. On the other hand, from a perspective of level of detail, road length lacks subtlety. Take the example of cycleway networks. You would have to count all highway=cycleway, but also all the roads that have cycleway tags, as part of the cycle network.
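The cycleway example could be expressed as a simple tag test. Which keys and values should count is a judgment call; the list below (`cycleway`, `cycleway:left`, `cycleway:right`, `cycleway:both`, ignoring the value `no`) is my assumption for this sketch, and the sample ways are made up.

```python
# Assumed set of keys that mark cycle infrastructure on a regular road.
CYCLEWAY_KEYS = ("cycleway", "cycleway:left", "cycleway:right", "cycleway:both")


def is_cycle_infrastructure(tags):
    """True for separate cycleways and for roads carrying a cycleway tag."""
    if tags.get("highway") == "cycleway":
        return True
    # A cycleway key only counts if it is present and not set to "no".
    return any(tags.get(key) not in (None, "no") for key in CYCLEWAY_KEYS)


def cycle_network_length(ways):
    """Sum the length of all ways that are part of the cycle network."""
    return sum(w["length_km"] for w in ways if is_cycle_infrastructure(w["tags"]))


# Hypothetical ways, for illustration only.
ways = [
    {"length_km": 2.0, "tags": {"highway": "cycleway"}},
    {"length_km": 3.0, "tags": {"highway": "secondary", "cycleway": "lane"}},
    {"length_km": 4.0, "tags": {"highway": "residential"}},
    {"length_km": 1.0, "tags": {"highway": "tertiary", "cycleway": "no"}},
]

print(cycle_network_length(ways))
```

Note that this counts a road with a cycle lane at its full length once, regardless of whether the lane runs on one side or both.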

But I told myself not to write articles that are too long to read in one go :) I might have failed.

Bonus: more animated maps

Because they are fun to make and to watch, here are some more animated maps.

Showing all major roads

Overlaying OSM on top of official road data (Wegenregister), to show where the map is complete

Focusing on “slow roads” (in green)

All data in this article copyright OpenStreetMap contributors, free to reproduce anywhere if source included. Download processed data here.

Discussion

Comment from PlaneMad on 16 February 2017 at 11:35

This is a beautiful post Joost, with some very nice approaches at evaluating the completeness of the map.

Wondering if you were able to find a pattern in what type of features were more actively mapped over time. Considering an active mapping community and declining edits on road features, was there a corresponding increase in edits to other types like addresses or turn restrictions?

From personal experience it seems like there is a very definite order in what features get completed on the map based on complexity and quantity of data to add: roughly roads, railways, street names, natural features, amenities, POIs, buildings, boundaries, navigation data, addresses. So maybe if the most heavily edited feature is addresses, it might point to more basic features like roads being more complete?

Also maybe numbers alone would never be able to tell objectively how ready a map is, the ultimate test might be to actually use the map successfully to navigate in the real world :)

Comment from joost schouppe on 16 February 2017 at 16:01

Thanks @PlaneMad!

Well, in fact, what I wanted to show is that “mapping priorities” do exist, but they are not absolute. Yes, the most important stuff gets mapped first, but as they get mapped, some other people are already working on more details. Which is great, because by the time the “big stuff” is ready, the smaller stuff has already found its data model.

I think part of your answer is already there: yes, as geometry edits go down, tag edits seem to increase. And then we’re talking about details. Turn restrictions are tricky, as they are relations and the osm-history-importer doesn’t do that.

In fact, using the scripts I have lying around, I can give you the numbers for all the things you mention. I’ll get around to writing about them, but I’ll publish them here asap. I have the splitter and importer in a local Vagrant instance, and post processing is just based on flatfiles using SPSS scripts (which are easy to read and translate to something more open). I’ll try to find out how to share that. Then it shouldn’t be very hard to replicate stuff like this.

Comment from Glenn Plas on 17 February 2017 at 12:05

Very nice work Joost, I love stats. Tx for the work. You seem to have plenty of bloggin time ;-)

Comment from joost schouppe on 18 February 2017 at 09:02

Thanks Glenn :) Wednesday is OSM day, that’s when this happens. However recently most of that time goes to organizing OSM stuff, rather than analysis.
