daniel-j-h's Diary Comments

Diary Comments added by daniel-j-h
GSoC 2023 - OSRM Bindings Project [Pt. 1]

Hey there, I was just pointed to your diary post here by a friend.

I worked on OSRM between roughly 2015 and 2019 and back then created an ABI/API-stable C wrapper which also comes with bindings to other languages (via ctypes); mostly for playing around and seeing where it would go.

https://github.com/daniel-j-h/libosrmc

Now, many years later, this is quite outdated, but maybe it helps as inspiration.

RoboSat ❤️ Tanzania

Re. https://www.openstreetmap.org/user/daniel-j-h/diary/44321#comment45014

“2- Convert the OSM GeoJSON file to a binary GeoTIFF using the following code: 3- Convert the binary image to tiles using gdal2tiles.py (the same as step 1)”

The rs rasterize command rasterizes GeoJSON features into Slippy Map tiles.
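
For intuition, here is a minimal sketch of what the per-tile rasterization boils down to, assuming lon/lat GeoJSON polygon geometries and the mercantile and rasterio libraries; an illustration, not robosat’s actual code:

import mercantile
import rasterio.features
import rasterio.transform

def rasterize_tile(geometries, x, y, z, size=512):
    # Affine transform mapping the tile's lon/lat bounds onto a size x size pixel grid
    west, south, east, north = mercantile.bounds(x, y, z)
    transform = rasterio.transform.from_bounds(west, south, east, north, size, size)
    # Burn value 1 into every pixel covered by a feature; the rest stays background
    return rasterio.features.rasterize(((g, 1) for g in geometries),
                                       out_shape=(size, size), transform=transform,
                                       dtype="uint8")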


Re. https://www.openstreetmap.org/user/daniel-j-h/diary/44321#comment45100

Once you have the predicted tiles you can serve them via HTTP and point e.g. a Mapbox GL JS map at them. There is an example for the compare map and the rs serve tool here:

https://github.com/mapbox/robosat/blob/1d0cf506cde4600ab7063c238f7f9e25d65ba611/robosat/tools/templates/map.html

You can serve the predicted tiles as simply as with python3 -m http.server 5000


Re. https://www.openstreetmap.org/user/daniel-j-h/diary/44321#comment45254

You check out tiles where your predictions are not in sync with OpenStreetMap: where we predict e.g. a building but OpenStreetMap says there should be none. Because OpenStreetMap is not “complete”, your model will

  • either predict a building where there is a building in the aerial imagery but not in OpenStreetMap: in this case your metrics will be lower, since we count this as an error based on the OpenStreetMap “ground truth”, or

  • predict a building where there is no building in the aerial imagery or in OpenStreetMap. This can happen e.g. if you never train on images with swimming pools but then predict on images with swimming pools. In this case you can add these tiles to your dataset with an all-background mask, nudging the model in the right direction so it learns not to predict swimming pools as buildings (see the sketch below).
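
A minimal sketch for writing such all-background masks for a list of tiles; the paths and the single-channel PNG mask format here are assumptions, adapt them to your dataset layout:

from pathlib import Path
from PIL import Image

def write_background_masks(tiles, out="dataset/training/labels", size=512):
    for z, x, y in tiles:
        path = Path(out) / str(z) / str(x)
        path.mkdir(parents=True, exist_ok=True)
        # All-zero mask: every pixel belongs to the background class
        Image.new("L", (size, size), 0).save(path / "{}.png".format(y))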


Hope this helps; sorry for the delay. It’s probably best if you join the robosat channel on the osmus Slack for quicker responses; there are quite a few folks there who are happy to help with questions.

Servus, Bayern: The robots are coming!

If you want to fine-tune pre-trained Robosat models then the weights saved in the .pth file have to exactly match the model architecture we use in Robosat.

Right now the model architecture is an encoder-decoder with skip connections, and we use a pre-trained ResNet50 for the encoder.

It is possible to

  • train this robosat model, save its weights in a .pth checkpoint file, and then load this file back for fine-tuning or prediction

  • use a different pre-trained encoder, e.g. a ResNet18 or ResNet34 if you want a smaller and faster encoder; then train the robosat model with this encoder and save the resulting weights into a .pth checkpoint file again (see the sketch below)
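
As a rough sketch of that checkpoint round-trip in PyTorch, using plain torchvision ResNets as stand-ins for the actual robosat encoder-decoder:

import torch
from torchvision.models import resnet18, resnet50

model = resnet50(weights=None)
torch.save(model.state_dict(), "checkpoint.pth")

# Loading back works only if the architecture matches the saved weights exactly
resnet50(weights=None).load_state_dict(torch.load("checkpoint.pth"))

# A different encoder, e.g. ResNet18, has different keys and shapes and raises:
# resnet18(weights=None).load_state_dict(torch.load("checkpoint.pth"))  # RuntimeError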

There are also some open pull requests where we experiment with different architectures. These architecture changes are not compatible with old .pth checkpoint files, which is one of the reasons we haven’t changed the core architecture so far.

Regarding your second question about estimating the dataset size: it depends on your use case, namely the zoom level you want to work with, the geographical area (lots of variety across the planet vs. a single city or country), how good your ground truth labels are, whether you want to invest time in manual dataset curation or refinement, and so on.

As a very rough guideline I recommend at least a couple thousand 512x512 px image tiles; then give it a try and see if it works. But as usual: the more data, the better.

Hope that helps,
Daniel

Servus, Bayern: The robots are coming!

Do your images roughly look like the Bavaria images? Do buildings roughly look the same? Then I would just give it a try and see what you get out of it.

There is also an option in the rs train tool to read in a checkpoint and fine-tune it to your specific dataset - maybe that’s an option if you can get your hands on more data (30 buildings sounds a bit low).

If you want to try it, make sure to convert your dataset into Slippy Map tiles on zoom level 17 (maybe ±1 zoom level) since the Bavaria model was trained on z17.

There’s a robosat channel on the osmus Slack in case you run into issues :)

RoboSat ❤️ Tanzania

The numbers in the tile files are Slippy Map tile x, y, z IDs.

See the OSM wiki and the docs on rs cover.

The masks are generated based on the GeoJSON files you give it. It can happen that your rasterized masks and the aerial raster tiles you downloaded are not in sync, e.g. there could be more rasterized masks than downloaded raster imagery. In that case simply loop over both datasets and copy over the tiles for which you have both a mask and an image (see the sketch below).
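
A minimal sketch of that syncing step, assuming both datasets use the z/x/y.png Slippy Map layout; the paths are hypothetical:

import shutil
from pathlib import Path

images, masks = Path("images"), Path("masks")
out = Path("synced")

# Keep only the z/x/y tiles present in both datasets
common = {p.relative_to(images) for p in images.glob("*/*/*.png")} \
       & {p.relative_to(masks) for p in masks.glob("*/*/*.png")}

for tile in common:
    for src in (images, masks):
        dst = out / src.name / tile
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy(src / tile, dst)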

RoboSat ❤️ Tanzania

Robosat v1.2 comes with batched extraction and batched rasterization. Before, we had to keep all features and images in memory during extraction and rasterization, which used up quite a lot of memory on larger datasets. We now flush batches to disk every now and then; the batches are somewhat arbitrary and not based on e.g. smaller geographic areas.
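
The underlying pattern is simple; here is a minimal sketch of the idea, not robosat’s actual implementation:

from itertools import islice

def batched(iterable, n):
    # Yield lists of up to n items so we never hold the whole dataset in memory
    it = iter(iterable)
    while batch := list(islice(it, n)):
        yield batch

# Hypothetical usage: process and flush one batch at a time
# for batch in batched(features, 1024):
#     rasterize_and_write(batch)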

See the batched extraction and batched rasterization pull requests, and check the v1.2 release notes: https://github.com/mapbox/robosat/releases/tag/v1.2.0

RoboSat ❤️ Tanzania

FYI, for folks following along this old Robosat on Tanzania drone imagery diary post: there are also more recent diary posts from the last two weeks.

RoboSat ❤️ Tanzania

Hey, the rs serve tool is mainly for debugging and quick development iteration cycles. You really should use the production-ready rs predict tool for efficient batch prediction.

There are also some limitations in rs serve:

  • the zoom level is currently hard-coded

  • it’s single-threaded only and does not do batch prediction (inefficient at scale)

  • it does not handle tile borders like we do in rs predict, so you might see artifacts at borders

  • even though the rs serve command takes host and port arguments, the map we serve to the browser currently assumes localhost:5000 for requesting tiles

You have to go in and adapt these manually right now. I’m also happy for pull requests and can help you along if you want to properly fix these issues and make rs serve more robust and user-friendly.

Hope that helps, Daniel

RoboSat v1.2.0 — state of the art losses, road extraction, batched extraction and rasterization

The follow-up post running robosat v1.2 on all of Bavaria’s 80 cm aerial imagery is here:

https://www.openstreetmap.org/user/daniel-j-h/diary/368771

Servus, Bayern: The robots are coming!

Absolutely! For Bavaria there is only 80 cm aerial imagery (openly) available. You can check it out in iD by selecting it as a background, e.g. go to

https://www.openstreetmap.org/edit#map=17/49.91130/10.89001

then go to Background (or press “b”) -> select “Bavaria (80 cm)”

Now pan around and see if you can distinguish what’s a building, what’s a small building, a shed, a carport, a parking lot, or a large car. It can be quite hard even for humans.

I used robosat on drone imagery (see this diary) and on the high-resolution Mapbox aerial imagery in North America (where you could see people walking around) back when I was working for them. In addition to the resolution there are multiple tricks to get higher quality predictions out of robosat, trading off training time / runtime or requiring manual dataset curation.


The use-cases I see for prediction even on the 80 cm aerial imagery are

  • change detection over the years to see how our cities evolve over time

  • finding unmapped areas or computing a score of how “complete” our map is

  • as a pre-filter / prioritization stage in tools like osmcha; if the robosat model roughly agrees with a changeset adding a building then we can let it go through; otherwise flag it for human inspection

RoboSat v1.2.0 — state of the art losses, road extraction, batched extraction and rasterization

I implemented polygonization for the parking use-case we had.

It’s implemented in the rs features tool and can be used after you have the model’s predictions (the probabilities you can see above) and have converted the (potentially multiple, for ensembles) probabilities to masks with the rs masks tool.

Check out the robosat readme and the linked diary posts above - they go a bit more into detail how the pipeline works before and after the prediction stage.

Here is the robosat parking polygonization; buildings are more or less the same, and in fact I’m using the parking handler as a building handler. It gets a bit ugly if you want to handle edge cases such as (potentially nested) (multi-)polygons, but oh well.
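
For the general idea, a minimal mask-to-polygon sketch with OpenCV; the actual handler additionally simplifies geometries, handles nesting, and converts pixel to lon/lat coordinates (the path here is hypothetical):

import cv2

mask = cv2.imread("masks/17/x/y.png", cv2.IMREAD_GRAYSCALE)
binary = (mask > 0).astype("uint8")

# Outer contours only; nested (multi-)polygons are where the edge cases live
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
polygons = [cv2.approxPolyDP(c, 2.0, True) for c in contours]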

The polygonization can definitely be improved. Check out their work - it’s quite nice, but they also run into edge cases and design trade-offs.

Happy to guide you along if you want to work on this in robosat or have ideas.

RoboSat ❤️ Tanzania

Hey, I just published a new release, v1.2.0 - read about it here. The official Docker images work again now, too. Here are the docs.

For zoom levels there is an open pull request: https://github.com/mapbox/robosat/pull/88

It should Just Work (tm) but I haven’t had the time to test it more thoroughly.

The problem there is that we use some pixel-based thresholds and heuristics, and depending on your zoom level they will (slightly) change. The pull request implements these thresholds based on meters and no longer based on pixels. You can check out the code and help me test it by running it on your dataset, checking if the results look reasonable, and playing around with the thresholds.
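
The conversion underneath is the standard Web Mercator ground resolution; a minimal sketch, assuming 256 px tiles:

import math

def meters_per_pixel(lat, zoom, tile_size=256):
    # Roughly 156543 m/px at zoom 0 on the equator, scaled by cos(latitude)
    # and halving with every zoom level
    return 2 * math.pi * 6378137 * math.cos(math.radians(lat)) / (tile_size * 2 ** zoom)

def meters_to_pixels(meters, lat, zoom):
    return meters / meters_per_pixel(lat, zoom)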

Ideally we’d also have a building handler (which right now would do the same as the parking lot handler). I just haven’t had the time to implement it properly, and for myself I can just quickly hack the code the way I need it.

Hope that helps.

RoboSat ❤️ Tanzania

Segmentation faults are tricky to debug: it could be anything from a bad installation to version mismatches to us not handling an edge case in your dataset.

As a first step I recommend using the pre-built Docker images. The official ones are currently not being built automatically - we’re working on fixing that.

In the meantime I just set up automated Docker image builds for my fork which I keep in sync with upstream for the time being. You can run them via

docker run -it --rm -v $PWD:/data --ipc=host danieljh/robosat:latest-cpu
docker run -it --rm -v $PWD:/data --ipc=host danieljh/robosat:latest-gpu

Note for folks coming across this in the future: check the official mapbox/robosat docker images and use them if they are again up to date instead of danieljh/robosat.

RoboSat ❤️ Tanzania

Multiple zoom levels work out of the box.

They get picked up automatically by the dataset loader if you put them all into the same directory. I would first try e.g. zoom level z together with z-1 and z+1. If you have a bigger difference in zoom levels, the visual features in the images will be vastly different, and it might make sense to build multiple models - one per zoom level - instead.

For the images and labels directories you will simply have multiple z sub-directories, as in:

images/
  19/x/y.png
  18/x/y.png
  17/x/y.png

Make sure the images and labels directories are in sync (for every image there is a label, and for every label there is an image), but otherwise that should be it.

I highly recommend training on GPUs; with CPUs you will have to wait a ridiculously long time. Also manually verify that your images and labels correspond to each other.

RoboSat ❤️ Tanzania

If possible, provide more building tile data; the building IoU is the Jaccard index

https://en.wikipedia.org/wiki/Jaccard_index

computed for the foreground class (buildings in your case) only.
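
For intuition, a minimal sketch computing that foreground-only IoU from two binary masks, assuming numpy arrays where 1 marks building pixels:

import numpy as np

def foreground_iou(prediction, label):
    # Intersection over union for the building class only; pixels where
    # both masks agree on background do not count towards the score
    intersection = np.logical_and(prediction == 1, label == 1).sum()
    union = np.logical_or(prediction == 1, label == 1).sum()
    return intersection / union if union else float("nan")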

  • Are you using 256x256 tiles or 512x512 tiles?
  • Are you using the Lovász loss in the model config?
  • How long do you train?

RoboSat ❤️ Tanzania

Also what I’m seeing just now:

colors = ['denim', 'denim']

This doesn’t look right: you don’t want to give the background class and the foreground class the same color, otherwise you will not be able to distinguish them visually. Use a distinct color for each class in the dataset config.

RoboSat ❤️ Tanzania

Yep, that looks pretty bad; you definitely need more negative samples. I’m wondering why you only get it for some tiles, though. Here’s a ticket for the all-background mask:

https://github.com/mapbox/robosat/issues/43

Hope this helps.

RoboSat ❤️ Tanzania

Great! Keep me posted how it goes! :) Always happy to hear feedback.

RoboSat ❤️ Tanzania

WebP or PNG does not matter; we can read all image formats supported by PIL:

https://pillow.readthedocs.io/en/5.3.x/handbook/image-file-formats.html

RoboSat ❤️ Tanzania

In your dataset

  • every image needs a corresponding mask
  • every mask needs a corresponding image

That is, for all z, x, y tiles you are interested in there have to be parallel files:

  • dataset/training/images/z/x/y.png
  • dataset/training/labels/z/x/y.png

The same applies to the validation dataset.
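
A minimal sketch for checking that parity, assuming the directory layout above:

from pathlib import Path

def tiles(directory):
    return {p.relative_to(directory) for p in Path(directory).glob("*/*/*.png")}

images = tiles("dataset/training/images")
labels = tiles("dataset/training/labels")

assert not images - labels, "images without labels: {}".format(images - labels)
assert not labels - images, "labels without images: {}".format(labels - images)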

Creating this dataset is on you and a bit out of scope here.