
RoboSat ❤️ Tanzania

Posted by daniel-j-h on 5 July 2018 in English.

Recently at Mapbox we open sourced RoboSat, our end-to-end pipeline for feature extraction from aerial and satellite imagery. In the following I will show you how to run the full RoboSat pipeline on your own imagery, using drone imagery from the OpenAerialMap project over Tanzania as an example.

Goal

For this step by step guide let’s extract buildings in the area around Dar es Salaam and Zanzibar. I encourage you to check out the amazing Zanzibar Mapping Initiative and OpenAerialMap for context around the drone imagery and where these projects are heading.

High-level steps

To extract buildings from drone imagery we need to run the RoboSat pipeline consisting of

  • data preparation: creating a dataset for training feature extraction models
  • training and modeling: segmentation models for feature extraction in images
  • post-processing: turning segmentation results into cleaned and simple geometries

I will first walk you through creating a dataset based on drone imagery available on OpenAerialMap and corresponding building masks bootstrapped from OpenStreetMap geometries. Then I will show you how to train the RoboSat segmentation model to spot buildings in new drone imagery. And in the last step I will show you how to use the trained model to predict simplified polygons for detected buildings not yet mapped in OpenStreetMap.

Data Preparation

The Zanzibar Mapping Initiative provides their drone imagery through OpenAerialMap.

Here is a map where you can manually navigate this imagery.

To train RoboSat's segmentation model we need a dataset consisting of Slippy Map tiles for drone images and corresponding building masks. You can think of these masks as binary images which are zero where there is no building and one for building areas.

Let’s give it a try for Dar es Salaam and Zanzibar, fetching a bounding box to start with.

Let's start with extracting building geometries from OpenStreetMap and figuring out where we need drone imagery for the training dataset. To do this we need to cut out the area we are interested in from OpenStreetMap.

Our friends over at Geofabrik provide convenient and up-to-date extracts we can work with. The osmium-tool then allows us to cut out the area we are interested in.

wget --limit-rate=1M http://download.geofabrik.de/africa/tanzania-latest.osm.pbf

osmium extract --bbox '38.9410400390625,-7.0545565715284955,39.70458984374999,-5.711646879515092' tanzania-latest.osm.pbf --output map.osm.pbf

Perfect, now we have a map.osm.pbf for Dar es Salaam and Zanzibar to extract building geometries from!

RoboSat comes with a tool rs extract to extract geometries from an OpenStreetMap base map.

rs extract --type building map.osm.pbf buildings.geojson

Now that we have a buildings.geojson with building geometries we need to generate all Slippy Map tiles which have buildings in them. For buildings zoom level 19 or 20 seems reasonable.

rs cover --zoom 20 buildings.geojson buildings.tiles

Based on the generated buildings.tiles file we then can

  • download drone imagery tiles from OpenAerialMap, and
  • rasterize the OpenStreetMap geometries into corresponding mask tiles

Here is a preview of what we want to generate and train the segmentation model on.

If you look closely you will notice the masks are not always perfect. Because we will train our model on thousands of images and masks, a slightly noisy dataset will still work fine.

The easiest way for us to create the drone image tiles is through the OpenAerialMap API. We can use its /meta endpoint to query all available drone images within a specific area.

http 'https://api.openaerialmap.org/meta?bbox=38.9410400390625,-7.0545565715284955,39.70458984374999,-5.711646879515092'

The response is a JSON array with metadata for all drone imagery within this bounding box. We can filter these responses with jq by their attributes, e.g. by acquisition date or by user name.

jq '.results[] | select(.user.name == "ZANZIBAR MAPPING INITIATIVE") | {user: .user.name, date: .acquisition_start, uuid: .uuid}'

Which will give us one JSON object per geo-referenced and stitched GeoTIFF image.

{
  "user": "ZANZIBAR MAPPING INITIATIVE",
  "date": "2017-06-07T00:00:00.000Z",
  "uuid": "https://oin-hotosm.s3.amazonaws.com/5ac7745591b5310010e0d49a/0/5ac7745591b5310010e0d49b.tif"
}
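If you want to keep those GeoTIFF URLs around for later, one way is to pipe the two commands together and write just the uuids to a file (a small sketch; the uuids.txt file name is my own choice here):

http 'https://api.openaerialmap.org/meta?bbox=38.9410400390625,-7.0545565715284955,39.70458984374999,-5.711646879515092' \
  | jq -r '.results[] | select(.user.name == "ZANZIBAR MAPPING INITIATIVE") | .uuid' \
  > uuids.txt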

Now we have two options

  • download the GeoTIFFs and cut out the tiles where there are buildings, or
  • query the OpenAerialMap API’s Slippy Map endpoint for the tiles directly

For the first option, we can tile the GeoTIFFs with a small tool on top of rasterio and rio-tiler. For the second option, we can download the tiles directly from the OpenAerialMap Slippy Map endpoints (changing the uuids).

rs download https://tiles.openaerialmap.org/5ac626e091b5310010e0d480/0/5ac626e091b5310010e0d481/{z}/{x}/{y}.png buildings.tiles

Note: OpenAerialMap provides multiple Slippy Map endpoints, one for every GeoTIFF.
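For the first option, a rough sketch for fetching the raw GeoTIFFs and tiling them could look like the following; the uuids.txt file is the hypothetical one from the jq sketch above, and the tiler.py invocation mirrors how commenters below use it, so check the script's --help first:

mkdir -p geotiffs images

# download each GeoTIFF listed in uuids.txt
while read -r url; do
  wget --directory-prefix geotiffs "$url"
done < uuids.txt

# cut Slippy Map tiles out of every GeoTIFF; gdal2tiles is an alternative
for tif in geotiffs/*.tif; do
  python3 tiler.py --zoom 20 "$tif" images
done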

In both cases the result is the same: a Slippy Map directory with drone image tiles of size 256x256 (by default; you can run the pipeline with 512x512 images for some efficiency gains, too).

To create the corresponding masks we can use the extracted building geometries and the list of tiles they cover to rasterize mask tiles.

rs rasterize --dataset dataset-building.toml --zoom 20 --size 256 buildings.geojson buildings.tiles masks

Before rasterizing we need to create a dataset-building.toml; have a look at the parking lot config RoboSat comes with and change the tile size to 256 and the classes to background and building (we only support binary models right now). Other configuration values are not needed right now and we will come back to them later.
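For reference, a minimal dataset config could look roughly like this. It is a hedged sketch only; key names and sections can differ between RoboSat versions, so treat the bundled parking lot config as the authoritative template:

# dataset-building.toml -- a rough sketch only

# Slippy Map directory holding the images/ and labels/ sub-directories
dataset = '/path/to/dataset'

# class names; binary models only for now
classes = ['background', 'building']

# colors used when visualizing masks; available names live in robosat/colors.py
colors  = ['denim', 'orange']

# the class distribution weights computed later by `rs weights` go in here, too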

With downloaded drone imagery and rasterized corresponding masks, our dataset is ready!

Training and modeling

The RoboSat segmentation model is a fully convolutional neural net which we will train on pairs of drone images and corresponding masks. To make sure these models can generalize to images never seen before we need to split our dataset into:

  • a training dataset on which we train the model
  • a validation dataset on which we calculate metrics after training
  • a hold-out evaluation dataset if you want to do hyper-parameter tuning

The recommended ratio is roughly 80/10/10 but feel free to change that slightly.

We can randomly shuffle our buildings.tiles, split it into three files according to our ratio, and use rs subset to split the Slippy Map directories.
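One way to do the shuffle and split with standard shell tools (a rough sketch; the training.tiles, validation.tiles, and evaluation.tiles file names are just my choice, matching the rs subset calls below):

shuf buildings.tiles > shuffled.tiles

total=$(wc -l < shuffled.tiles)
train=$(( total * 80 / 100 ))
val=$(( total * 10 / 100 ))

head -n "$train" shuffled.tiles > training.tiles
tail -n +"$(( train + 1 ))" shuffled.tiles | head -n "$val" > validation.tiles
tail -n +"$(( train + val + 1 ))" shuffled.tiles > evaluation.tiles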

rs subset images validation.tiles dataset/validation/images
rs subset masks validation.tiles dataset/validation/labels

Repeat for training and evaluation.

Before training the model we need to calculate the class distribution since background and building pixels are not evenly distributed in our images.

rs weights --dataset dataset-building.toml

Save the weights in the dataset configuration file, which training will then pick up. We can now adapt the model configuration file, e.g. enabling GPUs (CUDA) and then start training.

rs train --model model-unet.toml --dataset dataset-building.toml

For each epoch the training process saves the current model checkpoint and a history showing you the training and validation loss and metrics. We can then pick the best checkpoint by looking at the validation plots.

Using a saved checkpoint allows us to predict segmentation probabilities for every pixel in an image. These segmentation probabilities indicate how likely each pixel is background or building. We can then turn these probabilities into discrete segmentation masks.

rs predict --tile_size 256 --model model-unet.toml --dataset dataset-building.toml --checkpoint checkpoint-00038-of-00050.pth images segmentation-probabilities

rs masks segmentation-masks segmentation-probabilities

Note: both rs predict and rs masks transform Slippy Map directories and create .png files with a color palette attached for visual inspection.

These Slippy Map directories can be served via an HTTP server and then visualized directly as a raster layer in a map. We also provide an on-demand tile server with rs serve to do the segmentation on the fly; it is neither efficient nor does it handle post-processing (tile boundaries, de-noising, vectorization, simplification), so it should only be used for debugging purposes.
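For a quick look, any static file server works; for example, from inside the segmentation-masks directory (a minimal sketch, port and directory are arbitrary):

cd segmentation-masks
python3 -m http.server 5000
# the tiles are now available as a raster layer at http://localhost:5000/{z}/{x}/{y}.png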

If you manually check the predictions you will probably notice

  • the segmentation masks already look okay’ish for buildings
  • there are false positives where we predict buildings but there are none

The false positives are due to how we created the dataset: we bootstrapped it from tiles with buildings in them. Even though these tiles have some background pixels they won't contain enough background (so-called negative samples) to properly learn what is not a building. If we never show the model a single image of water it will have a hard time classifying it as background.

There are two ways for us to approach this problem:

  • add many randomly sampled background tiles to the training set, re-compute class distribution weights, then train again, or
  • use the model we trained on the bootstrapped dataset and predict on tiles where we know there are no buildings; if the model tells us there is a building put these tiles into the dataset with an all-background mask, then train again

The second option is called “hard-negative mining” and allows us to come up with negative images which contribute most to the model learning about background tiles. We recommend this approach if you want a small, clean, and solid dataset and care about short training time.

For hard-negative mining we can randomly sample tiles which are not in buildings.tiles and predict on them with our trained model. We can then use the rs compare tool to create side-by-side visualizations of these building-free images and the predictions on them.

rs compare visualizations images segmentation-masks

After making sure these are really background images and not just unmapped buildings in OpenStreetMap, we can put the negative samples into our dataset with a corresponding all-background mask. Then run rs weights again, update the dataset config, and re-train.

It is common to do a couple rounds of hard-negative mining and re-training, resulting in a solid and small dataset which helps the model most for learning.

Congratulations, you now have a solid model ready for prediction!

Here are the segmentation probabilities I got out after spending a few hours of hard negative mining.

It is interesting to see that the model is not entirely sure about building construction sites. It is on us to make an explicit decision when creating the dataset and when doing hard-negative mining: do we want to include building construction sites or not?

These edge-cases occur with all features and make up the boundaries of your feature’s visual appearance. Are house-boats still buildings? Are parking lots without parking aisle lane markings still parking lots? Make a call and be consistent.

Finally, the post-processing steps are responsible for turning the segmentation masks into simplified and vectorized GeoJSON features potentially spanning multiple tiles. We also provide tools for de-duplicating detections against OpenStreetMap to filter out already mapped features.

I won't go into post-processing details in this guide: the segmentation masks based on this small training dataset are still a bit too rough for it to work properly, the RoboSat post-processing is currently tuned to parking lots on zoom level 18, and I had to make some in-place adaptations when running it here.

Summary

In this step-by-step guide we walked through the RoboSat pipeline from creating a dataset, to training the segmentation model, to predicting buildings in drone imagery. All tools and datasets used in this guide are open source and openly available, respectively.

Give it a try! https://github.com/mapbox/robosat

I’m happy to hear your feedback, ideas and use-cases either here, on a ticket, or by mail.

Discussion

Comment from iandees on 5 July 2018 at 20:18

Great walkthrough! Thanks for writing it up.

Comment from PlaneMad on 6 July 2018 at 02:11

Can't believe it's so simple, thank you for the tutorial!

Comment from Tomas Straupis on 6 July 2018 at 03:23

Thank you for sharing! How many training images do you think are necessary to have a decently trained model? Are there any plans to release some trained models (f.e. buildings, roads) to the public?

Comment from daniel-j-h on 6 July 2018 at 07:58

The amount of images you will need for training can vary a lot and mostly depends on

  • the imagery quality, and if it’s from the same source or not
  • how good and precise the masks for training are
  • the zoom level
  • the variety in the images, and if it’s from the same area or totally different
  • the time and processing you can invest

For example, the more hard-negative iterations you do, the better the model can distinguish the background class. But hard-negative mining also takes quite a while. Same with the automatically created dataset: you can manually clean it up but it is quite time-intensive.

In addition you could do more data augmentation during training to further artificially embiggen the dataset; you could do test-time augmentation where you predict on the tile and its 90/180/270 degree rotations and then merge the predictions; you could train and predict on multiple zoom levels; and so on.

I would say it also depends on your use-case. For detecting building footprints like in this guide a couple of thousand images are fine to get the rough shapes. It's definitely not great for automatic mapping but that is not my intention in the first place.


Regarding trained models: I recently added an ONNX model exporter to RoboSat which allows for portable model files folks can use with their backend of choice. I could publish the trained ONNX model for this guide since I did it on my own time. The Mapbox models I am not allowed to publish as of writing this.

If there is community interest maybe we can come up with a publicly available model catalogue hosting ONNX models and metadata where folks can easily upload and download models?

Comment from Tomas Straupis on 6 July 2018 at 08:32

Thank you for the answers. I ran through 8000 training and 1000 validation images (no augmentation) and IoU is still below 0.8. Of course images are quite different: buildings in forest, in rural areas, industrial buildings, urban buildings. And I haven’t done any hard-negative training. Training this without GPU takes almost a week :-) So yes, it would be useful to have access to trained models at least to compare the results, for others it could be much easier to take a model, run predictions and add missing data to OSM without the need to train and tune the model.

Comment from daniel-j-h on 6 July 2018 at 10:19

IoU is still below 0.8

If you reach an IoU of 0.8 that’s pretty amazing to be honest. Here’s why. There are two sides contributing to the IoU metric: your predictions can be off but worse also the OpenStreetMap “ground truth” geometries can be off. Even with a perfect model you won’t reach an IoU of 1.0 since the OpenStreetMap geometries can - and often are - coarse, or have a slight misalignment, or are not yet mapped in OpenStreetMap etc.

Here's an interesting experiment: randomly sample from your dataset. Manually annotate the image tiles, generating fine-detailed masks. Now calculate the IoU metric on your manually generated fine-detailed masks and the automatically generated masks from OpenStreetMap. This will be the IoU upper bound you can reach.

Also see this graphic to get a feel for the IoU metric.

Training this without GPU takes almost a week :-)

Agree, without a GPU training will be slow. That said, I made some changes recently which should speed things up considerably:

  • https://github.com/mapbox/robosat/pull/65 - Simplifies the training setup; now you no longer have to manually tune learning rate and more importantly epochs at which to decay the learning rate. This should give you great training results without any tuning.

  • https://github.com/mapbox/robosat/pull/46 - We are using an encoder-decoder architecture. Before, we were training both the encoder and the decoder from scratch. This changeset brings in a pre-trained ResNet encoder, resulting in faster training times per epoch, fewer epochs needed, less memory needed, higher validation metrics, and faster prediction runtime.

If you want to give it a try with current master you should see improvements for your use-case.

Comment from imagico on 6 July 2018 at 10:50

What I find fascinating is that you seem to treat the image tiles completely independently - in other words: cut-off building parts at a tile edge are treated as if they were whole buildings.

I can see that this affects the algorithm because we see the discontinuities in the results but I wonder how ‘local’ the method is ultimately when you apply the trained algorithm. I mean if you move the tile edge a tiny bit (a few pixel) would the results change completely across the whole tile potentially or would such a move only affect the results near the edge of the tile and leave the rest unaffected?

Comment from daniel-j-h on 6 July 2018 at 13:59

We train on (tile, mask) pairs, that’s right.

But for prediction we buffer the tile on the fly (e.g. with 32px overlap on all sides), predict on the buffered tile which now captures the context from its eight adjacent tiles, and then crop out the probabilities for the original tile again. This results in smooth borders across tiles in the masks. The probabilities you are seeing above might still not match 100% at the borders, but that’s fine.

Here is the original tile and how the buffering during prediction affects it.

original tile in the middle, four corners from adjacent tiles added for buffering

fully buffered tile with context from its eight adjacent tiles

Without this buffering approach you would clearly see the tile borders, correct.

Comment from imagico on 6 July 2018 at 14:14

Ah, that makes much more sense now.

Comment from shakasom on 6 July 2018 at 22:56

Thanks. Can this be used in Jupyter Notebooks? Where are all of these (rs) commands run?

Comment from daniel-j-h on 7 July 2018 at 08:22

Not sure why you'd want to run the RoboSat toolchain in Jupyter Notebooks, but I guess there is nothing stopping you from doing that. You have to install the RoboSat tools and their dependencies; then you can use the robosat Python package. The rs commands are really just a small shell script expanding to python3 -m robosat.tools cmd args.
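For example, these two invocations should be equivalent (assuming the robosat package and its dependencies are installed):

./rs extract --type building map.osm.pbf buildings.geojson
python3 -m robosat.tools extract --type building map.osm.pbf buildings.geojson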

Comment from Tomas Straupis on 7 July 2018 at 13:20

Thank you. ResNet decreases training time by 1/3!

Comment from maning on 8 July 2018 at 04:40

Following this guide, I got the following results. Hard negative mining helped a lot.

compare

Comment from QSMLong on 12 July 2018 at 03:17

Thank you for your detailed introduction to the RoboSat pipeline, but there are still some confusing places for me. Could you please tell me if the online website which you used to fetch a bounding box is available now? And may I take it that the bounding box has nothing to do with the final process once your bounding box covers the study areas?

Comment from daniel-j-h on 12 July 2018 at 09:47

It doesn’t matter where you get the bounding box from; I used http://geojson.io since it’s convenient to use. And yes, it is currently working for me.

The bounding box is only used for cutting out a smaller base map from a larger .osm.pbf and downloading aerial images in that area. After that we work with data extracted from the base map and the aerial imagery as is.

Comment from NewSource on 13 July 2018 at 15:38

Thanks! Some steps are still confusing me when I try to use my own drone imagery. I tried to prepare tiles from my own GeoTIFF (.tif) that covers the whole area I want to work on with tiler.py; as a result I got the right Slippy Map directory structure, but it contains garbage images. What did I do wrong?

Comment from daniel-j-h on 16 July 2018 at 09:40

I can’t debug statements like “contains garbage images”. What does your GeoTIFF look like? Do you have a small self-contained example where I can reproduce this issue? What do the Slippy Map tiles look like? You can give gdal2tiles a try; the tiler script was just a quick way for me to tile my GeoTIFFs.

Comment from QSMLong on 18 July 2018 at 01:30

Sorry to trouble you again. Similarly, I tried to prepare tiles from a TIFF image downloaded from OpenAerialMap. The tiling speed was about 60 tiles/s, and it goes down after running for several hours, so it would take a lot of time to tile my study area. I'd like to know your tiling speed, and whether there is any library which could accelerate the tiling or if I must write my own code.

Comment from daniel-j-h on 18 July 2018 at 08:36

Agree, for larger tiling jobs I recommend using proper tools like gdal2tiles.

Comment from QSMLong on 20 July 2018 at 02:16

When I tried to create the drone image tiles through the OpenAerialMap API, robosat.tools download just returned many "Warning: Tile(x=857247, y=430855, z=20) failed, skipping" messages. After I debugged the program, it showed {"message": "Not Authorized - Invalid Token"}. But I did copy the "tms" property from the http 'https://api.openaerialmap.org/meta?bbox=114.159102840124,30.3833619895642,114.637039919757,30.6950619601414' request, and I chose

{
    "__v": 0,
    "_id": "59e62c273d6412ef7220d589",
    "acquisition_end": "2015-08-23T00:00:00.000Z",
    "acquisition_start": "2015-08-22T00:00:00.000Z",
    "bbox": [112.64687500000001, 29.229752777777776, 115.05728611111111, 31.365344444444446],
    "contact": "hello@astrodigital.com",
    "file_size": 174569977,
    "footprint": "POLYGON((112.64687500000001 31.365344444444446,115.05728611111111 31.365344444444446,115.05728611111111 29.229752777777776,112.64687500000001 29.229752777777776,112.64687500000001 31.365344444444446))",
    "geojson": {
        "bbox": [112.64687500000001, 29.229752777777776, 115.05728611111111, 31.365344444444446],
        "coordinates": [[
            [112.64687500000001, 31.365344444444446],
            [115.05728611111111, 31.365344444444446],
            [115.05728611111111, 29.229752777777776],
            [112.64687500000001, 29.229752777777776],
            [112.64687500000001, 31.365344444444446]
        ]],
        "type": "Polygon"
    },
    "gsd": 35.64656047021406,
    "meta_uri": "http://oin-astrodigital.s3.amazonaws.com/LC81230392015234LGN00_bands_432.TIF_meta.json",
    "platform": "satellite",
    "projection": "PROJCS[\"WGS84/Pseudo-Mercator\",GEOGCS[\"WGS84\",DATUM[\"WGS_1984\",SPHEROID[\"WGS84\",6378137,298.257223563,AUTHORITY[\"EPSG\",\"7030\"]],AUTHORITY[\"EPSG\",\"6326\"]],PRIMEM[\"Greenwich\",0],UNIT[\"degree\",0.0174532925199433],AUTHORITY[\"EPSG\",\"4326\"]],PROJECTION[\"Mercator_1SP\"],PARAMETER[\"central_meridian\",0],PARAMETER[\"scale_factor\",1],PARAMETER[\"false_easting\",0],PARAMETER[\"false_northing\",0],UNIT[\"metre\",1,AUTHORITY[\"EPSG\",\"9001\"]],EXTENSION[\"PROJ4\",\"+proj=merc+a=6378137+b=6378137+lat_ts=0.0+lon_0=0.0+x_0=0.0+y_0=0+k=1.0+units=m+nadgrids=@null+wktext+no_defs\"],AUTHORITY[\"EPSG\",\"3857\"]]",
    "properties": {
        "sensor": "Landsat 8",
        "thumbnail": "http://earthexplorer.usgs.gov/browse/landsat_8/2015/123/039/LC81230392015234LGN00.jpg",
        "tms": "http://a.tiles.mapbox.com/v4/astrodigital.LC81230392015234LGN00_bands_432/{z}/{x}/{y}.png?access_token=pk.eyJ1IjoiYXN0cm9kaWdpdGFsIiwiYSI6ImNVb1B0ZkEifQ.IrJoULY2VMSBNFqHLrFYew"
    },
    "provider": "Astro Digital",
    "title": "LC81230392015234LGN00_bands_432.TIF",
    "uuid": "http://oin-astrodigital.s3.amazonaws.com/LC81230392015234LGN00_bands_432.TIF"
}

to download the tiles. Similarly, I tried to repeat your work from the diary and sadly I didn't get the tiles either. Could this be a problem with my network, or something else?

Comment from maning on 20 July 2018 at 02:25

http://oin-astrodigital.s3.amazonaws.com/LC81230392015234LGN00_bands_432.TIF

It looks like you are downloading landsat imagery, is this what you want to download? Depending on the zoom level you download, this is expected to be low resolution (28m pixel resolution).

Comment from QSMLong on 20 July 2018 at 02:38

Yes, it's what I want now. At present, I just want to repeat the experiment. (By the way, are there any easy-to-use high resolution remote-sensing images?)

Comment from daniel-j-h on 20 July 2018 at 11:45

You will get the warning when the downloader was not able to download tiles from the list of tile coordinates you give it. This is probably due to the tile endpoint not providing imagery for all your tile ids.

https://wiki.openstreetmap.org/wiki/Aerial_imagery is a good place to start when looking for imagery sources.

Comment from NewSource on 24 July 2018 at 17:34

Need some help with results visualization

I try to execute rs serve with parameters --tile_size 256 --model config/model-unet.toml --dataset config/dataset-parking.toml --checkpoint /pth/checkpoint-00003-of-00005.pth --url https://api.mapbox.com/v4/mapbox.satellite/{z}/{x}/{y}

Comment from daniel-j-h on 25 July 2018 at 10:12

Your --url argument is a URL with an access token, right? Like

--url https://api.mapbox.com/v4/mapbox.satellite/{z}/{x}/{y}@2x.webp?access_token=TOKEN

The serve tool is currently fixed at zoom level 18 since we only used it for initial debugging. If you are not using z18 you have to change it here:

https://github.com/mapbox/robosat/blob/6c7f547d2dd303f8a68b4fbd2ba60583348d7816/robosat/tools/serve.py#L53-L55

You will definitely see many 404s since the serve tool tileserver will only respond with e.g. z18 tiles but the map will request e.g. z16, z17, z18, z19, all at the same time.

In addition you probably want to adapt the map template’s initial location, zoom levels, and so on:

https://github.com/mapbox/robosat/blob/6c7f547d2dd303f8a68b4fbd2ba60583348d7816/robosat/tools/templates/map.html#L46-L64

I recommend using the rs predict tool for proper batch-prediction. The rs serve tool is really just a simple debugging tile-server and needs a bit of getting into the code.

Comment from NewSource on 25 July 2018 at 11:41

Thanks Daniel for your quick response !

  • I use the same URL argument as you mentioned; the only difference is that I configured the env variable MAPBOX_ACCESS_TOKEN=my-public-api-key because of the error "Error: map token needed visualizing results; export MAPBOX_ACCESS_TOKEN". My full command looks like: ./rs serve --tile_size 256 --model config/model-unet.toml --dataset config/dataset-buildings-hamburg.toml --checkpoint pth/checkpoint-00003-of-00005.pth --url https://api.mapbox.com/v4/mapbox.satellite/{z}/{x}y}.@2x.webp?access_token=MAPBOX_ACCESS_TOKEN

  • I adjusted also minZoom: 16, maxZoom: 18 in both cases here: https://github.com/mapbox/robosat/blob/6c7f547d2dd303f8a68b4fbd2ba60583348d7816/robosat/tools/templates/map.html#L46-L64

  • Could you please say more about the map template's initial location? In case I work with Hamburg city data, should I change any default parameter?

  • Unfortunately my main problem still exists: I get one image on both sides of a screen divided by a vertical line, with no predictions on the right side (masks). Link to screen capture: https://ibb.co/eRd3zT

Comment from daniel-j-h on 26 July 2018 at 09:49

You will need two URLs with tokens. One for the compare map to load styles and one for the tile server to fetch satellite or aerial imagery from to run segmentation on it.

You will need to adapt the center locations for the compare map (for both the before and after map) here. The after map is then adding a raster layer automatically requesting tiles from the rs serve tileserver here.
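Put together, that could look roughly like this; a sketch only, where the token is a placeholder, the checkpoint path is an example, and reading the token from the environment for the map template is based on the comment above:

# token the map template reads from the environment (as reported above)
export MAPBOX_ACCESS_TOKEN=pk.your-public-token-here

# token embedded in the imagery URL the tile server fetches and segments
./rs serve --tile_size 256 --model config/model-unet.toml --dataset config/dataset-building.toml \
  --checkpoint checkpoint.pth \
  --url "https://api.mapbox.com/v4/mapbox.satellite/{z}/{x}/{y}@2x.webp?access_token=${MAPBOX_ACCESS_TOKEN}"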

Again, I recommend using the rs predict batch prediction tool.

Comment from mgd82 on 12 August 2018 at 14:43

First off - thanks so much this is awesome.

Second - I'm trying to just replicate the work shown in this post and am stuck at the rs download https://tiles.openaerialmap.org/5ac626e091b5310010e0d480/0/5ac626e091b5310010e0d481/{z}/{x}/{y}.png building.tiles step. I've run this command, which results in a ton of Failed, Skipped warnings. In a previous comment you said there would be some failures, but when I check the resulting folder it is less than a MB in size and seems to only contain empty folders.

I did not execute the steps prior http 'https://api.openaerialmap.org/meta?bbox=38.9410400390625,-7.0545565715284955,39.70458984374999,-5.711646879515092'

jq '.results[] | select(.user.name == "ZANZIBAR MAPPING INITIATIVE") | {user: .user.name, date: .acquisition_start, uuid: .uuid}' as the post seems to suggest that is an alternative way? But I’m not clear on that.

Also not sure what steps i might have missed in regard to this:

*We can tile the GeoTIFFs with a small tool on top of rasterio and rio-tiler. Or for the second option we can download the tiles directly from the OpenAerialMap Slippy Map endpoints (changing the uuids).*

Would appreciate any clarification you could provide - I’m a newbie at some of this so apologies if I’m just missing something that should be very straightforward.

Comment from daniel-j-h on 22 August 2018 at 17:11

The problem with using rs download against the OpenAerialMap endpoint is they provide a different endpoint for each underlying GeoTIFF. This means you need to rs download from many endpoints and you will see lots of errors.

The alternative is to download the raw GeoTIFFs to your machine and run either gdal2tiles or the small tiler tool I wrote here to cut out tiles from the GeoTIFFs.

You will find the GeoTIFF URLs in the OpenAerialMap API response (see the description above using the /meta endpoint and the jq transformations).

Hope that helps

Comment from c0ldfashioned on 23 August 2018 at 00:18

I’m having a devil of a time installing osmium-tool, because I’m having a devil of a time installing protozero. I’m still somewhat of a noob when it comes to ubuntu and installing packages, but I’ve made it this far. I really want to try to follow this tutorial, and move on to working on a project involving Flint, Michigan & parking lots. Can you recommend a course of action for installing protozero? Do I really need osmium-tool to use robosat? I feel like I’m thiiiissssss close to getting this project off the ground, but keep getting stumped by dependencies…

Comment from daniel-j-h on 23 August 2018 at 07:25

I think osmium-tool comes with documentation for how to build it. For protozero and libosmium, I would just compile them from source, too, and not rely on the package manager. If you are running into compilation issues I would open a ticket with these libraries.

You don’t strictly need osmium-tool I just used it because I had it around. You can use any tool to cut out a smaller area out of a larger pbf base map.

Comment from c0ldfashioned on 23 August 2018 at 14:15

Thanks for the quick response! In the short term, I’ll try another tool to see if I can get off the ground. After that, I may try to come back and get osmium-tool to work. I like the neat command line aspect of it. Cheers, Charles

Comment from c0ldfashioned on 28 August 2018 at 23:44

Well, I was about to clip the map using osmconvert, then ran into the issue with robosat where you have to uninstall and reinstall osmium to work with python 3.6. Ran rs extract again and got 38-40 Warning: Invalid feature: links (which seem to open maps with building polygons on them). After that, it bombed out with a traceback ending with osmium._osmium.InvalidLocationError: Invalid location. I’m bummed. I made a leap forward this weekend (tough finding time with an 18-month-old in the house), but hit another wall. Any help you can offer would be appreciated. I’m sure part of my difficulties lie in not being able to devote a lot of time to learning all the tools. I started working on this back when this blog post first came out.

Comment from daniel-j-h on 29 August 2018 at 09:01

If you don’t want to install robosat on your own, we provide both CPU as well as GPU images you can docker-run directly: https://hub.docker.com/r/mapbox/robosat/

Regarding your problem cutting out an area from an OSM base map: it sounds like either the cutting is creating invalid polygons or there are invalid polygons in the base map. In addition it sounds like cutting out an area does not properly keep the way locations; that's why you are running into the location problem.

osmium-tool’s extract comes with sane default strategies to prevent all of these issues.

What you can try if you are having a hard time installing osmium-tool is to use the mason package manager to get a pre-built binary:

mkdir ./mason
curl -sSfL https://github.com/mapbox/mason/archive/v0.17.0.tar.gz \
    | tar --gunzip --extract --strip-components=1 --exclude="*md" --exclude="test*" --directory=./mason

./mason/mason install osmium-tool 1.7.1
export PATH="$(./mason/mason prefix osmium-tool 1.7.1)/bin:$PATH"

osmium extract \
  --bbox -84.6673,33.5219,-84.0427,34.0829 \
  -o atlanta.osm.pbf \
  georgia-latest.osm.pbf

Comment from c0ldfashioned on 27 September 2018 at 00:27

Hi, I’m back. I think I sorted out my issues with osmium and all of that, but now I’m going back and I’m stuck at the following section (is it even necessary)?

jq '.results[] | select(.user.name == "ZANZIBAR MAPPING INITIATIVE") | {user: .user.name, date: .acquisition_start, uuid: .uuid}'

When I enter that command after entering the previous http command (using httpie), my terminal just hangs and doesn’t appear to be doing anything. Is this normal behavior? i.e. is it actually working, but the output takes a really long time? What am I missing here?

Comment from daniel-j-h on 2 October 2018 at 16:25

You need to run the jq command either on a json file or pipe the output from the http query directly into it. Please see the jq documentation and quick start.

Comment from c0ldfashioned on 2 October 2018 at 17:07

Ah - thank you. I had a feeling that was where the disconnect was. So, in your example, .results (or .results[]) is the file you created with the JSON output from http? If piping directly into jq, do I use .results[] or simply .[]?

Comment from daniel-j-h on 2 October 2018 at 17:24

.results is the jq filter already. You can pipe the result of the http query into a file and then do jq '..' file or do http .. | jq '..'.
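Concretely, using the bbox from the post above, either of these should work (assuming httpie and jq are installed; meta.json is just an example file name):

http 'https://api.openaerialmap.org/meta?bbox=38.9410400390625,-7.0545565715284955,39.70458984374999,-5.711646879515092' > meta.json
jq '.results[]' meta.json

# or pipe the response straight into jq
http 'https://api.openaerialmap.org/meta?bbox=38.9410400390625,-7.0545565715284955,39.70458984374999,-5.711646879515092' | jq '.results[]'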

Check https://stedolan.github.io/jq/manual/

Comment from tonytonyissaissa on 23 October 2018 at 16:05

Hi Daniel, I am trying to reproduce your excellent tutorial. I arrived at the step where I have to tile the GeoTIFFs. I decided to use your tiler.py in the following way: I collected the GeoTIFF URLs in a file called 'raw_geotif.txt' (83 lines like this): https://oin-hotosm.s3.amazonaws.com/5ac7745591b5310010e0d49a/0/5ac7745591b5310010e0d49b.tif

Then I invoked this script:

#!/bin/sh
for i in $(cat raw_geotif.txt); do python3 tiler.py --zoom 20 $i slippyMap_dir/; done

In the directory ‘slippyMap_dir’ I obtained tiles.

After that I obtained the masks using the command:

rs rasterize --dataset dataset-building.toml --zoom 20 --size 256 buildings.geojson buildings.tiles masks

Here is my question:

Isn't it strange to have such a big difference in the number of tiles between 'slippyMap_dir' and 'masks'? Indeed, the number of tiles in 'slippyMap_dir' is too small compared to 'masks'. The latter contains exactly the number of tiles listed in buildings.tiles (in my case 430011 tiles), while 'slippyMap_dir' contains 127448 tiles.

What could be the problem? It seems that tiler.py does not produce all the tiles listed in buildings.tiles.

I appreciate any help.

Thanks in advance

Comment from daniel-j-h on 23 October 2018 at 17:26

Maybe visualize where you have GeoTIFF image tiles and where you have mask tiles. It could be that the GeoTIFFs just don’t cover all of the areas you extracted masks for.

Otherwise try to reproduce this with a small GeoTIFF and/or a smaller area.

And maybe try the gdal2tiles approach and see if the output is different.

You need to debug this a bit - could be multiple problems.

Comment from AHMAD FAIRUZ ALI on 5 November 2018 at 23:47

Hi Daniel,

Can you explain how to perform the 80:10:10 training split? Do I have to make changes to the building.tiles or validation.tiles files?

Thanks in advance.

Comment from AHMAD FAIRUZ ALI on 6 November 2018 at 02:06

Hi Daniel,

I already got the answer on how to split building.tiles. Basically, we need to split the building.tiles file into 3 (training, validation, evaluation) using any Linux tools, for example sed or awk in a bash script.

thanks.

Comment from AHMAD FAIRUZ ALI on 6 November 2018 at 02:07

Hi Daniel,

I have another question. I got an error while running 'rs train'. Below is the error.

File "/robosat-master/robosat/datasets.py", line 58, in __init__
    assert len(self.target) == len(self.inputs[0]), "same number of tiles in images and label"
AssertionError: same number of tiles in images and label

What does it tell?

Thanks,

Comment from AHMAD FAIRUZ ALI on 7 November 2018 at 01:38

Hi Jesus Benavent,

The file dataset-building.toml is already there under directory,

robosat-master/config/dataset-building.toml

You can edit the file accordingly.

Fairuz,

Comment from AHMAD FAIRUZ ALI on 8 November 2018 at 01:09

Sure Jesus.

I believe the problem is that the images and labels are not in sync. Try running the following command.

$ find project/zanzibar/dataset/ -type f | cut -d/ -f-5 | uniq -c

where you change 'project/zanzibar/dataset/' to the path where your 'dataset' folder is located.

Mine returns the following.

      1 project/zanzibar/dataset/evaluation/images
  43080 project/zanzibar/dataset/evaluation/labels
      6 project/zanzibar/dataset/training/images
 344584 project/zanzibar/dataset/training/labels
      1 project/zanzibar/dataset/validation/images
  43074 project/zanzibar/dataset/validation/labels

Obviously, my images and labels are not in sync. The problem I found is that I have the correct building.tiles which I want to focus on, but I couldn't download or find the correct GeoTIFFs for the respective building.tiles (I believe so). I wanted to cover the area of Putrajaya, Malaysia.

So, I’m stuck doing this for the last 3 days. I’m clueless.

Thank you, Fairuz,

Comment from daniel-j-h on 8 November 2018 at 10:35

In your dataset

  • every image needs a corresponding mask
  • every mask needs a corresponding image

That is, for all z, x, y tiles you are interested in there have to be parallel files:

  • dataset/training/images/z/x/y.png
  • dataset/training/labels/z/x/y.png

The same applies to the validation dataset.

Creating this dataset is on you and a bit out of scope here.

Comment from daniel-j-h on 8 November 2018 at 11:06

WebP or PNG does not matter. We can read all image formats supported by PIL

https://pillow.readthedocs.io/en/5.3.x/handbook/image-file-formats.html

Comment from daniel-j-h on 8 November 2018 at 12:44

Great! Keep me posted how it goes! :) Always happy to hear feedback.

Comment from tonytonyissaissa on 23 November 2018 at 12:33

Hi Daniel, I have a problem producing an all-background mask for the hard negative mining step. What I did is to use rs like this: ./rs rasterize --dataset config/dataset-neg-building.toml (etc) … with the file dataset-neg-building.toml containing:

# Human representation for classes.
classes = ['background', 'background']

# Color map for visualization and representing classes in masks.
# Note: available colors can be found in robosat/colors.py
colors = ['denim', 'denim']

But the problem is that the model still produces many false positives. I think that my way of producing the all-background masks is wrong. Do you have some clues on how to solve this issue?

Thanks in advance, Tony

Comment from tonytonyissaissa on 23 November 2018 at 13:25

I uploaded a result of my prediction (image link):

You can see a big area of background predicted as building

Comment from daniel-j-h on 23 November 2018 at 15:05

Yeap that looks pretty bad; you definitely need more negative samples. I’m wondering why you only get it for some tiles, though? Here’s a ticket for the all-background mask:

https://github.com/mapbox/robosat/issues/43

Hope this helps.

Comment from daniel-j-h on 23 November 2018 at 15:06

Also what I’m seeing just now:

colors = ['denim', 'denim']

This doesn’t look right. You don’t want to give the background class and the foreground class the same color. Otherwise you will not be able to distinguish them visually.

Comment from tonytonyissaissa on 23 November 2018 at 16:51

Hi again, I solved the problem. I had mistakenly used a wrong ".geojson" file as the argument to "./rs rasterize". After I used the correct file, I also used the .toml file correctly (with two classes: background and building). Now the results are very good.

Comment from tonytonyissaissa on 23 November 2018 at 18:19

Also one question: I obtained a big difference between mIoU (=0.723) and building IoU (=0.476). Is this normal? In my final training set I used 9765 building tiles and 26751 non-building tiles, that is, roughly 2.5 times as many non-building tiles as building tiles. The distribution of weights I got is [1.463278, 17.328699]. The final result looks like this (image link).

Comment from daniel-j-h on 27 November 2018 at 19:00

If possible provide more building tiles data; the building IoU is

https://en.wikipedia.org/wiki/Jaccard_index

for the foreground class (buildings in your case) only.

  • Are you using 256x256 tiles or 512x512 tiles?
  • Are you using the Lovasz loss in the model config?
  • How long do you train?

Comment from tonytonyissaissa on 28 November 2018 at 16:12

I used 256x256 tiles and the Lovasz loss. I trained the model for 50 epochs. The maximum that I got is at epoch 34: validation mIoU 0.731, building IoU 0.493. A prediction example is below (image link).

Comment from Arjun pmm on 17 January 2019 at 02:23

At the rs download step, I get the following output:

Warning: Tile(x=638695, y=544460, z=20) failed, skipping
Warning: Tile(x=638594, y=544009, z=20) failed, skipping

I followed the exact steps as mentioned in the tutorial. I saved the results obtained from openaerialmap.org into a separate json file and now I’m iterating the json file to extract the uuids and run the rs download command.

I even tried the alternative method, i.e. downloading the GeoTIFF images manually using the URLs obtained from openaerialmap.org and using the tiler.py script to tile the images, but the images and masks do not match if I follow these steps.

Anyone who has followed this tutorial and got the data preparation phase working, can you please help?

Comment from 0x134146f on 2 April 2019 at 19:42

Arjun pmm, It seems I’ve got the same problem. I also tried rs download and tiler.py on geotiff.

$ cat buildings.tiles | wc -l
435917

$ find masks/20 -type f | wc -l
435917

$ find images/20 -type f | wc -l
469556
; images directory is the result of the tiler.py script

$ comm -12 <(cd masks ; find 20/ -type f | cut -f 1 -d '.' | sort ) <(cd images ; find 20/ -type f | cut -f 1 -d '.' | sort) | wc -l
5510

This means that only 5510 images have a corresponding mask, and only 5510 masks have a corresponding image.

How many GeoTIFF files did you download from openaerialmap.org (https://oin-hotosm.s3.amazonaws.com/....tif)?

Comment from Adrien_L on 1 May 2019 at 17:46

Hi Daniel,

Thank you for sharing this, kudos for the great instructions. I had a question regarding training with a different zoom level. I've been using a QGIS plugin called QTiles to create my tiles on in-house imagery. I have not been successful getting the correct rasterio version to use your tiler.py, unfortunately.

I've been successful running trainings and doing predictions at zoom level 20 on my CPU, however I have been running into issues at zoom level 21. In a nutshell, the MCC for the training and validation epochs goes to infinity ("Inf"). If I understand correctly, √((TP + FP)(TP + FN)(TN + FP)(TN + FN)) must be equal to zero. I believe the issue lies in the zoom level but I am not 100% sure how to interpret the lack of convergence. I've changed different parameters such as the way I create the slippy tiles and the way the "y" folders are split between training/validation/evaluation (randomized vs. a linear cut at the end), and the zoom level is what I suspect is causing issues. Initially I suspected blank tiles but now I am not so sure. I was wondering what your experience was with different zoom levels (I read the Manila diary, in which he successfully runs it on zoom_level = 21).

Do you know what could be causing the lack of convergence for my MCC? I killed the training runs after a couple of epochs; could the lack of convergence be due to my lack of patience, and might further epochs converge?

Thanks for all the work on this!

Comment from daniel-j-h on 11 May 2019 at 11:48

Multiple zoom levels work out of the box.

They get picked up automatically in the dataset loader if you put them all into the same directory. I would try e.g. with zoom level z and then z-1 and z+1 first. If you have a bigger difference in zoom levels your visual features in the images will be vastly different and it might make sense to build multiple models - one per zoom level - instead.

For the images and labels directory you will simply have multiple z sub-directories, as in:

images/
  19/x/y.png
  18/x/y.png
  17/x/y.png

Make sure the images and labels directories are in sync (for every image there is a label, and for every label there is an image) but otherwise that should be it.

I highly recommend training on GPUs. With CPUs you will have to wait a ridiculously long time. Also manually verify your images and labels correspond to each other.

Comment from Adrien_L on 13 May 2019 at 16:10

Hi Daniel,

Thank you for the response! I am going to try multiple zoom levels to see if I can get that to work. My initial concern was about running multiple different zoom levels: zoom_level 21 (on its own) wasn't training (MCC going to Inf) but zoom_level 20 (on its own) worked well. It is also possible the resolution of my imagery simply doesn't reach the equivalent resolution for z21; I couldn't make out why z21 wasn't working while giving me similar results during training, and whether there was a possible explanation.

I agree that training on the CPU is time consuming, I will try to get my hand on a GPU in the future. Thanks again for sharing this awesome project!

Comment from Orenshef on 29 May 2019 at 12:28

Hi Daniel, amazing job and thank you for this article. I am currently trying to apply this code to my own pictures. The steps I did are as follows: 1. tile my GeoTIFF using gdal2tiles 2. obtain a geojson file 3. use "rs cover" to create buildings.tiles (this was a mess, I had to amend my geojson a lot for that to work) 4. run "rs rasterize"

I am stuck here. I am receiving "Segmentation fault (core dumped)" from the command: ./rs rasterize --dataset config/dataset-building.toml --zoom 20 --size 256 file.geojson buildings.tiles masks

Comment from Orenshef on 29 May 2019 at 12:49

Update: this comes from the rs method. The "cover" stage I have done manually, using a script adjusted to my geojson, not using rs. So rs gives me a segmentation fault. Any idea why?

Comment from daniel-j-h on 29 May 2019 at 18:12

Segmentation faults are tricky to debug: could be anything from a bad installation to version mismatches to us not handling an edge case in your dataset.

As a first step I recommend using the pre-built Docker binaries. The official ones are currently not getting built automatically - we’re on it to fix it.

In the meantime I just set up automated Docker image builds for my fork which I keep in sync with upstream for the time being. You can run them via

docker run -it --rm -v $PWD:/data --ipc=host danieljh/robosat:latest-cpu
docker run -it --rm -v $PWD:/data --ipc=host danieljh/robosat:latest-gpu

Note for folks coming across this in the future: check the official mapbox/robosat docker images and use them if they are again up to date instead of danieljh/robosat.

Comment from Orenshef on 3 June 2019 at 08:53

Thank you, this helped a lot. I am now trying to manage the post-processing stage. I am doing this over buildings (my own dataset) and in zoom 20. Is it still true that the post-processing stages are adjusted only to z=20 and for the parking data? Are the adjustments needed something I could manage to do by working out the (features, merge, dedupe) code or is it not recommended to dive into it at this point?

Comment from daniel-j-h on 3 June 2019 at 18:53

Hey I just published a new release v1.2.0 - read about it here. The official docker images work now again, too. Here are the docs.

For zoom levels there is an open pull request: https://github.com/mapbox/robosat/pull/88

It should Just Work (tm) but I haven’t had the time to test it more thoroughly.

The problem there is we use some pixel-based thresholds and heuristics, and depending on your zoom level they will (slightly) change. The pull request implements these thresholds based on meters and no longer based on pixels. You can check out the code and help me test it by running it on your dataset, checking if results look reasonable, and playing around with the thresholds.

Ideally we’d have also a building handler (which right now would do the same as the parking lot handler). I just haven’t had the time to implement it properly and myself I can just quickly hack the code the way I need it.

Hope that helps.

Comment from kayoD on 10 June 2019 at 10:20

Hello Daniel, thanks for the great work. I have a problem reproducing what has been done here. I am able to use the rs serve command as it gives me a blank webpage with a 202 HTTP status. This is how I use the command.

 
  
rs serve --model /app/container_mount/model-unet.toml \
  --dataset /app/container_mount/dataset-building.toml \
  --checkpoint /app/container_mount/checkpoints/checkpoint-00023-of-0025.pth \
  --tile_size 256 --host 0.0.0.0 \
  --url https://api.mapbox.com/v4/mapbox.satellite/{z}/{x}/{y}@2x.webp?access_token=MyToken



Comment from kayoD on 10 June 2019 at 10:25

Hello Daniel, thanks for the great work. I have a problem reproducing what has been done here. I am not able to use the rs serve command as it gives me a blank webpage with a 202 HTTP status. This is how I use the command.

 
  
rs serve --model /app/container_mount/model-unet.toml \
  --dataset /app/container_mount/dataset-building.toml \
  --checkpoint /app/container_mount/checkpoints/checkpoint-00023-of-0025.pth \
  --tile_size 256 --host 0.0.0.0 \
  --url https://api.mapbox.com/v4/mapbox.satellite/{z}/{x}/{y}@2x.webp?access_token=MyToken


Note: I have used the rs predict command but I also want to use the rs serve as well. Sorry for the repetition :(

Comment from daniel-j-h on 10 June 2019 at 18:45

Hey the rs serve tool is mainly for debugging and quick development iteration cycles. You really should use the production ready rs predict tool for efficient batch prediction.

Then there are some limitations in rs serve

  • the zoom level is currently hard-coded here

  • it’s single threaded only and does not do batch prediction (inefficient at scale)

  • it does not handle tile borders like we do in rs predict so you might see artifacts at borders

  • even though the rs serve command takes host and port arguments, the map we serve to the browser right now assumes localhost:5000 for requesting tiles

You have to go in and adapt these manually right now. I’m also happy for pull requests and can help you along if you want to properly fix these issues and make rs serve more robust and user-friendly.

Hope that helps, Daniel

Comment from daniel-j-h on 10 June 2019 at 18:47

FYI there are also more recent diary posts from the last two weeks, for folks following along this old RoboSat on Tanzania drone imagery diary post.

Comment from Mehran66 on 13 June 2019 at 17:09

When I run the extract using docker, 16 geojson files with names like "buildings-4ef7bde13d8e4528b48c577a2d269e1c.geojson" are generated. I am running on the dataset used in this post. It seems that the area is divided into smaller areas. Each geojson has a size of about 20 MB.

Comment from daniel-j-h on 13 June 2019 at 18:55

Robosat v1.2 comes with batched extraction and batched rasterization. Before we had to keep all the features and images in memory during extraction and rasterization which used up quite a lot of memory on larger datasets. We now support batched extraction and batched rasterization flushing batches to disk every now and then. The batches are somewhat arbitrary and not based on e.g. smaller areas.

Batched extraction

Batched rasterization

And check the v1.2 release notes https://github.com/mapbox/robosat/releases/tag/v1.2.0

Comment from Mehran66 on 17 June 2019 at 23:58

I generate the masks using the following command: rs rasterize --dataset dataset-building.toml --zoom 20 --size 256 buildings.geojson buildings.tiles masks

and I get this:

18529/18529 [00:13<00:00, 1349.35feature/s]
100% 10645/10645 [02:00<00:00, 88.30tile/s]

A masks folder is generated with 10645 images, which is equal to the number of rows in buildings.tiles. My tiles folder includes 3,080 images. My original image only covers a subset of the buildings in the GeoJSON file. All of the images in the masks folder look fine, with pixels classified into building and background. The problem is that I cannot find any of the tile images in the masks folder. For example I have:

C:\Users\Mehran\Desktop\test\tiles\20\606678\478661.png

But, I cannot find the corresponding image in the mask folder.

Here is the only image in the masks\20\600678 folder: C:\Users\Mehran\Desktop\test\masks\20\600678\570023.png

My other question is: what are the numbers in the buildings.tiles file?

606705,569916,20
606526,569974,20
606514,569911,20
600314,569998,20

Does the projection and coordinate system of the files matter? My tile images are EPSG:3857 (I used gdal2tiles to make the tiles) and my GeoJSON is EPSG:4326.

Comment from Mehran66 on 18 June 2019 at 14:11

I clipped the GeoJSON to only include the polygons overlapping with the imagery. Now the folders in the tiles and masks directories are identical, but the numbers assigned to the tile images are different from the numbers assigned to the mask images. For example, 478657 and 569904 are two corresponding images. Also, 1967 mask images are generated for 3080 tile images.

Comment from Mehran66 on 18 June 2019 at 15:48

I got another clue. In the dataset-building.toml I have: dataset = 'C:/Users/Mehran/Desktop/test/tiles'

But it seems that this address is ignored. Even when I delete this line, the code runs fine and generates the masks.

Comment from daniel-j-h on 18 June 2019 at 17:29

The numbers in the tile files are slippy map tile x, y, z ids.

See the OSM wiki and the docs on rs cover.

The masks are generated based on the GeoJSON files you give it. It can happen that your rasterized masks and the aerial raster tiles you downloaded are not in sync, e.g. there could be more rasterized masks than you have downloaded raster imagery. In that case simply loop over both datasets and copy over the tiles for which you have both: a mask and an image.
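A rough way to do that copy with standard shell tools could look like this (a sketch; the directory names are examples and both directories are assumed to hold z/x/y.png tiles):

# keep only tiles present in both Slippy Map directories
comm -12 <(cd images && find . -name '*.png' | sort) \
         <(cd masks  && find . -name '*.png' | sort) \
| while read -r tile; do
    mkdir -p "dataset/images/$(dirname "$tile")" "dataset/labels/$(dirname "$tile")"
    cp "images/$tile" "dataset/images/$tile"
    cp "masks/$tile"  "dataset/labels/$tile"
  done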

Comment from Mehran66 on 18 June 2019 at 22:02

Sorry for the confusion. So my problem is that the labels assigned to the image tiles are different from the labels of the mask images. For example:

for image: C:\Users\Mehran\Desktop\test\tiles\20\606678\478661.png
for mask: C:\Users\Mehran\Desktop\test\masks\20\600678\570023.png

I used the following commands to generate the image tiles:

gdal_translate -ot byte -of vrt image.tif filename.vrt
gdal2tiles.py -z 20 --processes=8 filename.vrt tiles/

When I visualize the image and geojson, they are in the right place. They both are in the WGS84 coordinate system. I cannot understand why they get different labels in the slippy map tiles.

Comment from Mehran66 on 20 June 2019 at 15:28

So after having a bunch of challenges with tiling, here is what I did:

1- convert the image to tiles using gdal2tiles.py

gdal_translate -ot byte -of vrt wms.tif filename.vrt
gdal2tiles.py -z 20 --processes=8 filename.vrt masks-512/

Note: Modify the gdal2tiles.py to generate 512*512 tiles

2- Convert the OSM GeoJson file to a binary Geotif using the following code:

from osgeo import gdal

vector_layer = r"C:\Users\Mehran\Desktop\test4\buildings.geojson"
raster_layer = r"C:\Users\Mehran\Desktop\test4\wms.tif"
target_layer = r"C:\Users\Mehran\Desktop\test4\wms_binary.tif"

# open the raster layer and get its relevant properties
raster_ds = gdal.Open(raster_layer, gdal.GA_ReadOnly)
xSize = raster_ds.RasterXSize
ySize = raster_ds.RasterYSize
geotransform = raster_ds.GetGeoTransform()
projection = raster_ds.GetProjection()

# create the target layer (1 band)
driver = gdal.GetDriverByName('GTiff')
target_ds = driver.Create(target_layer, xSize, ySize, bands = 1, eType = gdal.GDT_Byte, options = ["COMPRESS=DEFLATE"])
target_ds.SetGeoTransform(geotransform)
target_ds.SetProjection(projection)

# rasterize the vector layer into the target one
ds = gdal.Rasterize(target_ds, vector_layer, burnValues = [1])

target_ds = None

3- Convert the binary image to tiles using gdal2tiles.py (the same as step 1)

Comment from Orenshef on 30 June 2019 at 13:42

Hi again. I was wondering if there is a tool for doing the reverse of gdal2tiles, meaning once I have predicted on the tiles, I would like to output the segmented mask as one large TIFF, the same way it was input to me. Do you know how I may do that? (I see many images here that look like it has been done.)

Comment from LogicalViolinist on 26 July 2019 at 20:26

“After making sure these are really background images and not just unmapped buildings in OpenStreetMap, we can put the negative samples into our dataset with a corresponding all-background mask. Then run rs weights again, update the dataset config, and re-train.”

How exactly is this done? You obtain imagery that has 0 buildings in it, then you add the imagery to images/20/…? I'm very confused by this part.

Comment from daniel-j-h on 28 July 2019 at 14:25

Re. https://www.openstreetmap.org/user/daniel-j-h/diary/44321#comment45014

2- Convert the OSM GeoJson file to a binary Geotif using the following code: 3- Convert the binary image to tiles using gdal2tiles.py (the same as step 1)

The rs rasterize command rasterizes GeoJSON features into Slippy Map tiles.


Re. https://www.openstreetmap.org/user/daniel-j-h/diary/44321#comment45100

Once you have the predicted tiles you can serve them via http and point e.g. a Mapbox GL JS map to it. There is an example for the compare maps and the rs serve tool here:

https://github.com/mapbox/robosat/blob/1d0cf506cde4600ab7063c238f7f9e25d65ba611/robosat/tools/templates/map.html

You can serve the predicted tiles as simply as python3 -m http.server 5000


Re. https://www.openstreetmap.org/user/daniel-j-h/diary/44321#comment45254

You check out tiles where your predictions are not in sync with OpenStreetMap: where we predict e.g. a building but OpenStreetMap says there should be none. Because OpenStreetMap is not “complete” your model will

  • either predict a building where there is a building in the aerial imagery but not in OpenStreetMap: in this case your metrics will be lower since we count this as an error based on “ground truth” OpenStreetMap, or

  • your model predicts a building where there is no building in the aerial imagery or in OpenStreetMap. This can e.g. happen if you never train on images with swimming pools but now predict on images with swimming pools. In this case you can add these tiles into your dataset with an all-background mask, nudging the model in the right direction so it learns not to predict swimming pools as buildings.


Hope this helps; sorry for the delay. It's probably best if you join the robosat channel on the osmus Slack for quicker responses; there are quite a few folks there who are happy to help with questions.

Comment from minchacha on 20 July 2022 at 10:38

hello, I used robosat, and the prediction results obtained are as follows. The zoom level is 18 and 20. What could be the reason? thanks! predict results
