daniel-j-h's Comments
Post | Comment |
---|---|
Forgotten Gems in Geospatial Indexing | Hey, so it’s not about the Z-curve. It’s about the lesser-known optimization technique (BIGMIN) that allows us to immediately skip over all the Z jumps. Please re-read the section above on the BIGMIN paper; it’s crucial to understanding why this is such a big deal. I got similar responses on Mastodon, where many folks immediately responded with Hilbert curves or S2. It’s crucial to understand that the technique above (BIGMIN) allows us to make very efficient range queries on a Z-curve, pruning all irrelevant data. Maybe we geo folks have heard about space-filling curves so much that we immediately jump to conclusions and are not open to new discoveries. The post on Z-curves and Hilbert curves you linked also fails to acknowledge that there is a way to prune the Z-curve search space: it’s not simply scanning the full range [zmin, zmax] representing the query bbox. In case code helps, here is the core of what I describe above:
The mind-bending insight here is: an array of lng/lats sorted by their z-value is all you need for an incredibly effective and efficient spatial index. That is wild. |
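As a minimal sketch of the idea described above (my own illustrative code, not the snippet from the original comment; the names `interleave`, `bigmin`, and `bbox_query` are assumed): a sorted array of Morton codes queried with the Tropf/Herzog BIGMIN skip, so the query never scans the irrelevant parts of [zmin, zmax].

```python
import bisect

BITS = 16  # bits per coordinate; z-codes use 2 * BITS bits

def interleave(x, y):
    """Morton-encode (x, y): x on even bits, y on odd bits."""
    z = 0
    for i in range(BITS):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z

def deinterleave(z):
    """Inverse of interleave."""
    x = sum(((z >> (2 * i)) & 1) << i for i in range(BITS))
    y = sum(((z >> (2 * i + 1)) & 1) << i for i in range(BITS))
    return x, y

def _load(value, pos, one):
    """Tropf/Herzog 'load': overwrite bit `pos` and all lower bits of
    the same dimension (every second bit) with 1000... or 0111..."""
    low, p = 0, pos - 2
    while p >= 0:
        low |= 1 << p
        p -= 2
    if one:
        return (value | (1 << pos)) & ~low  # pattern 1000...
    return (value & ~(1 << pos)) | low      # pattern 0111...

def bigmin(z, zmin, zmax):
    """Smallest z-code > z whose point lies inside the box [zmin, zmax]."""
    best = None
    for pos in range(2 * BITS - 1, -1, -1):
        bz, bmn, bmx = (z >> pos) & 1, (zmin >> pos) & 1, (zmax >> pos) & 1
        if bz == 0 and bmn == 0 and bmx == 1:
            best = _load(zmin, pos, True)   # candidate from the upper half
            zmax = _load(zmax, pos, False)  # keep searching the lower half
        elif bz == 0 and bmn == 1:
            return zmin                     # the box is entirely above z
        elif bz == 1 and bmx == 0:
            return best                     # z is above this sub-box
        elif bz == 1 and bmn == 0 and bmx == 1:
            zmin = _load(zmin, pos, True)   # keep searching the upper half
    return best

def bbox_query(zs, x0, y0, x1, y1):
    """zs: sorted z-codes. Return the codes whose points fall in the bbox,
    skipping irrelevant Z jumps via BIGMIN instead of scanning zmin..zmax."""
    zmin, zmax = interleave(x0, y0), interleave(x1, y1)
    out, i = [], bisect.bisect_left(zs, zmin)
    while i < len(zs) and zs[i] <= zmax:
        x, y = deinterleave(zs[i])
        if x0 <= x <= x1 and y0 <= y <= y1:
            out.append(zs[i])
            i += 1
        else:
            nxt = bigmin(zs[i], zmin, zmax)  # jump straight past the Z detour
            if nxt is None:
                break
            i = bisect.bisect_left(zs, nxt, i + 1)
    return out
```

The whole index really is just the sorted `zs` array plus binary search; BIGMIN is only consulted when the walk lands on a code outside the box.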
|
Forgotten Gems in Geospatial Indexing | I published a package on NPM under the zbush name implementing what I presented here.
It’s a work in progress, let me know what you think! |
|
Servus, Bayern: The robots are coming! | Hi! I’m no longer with Mapbox and therefore also no longer working on or maintaining robosat. It’s not been under active development for some 5+ years now. I do not recommend using it anymore, especially if you are a novice or need a bit more hands-on help. All the best. |
|
GSoC 2023 - OSRM Bindings Project [Pt. 1] | Hey there, I was just pointed to your diary post here by a friend. I worked on OSRM between ~2015-2019ish and back then created an ABI/API stable C wrapper which also comes with bindings to other languages (via ctypes); mostly for playing around and seeing where this goes. https://github.com/daniel-j-h/libosrmc Now many years later this is quite outdated; but maybe it helps for inspiration. |
|
RoboSat ❤️ Tanzania | Re. osm.org/user/daniel-j-h/diary/44321#comment45014
Re. osm.org/user/daniel-j-h/diary/44321#comment45100 Once you have the predicted tiles you can serve them via HTTP and point e.g. a Mapbox GL JS map at them; serving the predicted tiles can be as simple as running a static file server over the tile directory. Re. osm.org/user/daniel-j-h/diary/44321#comment45254 You check out tiles where your predictions are not in sync with OpenStreetMap: where we predict e.g. a building but OpenStreetMap says there should be none. Because OpenStreetMap is not “complete”, your model will also surface objects that are not (yet) mapped.
Hope this helps; sorry for the delay. It’s probably best if you join the robosat channel on the osmus Slack for quicker responses; there are quite a few folks there who are happy to help with questions. |
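Serving predicted slippy-map tiles over HTTP needs nothing more than a static file server; here is a small sketch (the function name `tile_server` is mine, not from robosat) so a map client such as Mapbox GL JS can fetch `/{z}/{x}/{y}.png`:

```python
import functools
import http.server

def tile_server(directory, port=8000):
    """Static file server over a slippy-map tile directory so a map client
    can fetch /{z}/{x}/{y}.png. Call .serve_forever() on the returned
    server to run it."""
    handler = functools.partial(http.server.SimpleHTTPRequestHandler,
                                directory=directory)
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
```

For a quick local check, `tile_server("tiles").serve_forever()` and point the map style's raster source at `http://127.0.0.1:8000/{z}/{x}/{y}.png`.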
|
Servus, Bayern: The robots are coming! | If you want to fine-tune pre-trained RoboSat models then start from the weights saved in the checkpoints. Right now the model architecture is an encoder-decoder with skip connections. In addition we use a pre-trained ResNet50 for the encoder. Here are more details and paper references:
There are also some open pull requests where we experiment with different architectures; these architecture changes are not compatible with old checkpoints. Regarding your second question about estimating the dataset size: it depends on your use-case: the zoom level you want to work with, the geographical area (lots of variety on the planet vs. a single city or country), how good your ground truth labels are, whether you want to invest time in manual dataset curation or refinement, and so on. As a very rough guideline I recommend at least a couple thousand 512x512 px image tiles; then give it a try and see if it works. But as usual, the more data the better. Hope that helps, |
|
Servus, Bayern: The robots are coming! | Do your images roughly look like the Bavaria images? Do buildings roughly look the same? Then I would just give it a try and see what you get out. If you want to try it, make sure to convert your dataset into Slippy Map tiles on zoom level 17 (maybe ±1 zoom level), since the Bavaria model was trained on z17. There’s a robosat channel on the osmus Slack in case you run into issues :) |
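Converting lat/lng into Slippy Map tile coordinates at a given zoom uses the standard Web Mercator formulas from the OSM wiki; a minimal sketch (the function name `deg2tile` is mine):

```python
import math

def deg2tile(lat, lon, zoom):
    """Lat/lng in degrees -> Slippy Map tile (x, y) at the given zoom,
    per the standard Web Mercator tiling used by OpenStreetMap."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return x, y
```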
|
RoboSat ❤️ Tanzania | The numbers in the tile files are slippy map tile x, y, z ids; see the OSM wiki and the docs on rs cover. The masks are generated based on the GeoJSON files you give it. It can happen that your rasterized masks and the aerial raster tiles you downloaded are not in sync, e.g. there could be more rasterized masks than you have downloaded raster imagery. In that case, simply loop over both datasets and copy over the tiles for which you have both: a mask and an image. |
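The "loop over both datasets and copy the common tiles" step might look like this sketch (the function name and directory layout are hypothetical, not robosat's API; it assumes mask and image tiles use identical z/x/y filenames):

```python
import os
import shutil

def copy_common_tiles(masks_dir, images_dir, out_masks, out_images):
    """Copy only those z/x/y tiles that exist in BOTH datasets."""
    def tiles(root):
        # set of tile paths relative to the dataset root, e.g. 17/1/2.png
        return {os.path.relpath(os.path.join(d, f), root)
                for d, _, names in os.walk(root) for f in names}
    common = tiles(masks_dir) & tiles(images_dir)
    for rel in common:
        for src_root, dst_root in ((masks_dir, out_masks),
                                   (images_dir, out_images)):
            dst = os.path.join(dst_root, rel)
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.copyfile(os.path.join(src_root, rel), dst)
    return sorted(common)
```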
|
RoboSat ❤️ Tanzania | Robosat v1.2 comes with batched extraction and batched rasterization. Before, we had to keep all the features and images in memory during extraction and rasterization, which used up quite a lot of memory on larger datasets. We now support batched extraction and batched rasterization, flushing batches to disk every now and then. The batches are somewhat arbitrary and not based on e.g. smaller areas. Also check the v1.2 release notes: https://github.com/mapbox/robosat/releases/tag/v1.2.0 |
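The batching idea is essentially chunking a stream and flushing each chunk to disk before accumulating the next, which bounds peak memory; in generic form (an illustration, not robosat's actual implementation):

```python
def batched(iterable, size):
    """Yield lists of up to `size` items from `iterable`; each batch can
    be flushed to disk before the next one is accumulated, so memory use
    stays proportional to the batch size, not the dataset size."""
    batch = []
    for item in iterable:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final partial batch
        yield batch
```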
|
RoboSat ❤️ Tanzania | FYI, for folks following along this old RoboSat-on-Tanzania-drone-imagery diary post: there are also more recent diary posts, from the last two weeks. |
|
RoboSat ❤️ Tanzania | Hey, then there are some limitations in the code itself:
You have to go in and adapt these manually right now. I’m also happy about pull requests and can help you along if you want to properly fix these issues. Hope that helps, Daniel |
|
RoboSat v1.2.0 — state of the art losses, road extraction, batched extraction and rasterization | A follow-up post running robosat v1.2 on all of Bavaria’s 80 cm aerial imagery is here. |
|
Servus, Bayern: The robots are coming! | Absolutely! For Bavaria there is only 80 cm aerial imagery (openly) available. You can check it out in iD by selecting it as a background: go to osm.org/edit#map=17/49.91130/10.89001, then go to Background (or press “b”) and select “Bavaria (80 cm)”. Now pan around and see if you can distinguish what’s a building, what’s a small building, a shed, a car port, a parking lot, large cars. It can be quite hard even for humans. I used robosat on drone imagery (see this diary) and on the high-resolution Mapbox aerial imagery in North America (where you could see people walking around) back when I was working for them. In addition to the resolution, there are multiple tricks to get higher-quality predictions out of robosat, trading off training time / runtime or requiring manual dataset curation. The use-cases I see for prediction even on the 80 cm aerial imagery are:
|
RoboSat v1.2.0 — state of the art losses, road extraction, batched extraction and rasterization | I implemented polygonization for the parking use-case we had. Check out the robosat readme and the linked diary posts above - they go a bit more into detail on how the pipeline works before and after the prediction stage. Here is the robosat parking polygonization - buildings are more or less the same, and in fact I’m using the parking handler as a building handler. It’s a bit ugly if you want to handle edge cases such as (potentially nested) (multi-)polygons, but oh well. The polygonization can definitely be improved by
Check out their work - it’s quite nice but they also run into edge cases and design trade-offs. Happy to guide you along if you want to work on this in robosat or have ideas. |
|
RoboSat ❤️ Tanzania | Hey, I just published a new release, v1.2.0 - read about it here. The official docker images work again now, too. Here are the docs. For zoom levels there is an open pull request: https://github.com/mapbox/robosat/pull/88 It should Just Work (tm) but I haven’t had the time to test it more thoroughly. The problem there is we use some pixel-based thresholds and heuristics, and depending on your zoom level they will (slightly) change. The pull request implements these thresholds based on meters and no longer based on pixels. You can check out the code and help me test it by running it on your dataset, checking if results look reasonable, and playing around with the thresholds. Ideally we’d also have a building handler (which right now would do the same as the parking lot handler). I just haven’t had the time to implement it properly, and I myself can just quickly hack the code the way I need it. Hope that helps. |
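The meters-vs-pixels issue that pull request addresses comes down to the Web Mercator ground resolution, which shrinks by half per zoom level and with latitude; a small sketch (function names are mine, not robosat's):

```python
import math

EARTH_RADIUS = 6378137.0  # Web Mercator sphere radius in meters

def meters_per_pixel(lat, zoom, tile_size=256):
    """Ground resolution of one pixel at the given latitude and zoom."""
    return (math.cos(math.radians(lat)) * 2 * math.pi * EARTH_RADIUS
            / (tile_size * 2 ** zoom))

def meters_to_pixels(meters, lat, zoom):
    """Express a metric threshold in pixels for a given latitude/zoom,
    so heuristics stay stable when the zoom level changes."""
    return meters / meters_per_pixel(lat, zoom)
```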
|
RoboSat ❤️ Tanzania | Segmentation faults are tricky to debug: it could be anything from a bad installation to version mismatches to us not handling an edge case in your dataset. As a first step I recommend using the pre-built Docker binaries. The official ones are currently not getting built automatically - we’re working on fixing that. In the meantime I just set up automated Docker image builds for my fork, which I keep in sync with upstream for the time being. You can run them via:
Note for folks coming across this in the future: check the official mapbox/robosat docker images and use them if they are again up to date instead of danieljh/robosat. |
|
RoboSat ❤️ Tanzania | Multiple zoom levels work out of the box. They get picked up automatically in the dataset loader if you put them all into the same directory. I would try e.g. with zoom level z and then z-1 and z+1 first. If you have a bigger difference in zoom levels your visual features in the images will be vastly different, and it might make sense to build multiple models - one per zoom level - instead. For the images and labels directories you will simply have multiple zoom sub-directories, e.g. images/16/, images/17/, images/18/ and the same for labels/.
Make sure the images and labels directories are in sync (for every image there is a label, and for every label there is an image), but otherwise that should be it. I highly recommend training on GPUs; with CPUs you will have to wait a ridiculously long time. Also manually verify your images and labels correspond to each other. |
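A quick way to verify images and labels are in sync across the zoom sub-directories is to compare their z/x/y keys; a sketch (function name mine; it ignores file extensions, which may differ between images and masks):

```python
import os

def out_of_sync(images_root, labels_root):
    """Return (images without labels, labels without images) as sets of
    extension-less z/x/y keys, e.g. '17/1/2'."""
    def keys(root):
        return {os.path.splitext(os.path.relpath(os.path.join(d, f), root))[0]
                for d, _, names in os.walk(root) for f in names}
    imgs, lbls = keys(images_root), keys(labels_root)
    return imgs - lbls, lbls - imgs
```

Both returned sets being empty means every image has a label and vice versa.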
|
RoboSat ❤️ Tanzania | If possible, provide more building tiles data; the building IoU is the Jaccard index (https://en.wikipedia.org/wiki/Jaccard_index) computed for the foreground class (buildings in your case) only.
|
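For intuition, the foreground-only Jaccard index over binary masks is just intersection over union; an illustrative sketch (robosat computes this over prediction tensors, not pixel sets):

```python
def foreground_iou(pred, truth):
    """Jaccard index of two binary masks given as sets of foreground
    pixel coordinates: |intersection| / |union|."""
    union = pred | truth
    if not union:
        return 1.0  # both masks empty: treat as perfect agreement
    return len(pred & truth) / len(union)
```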
RoboSat ❤️ Tanzania | Also what I’m seeing just now:
This doesn’t look right. You don’t want to give the background class and the foreground class the same color. Otherwise you will not be able to distinguish them visually. |
|
RoboSat ❤️ Tanzania | Yep, that looks pretty bad; you definitely need more negative samples. I’m wondering why you only get it for some tiles, though? Here’s a ticket for the all-background mask: https://github.com/mapbox/robosat/issues/43 Hope this helps. |