
AI Generated Changeset Comments

Posted by FargoColdYa on 26 April 2024 in English.

Summary: What if AI created the changeset comments? We could send locations, tag types, and quantities and get a comment back. The AI would have to run locally with small models to keep costs down, and the output would have to be validated by the user.

Problem 1: Time. Assume that 1,000 users each create 2 changesets in 1 day, and that writing each changeset comment takes 3.5 seconds. 1,000 users × 2 changesets × 3.5 seconds per comment = 7,000 seconds. Collectively, OSM users spend about 1.9 hours per day writing changeset comments.
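The arithmetic is trivial; a quick sketch using the assumed figures above:

```python
# Back-of-the-envelope estimate using the assumptions above.
users = 1_000              # active users per day (assumption)
changesets_per_user = 2    # changesets each user creates per day
seconds_per_comment = 3.5  # time to write one changeset comment

total_seconds = users * changesets_per_user * seconds_per_comment
print(f"{total_seconds:.0f} seconds = {total_seconds / 3600:.2f} hours per day")
# 7000 seconds = 1.94 hours per day
```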

Problem 2: Skill Outsourcing. Users should spend their time on the things AI can’t do.

Problem 3: Server-Side Peer Review. We already have human-generated changeset comments. We could also create AI-generated changeset comments and ask the AI, “Are these 2 changeset comments so different that the edit looks malicious?”
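To make the peer-review idea concrete, here is a minimal sketch of how that comparison could be phrased as a second prompt. The exact wording and the YES/NO format are my own assumptions, not a fixed design:

```python
def build_review_prompt(human_comment: str, ai_comment: str) -> str:
    """Hypothetical server-side check: ask a model whether the two comments diverge."""
    return (
        "A user wrote this changeset comment:\n"
        f"  {human_comment}\n"
        "An automatically generated summary of the same changeset reads:\n"
        f"  {ai_comment}\n"
        "Are these two descriptions so different that the edit looks malicious? "
        "Answer YES or NO with one sentence of reasoning."
    )

print(build_review_prompt(
    "Fixed a typo in a street name",
    "Deleted 200 buildings and renamed several roads",
))
```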

General AI Inputs: 1. Location: Where did the user map? 2. Feature Types: What tags did the user use?

AI Prompt: “You are an AI system. A user made edits in OpenStreetMap, a collaborative mapping project. They mapped locations[Mappleville, MN, USA; Bobville, MN, USA] with tags[50xSidewalks, 20xMarkedCrossings, & 10xReligious Areas]. You will create a changeset comment that concisely tells human reviewers what this changeset was about in 3 sentences or less. Exact numbers are not important. Changesets describe changes, so don’t request anything. Don’t mention anything that is common across all changesets.”

AI Response (https://www.meta.ai/): “Added sidewalks, marked crossings, and religious areas in Mappleville and Bobville, MN. Improved pedestrian and accessibility mapping. Enhanced local community information.”
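As an illustration, a prompt like the one above could be assembled from structured inputs. This is only a sketch: the function name and input shapes are mine, and a real editor would pull these values from the changeset itself.

```python
def build_changeset_prompt(locations, tag_counts):
    """Assemble the changeset-comment prompt from structured inputs."""
    location_part = "; ".join(locations)
    tag_part = ", ".join(f"{count}x{tag}" for tag, count in tag_counts.items())
    return (
        "You are an AI system. A user made edits in OpenStreetMap, a collaborative "
        f"mapping project. They mapped locations[{location_part}] with tags[{tag_part}]. "
        "You will create a changeset comment that concisely tells human reviewers what "
        "this changeset was about in 3 sentences or less. Exact numbers are not important. "
        "Changesets describe changes, so don't request anything. Don't mention anything "
        "that is common across all changesets."
    )

prompt = build_changeset_prompt(
    ["Mappleville, MN, USA", "Bobville, MN, USA"],
    {"Sidewalks": 50, "MarkedCrossings": 20, "Religious Areas": 10},
)
print(prompt)
```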

Specific AI Inputs for Locations: 1. Cities[1 to 5], States[1 to 5], Countries[1 to 5]. 2. Is this a place with unclear boundaries? (What if somebody maps the ocean?) 3. What is the size of the bounding box for this edit in km?
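For the bounding-box input, a rough size in km can be derived from the changeset’s min/max coordinates. A sketch using a simple equirectangular approximation, which is good enough for flagging continent-sized edits but not for precise geodesy:

```python
import math

def bbox_size_km(min_lat, min_lon, max_lat, max_lon):
    """Approximate width and height of a changeset bounding box in km."""
    km_per_deg_lat = 111.32                       # roughly constant everywhere
    mean_lat = math.radians((min_lat + max_lat) / 2)
    km_per_deg_lon = 111.32 * math.cos(mean_lat)  # shrinks toward the poles
    height_km = (max_lat - min_lat) * km_per_deg_lat
    width_km = (max_lon - min_lon) * km_per_deg_lon
    return width_km, height_km

# Example: a bounding box roughly the size of a small town.
print(bbox_size_km(46.80, -96.90, 46.95, -96.70))
```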

Specific AI Inputs for Feature Types: Tags[1 to 6] & corresponding Quantities

Algorithms: 1. Sort the tags by how frequently each was used, in descending order, with a limit of 5. 2. For each city, how often was each tag used? Create a table unless the table is huge.
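Both steps are a few lines in Python. A sketch with made-up example data; the MAX_ROWS cut-off is my own placeholder for “huge”:

```python
from collections import Counter

# Step 1: sort tags by how often they were used, keep at most 5.
tag_counts = Counter({"Sidewalks": 50, "MarkedCrossings": 20, "Religious Areas": 10})
top_tags = tag_counts.most_common(5)
print(top_tags)

# Step 2: per-city tag counts, skipped if the table would be huge.
edits = [
    ("Mappleville, MN", "Sidewalks"),
    ("Mappleville, MN", "Sidewalks"),
    ("Bobville, MN", "MarkedCrossings"),
    ("Bobville, MN", "Religious Areas"),
]
per_city = Counter(edits)  # (city, tag) -> count
MAX_ROWS = 25              # arbitrary cut-off for "huge"
if len(per_city) <= MAX_ROWS:
    for (city, tag), count in sorted(per_city.items()):
        print(f"{city:<20} {tag:<20} {count}")
else:
    print("Table omitted: too many city/tag combinations.")
```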

Complexities of the process: 1. Disputed Boundaries: What if this is the changeset that changed the border? 2. Large Edits: Do not run this over changesets larger than 500 edits. 3. Malicious Inputs: Somebody named a building tag after a war crime, and the AI received that as an input. What does the AI say? 4. Resource Allocation: Developer time could be better spent doing something else. 5. Irregular Edits: I will use every tag in OSM only once. I will map an area the size of a continent.

Complexities of AI in general: 1. Uncommon Languages: Are these things only good at the 5 biggest languages? 2. Edit Safety: The user mapped religious areas in 2 different nations that share a disputed border and are at war. 3. Money: Laptops with TPUs are not common in 2024 (but will be in 2030). Mobile editors with TPUs are not common in 2024 (but will be on high-end phones in 2030). Running AI costs money. Who will pay for it?

Solutions: 1. AI runs locally on a TPU. 2. If you use the outputs of an AI for changeset comments, you are responsible for safety.
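For what “runs locally” could look like in practice, here is a sketch using the Hugging Face transformers library. “SMALL_LOCAL_MODEL” is a placeholder for whatever small instruction-tuned model the device can actually run, and whether inference lands on a TPU/NPU depends entirely on the runtime:

```python
from transformers import pipeline

# "SMALL_LOCAL_MODEL" is a placeholder, not a real model name.
generator = pipeline("text-generation", model="SMALL_LOCAL_MODEL")

prompt = (
    "A user mapped locations[Mappleville, MN, USA; Bobville, MN, USA] with "
    "tags[50xSidewalks, 20xMarkedCrossings, 10xReligious Areas]. Write a "
    "changeset comment of 3 sentences or less describing what changed."
)
suggestion = generator(prompt, max_new_tokens=80)[0]["generated_text"]
print(suggestion)  # the user validates this before it is uploaded
```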

Disclaimers: 1. I don’t work in AI. 2. I describe what I don’t have the resources to build. 3. I assume that developer resources should focus on high priority tasks.

Expected Development Difficulty: 1. Web to TPU is hard: Graphics have standard libraries (OpenGL). AI TPUs are not common and don’t have standard libraries. 2. This can create giant tables if you are not careful.

The benefits of manual changesets: 1. Spam is harder to create in bulk. 2. Self-reflection is encouraged. 3. Individuality is good to see. 4. Changeset comments are the alternative to a Change Approval Board (CAB meetings); they are supposed to take effort.

TLDR: OpenStreetMap (OSM) edits could be aided by AI-generated changeset comments, potentially saving users a collective 1.9 hours daily. AI could analyze edit locations and feature types to generate concise comments, freeing users to focus on tasks that require human expertise. However, implementing AI-generated comments requires addressing complexities like disputed boundaries, TPU libraries, and malicious inputs.

Location: Rose Creek, Fargo, Cass County, North Dakota, United States

Discussion

Comment from SomeoneElse on 26 April 2024 at 17:02

We already have factual “this is just what was added” comments from things such as StreetComplete. That’s not so bad, in the context of a StreetComplete changeset, where it’s obvious that someone is answering questions on their phone (because that’s what StreetComplete is).

That would be less useful to generalise to other changesets, because it’s missing the “why”. A completely random sample shows that people do put a fair bit of description into changeset comments that simply couldn’t be determined “by AI”, like here.

Comment from FargoColdYa on 26 April 2024 at 20:53

Hello @SomeoneElse. Thank you for the advice. These are great examples.

Comment from H@mlet on 29 April 2024 at 14:21

Hi.

I definitely don’t put into changeset comments information that can easily be found in the changeset itself, such as location, type of object modified / added / deleted…

So AI generated changeset comments sounds like a bad idea.

But AI generated changeset description, as an additional functionality (new tag in the changeset, or just displayed) might be nice.

Sometimes (especially on mobile) it’s hard to find out what the changeset is about by looking at it, so a few sentences description might be useful.

It could be a feature in OsmCha for example, or on the OSM.org website at some point.

Regards.
