OpenStreetMap

Cleaning up cuisines in Canada with JOSM

Posted by Graptemys on 7 November 2022 in English. Last updated on 16 November 2022.

I often use apps built on OSM data to find nearby restaurants. However, I noticed that most restaurants in my area have no cuisine information, and many that do have typos or other non-standard values that the apps I use can’t process. This makes it harder to find a place to eat, so I cleaned up all the cuisines in my country to improve the usability of the data that is already there.

Using JOSM I was able to do this at scale. I would do it for more places but it requires some knowledge of local cuisines. I left instructions at the bottom of this post so you can do it for your area if you like.

Process

First I researched all the existing uses (mostly using taginfo) and updated the Key:cuisine wiki to reflect current usage. With a clearer picture of what tags are in use and which are duplicates, I felt confident in sorting through thousands of tags and determining which ones had issues that needed to be fixed.

JOSM allows me to easily download and filter objects. The Tag Editor plugin allows me to easily edit a large list of objects. With these tools I was able to focus on making edits and not on repetitive tasks or on sorting through objects I don’t need to see. The exact steps and filters I used are below.

I want to be clear that I’m specifically trying to avoid making decisions about which tags should or should not be used, and I’m not making any decisions about what cuisines a restaurant serves. I’m merely matching the original mapper’s intent with current tagging standards. Mostly this means fixing typos and syntax, and occasionally moving the information to a different tag.

Examples

  • Brunch -> brunch. Fix the capitalization.
  • canadian -> no change. This value isn’t on the wiki because it doesn’t have much global use but clearly it is a reasonable value.
  • juice_bar -> juice. Change it to the very common tag documented in the wiki.
  • brewery -> no change. This place is probably better tagged as craft=brewery but that would require more knowledge about the specific place and about the differences between brewery, brewpub, pub, microbrewery, brasserie, etc. As long as this tag remains it would be easy for someone to undertake a project to clean that stuff up later.
  • Bar_&_Grill -> bar&grill. Change to a much more common way of spelling the same thing, even though it’s not common enough to be on the wiki.
  • chinese/canadian -> chinese;canadian. Fix the list syntax.
  • Cakes,_Bread,_Pastry -> cake;bread;pastry. Just fix the capitalization, list syntax, and pluralization. As with the brewery example, this place is probably better tagged as shop=bakery, but that’s not a decision to make right now.
  • Contemporary -> contemporary. Just fix the capitalization. I don’t know what contemporary cuisine is and it’s rarely used (~10 uses globally) but it was originally mapped as that so just leave it.
  • pizza;donair -> no change. Donair is a regionally popular food.
  • sandwiches;seafood -> sandwich;seafood. Fix the pluralization.
  • greek_pizza -> greek;pizza. This is a place that serves Greek food and pizza.
  • french_tacos -> no change. French tacos are a specific food that this place serves.
  • Alternative_healthy_meals -> Delete cuisine tag and move information to description tag.
  • gourmet_take-out -> gourmet. And set takeaway=yes.
  • plant -> Delete cuisine tag and add diet:vegan=only and diet:vegetarian=only. This is a vegan restaurant.
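
The purely mechanical part of these fixes (capitalization, list separators, known plurals) can be sketched in Python. This is only an illustration and not how the changeset was made, which was done by hand in JOSM. The function name `normalize_cuisine` and the `SINGULAR` table are hypothetical. Cases such as greek_pizza versus french_tacos, or Bar_&_Grill, still need a human decision and are deliberately left alone here.

```python
import re

# Hypothetical plural-to-singular table, extended by hand. Naive suffix
# stripping is avoided because values such as "tacos" and "wings" are
# themselves the documented spellings.
SINGULAR = {"cakes": "cake", "sandwiches": "sandwich"}


def normalize_cuisine(value: str) -> str:
    """Apply only the mechanical fixes: case, list separators, known plurals."""
    tokens = []
    # Treat ";", "," and "/" all as list separators.
    for part in re.split(r"[;,/]", value):
        # Drop stray whitespace and underscores around each entry,
        # lowercase it, and use underscores instead of inner spaces.
        token = part.strip().strip("_").lower().replace(" ", "_")
        token = SINGULAR.get(token, token)
        if token:
            tokens.append(token)
    return ";".join(tokens)


if __name__ == "__main__":
    for raw in ["Brunch", "chinese/canadian", "Cakes,_Bread,_Pastry",
                "sandwiches;seafood", "greek_pizza"]:
        print(raw, "->", normalize_cuisine(raw))
```

Anything the function returns unchanged, or changes in a way that looks wrong, should be reviewed by hand rather than trusted.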

Statistics

  • Worldwide, there were 1070297 objects with a cuisine tag and 59192 unique tags. Of those tags, 48685 (82.3%) appear only once, representing 4.5% of all objects. (Source: taginfo)
  • In Canada, there were 37909 objects with a cuisine tag and 1839 unique tags. Of those tags, 1384 (75.3%) appear only once, representing 3.6% of all objects. (Source: taginfo Canada database)
  • There were 1199 (3.1%) objects in Canada with cuisine tags that don’t appear on the wiki. (Source: Overpass API download and JOSM filter)

After editing:

  • In Canada, there are 1553 unique tags. Of those tags, 1107 (71.2%) appear only once, representing 2.9% of all objects.
  • There are only 697 (1.8%) objects in Canada with cuisines that do not appear on the wiki.

Changes made:

  • 385 objects were modified and now include only cuisines on the wiki.
  • 243 objects were modified but still include non-wiki values. Most of these are local foods or uncommonly used ethnicities.
  • 29 objects had their cuisine values deleted.
  • I also cleaned up a number of other issues like duplicated POIs, closed businesses, cuisines on non-food businesses, etc.

Edit: Here’s the changeset: https://www.openstreetmap.org/changeset/128574390

Steps in JOSM

  1. Enable Tag Editor plugin
    • Edit > Preferences > Plugins
    • Download list of plugins
    • Select tageditor plugin
    • Update
  2. Set Carto as basemap
    • Imagery > OpenStreetMap Carto
    • Layer view options (eye with gradient underneath) > Set opacity to about 25%
  3. Enable Overpass API download
    • Edit > Preferences
    • Select Expert Mode (in lower left corner)
  4. Download Data
    • File > Download data > Download from Overpass API
    • Query wizard > cuisine=* AND -(cuisine~"^(a|b)(;(a|b))*$") in "Your Location"
      • Replace both a|b with the long list of cuisines below
      • Replace Your Location with your country, state, region, or city
    • Build query
      • If you prefer, you can copy the whole query text from below
    • Download
  5. Filter cuisines so they disappear after they’ve been fixed
    • Windows > Filter
    • Add filter (plus in lower left of filter window)
    • Select case sensitive
    • Search string: cuisine~"(a|b)(;(a|b))*"
      • Replace both a|b with long list of cuisines below
    • Submit filter
    • If you find other common tags to filter, add them to both places in the filter
  6. Open Tag Editor
    • Edit > Search cuisine=* (or CTRL+F)
      • (A regular select all would also select the nodes that make up buildings)
    • Data > Tag multiple objects (or CTRL+T)
    • Set top bar to amenity, name, cuisine
    • Sort by name so that you can find your place later
    • Edit cuisine field directly
      • The rows will disappear if they match the filter
      • You can add more terms to the filter if you want
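
Pasting the long cuisine list into both a|b slots by hand is easy to get wrong, so here is a small Python sketch that assembles the two search strings from steps 4 and 5. The function name `build_search_strings` and its parameters are hypothetical. It assumes the cuisine values contain no regular-expression metacharacters, which holds for the list in this post.

```python
def build_search_strings(cuisines, location):
    """Return (query wizard text, JOSM filter text) for a list of cuisines."""
    # The same alternation is used twice: once for the first list entry
    # and once for every further ";"-separated entry.
    alternation = "|".join(cuisines)
    wizard = (
        f'cuisine=* AND -(cuisine~"^({alternation})(;({alternation}))*$") '
        f'in "{location}"'
    )
    josm_filter = f'cuisine~"({alternation})(;({alternation}))*"'
    return wizard, josm_filter


if __name__ == "__main__":
    # Short illustrative subset; in practice, split the big list on "|".
    wizard, josm_filter = build_search_strings(["pizza", "burger"], "Canada")
    print(wizard)
    print(josm_filter)
```

If you add more accepted values later, rebuilding both strings from one list keeps the download query and the filter in sync.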

Text

For copy and pasting

The big list of cuisines on the wiki:

afghan|african|american|arab|argentinian|armenian|asian|australian|austrian|balkan|bangladeshi|basque|bavarian|belgian|bolivian|brazilian|british|bulgarian|cajun|cambodian|cantonese|caribbean|chinese|colombian|croatian|cuban|czech|danish|dutch|egyptian|english|ethiopian|european|filipino|french|georgian|german|greek|hawaiian|hungarian|indian|indonesian|irish|italian|jamaican|japanese|jewish|korean|lao|latin_american|lebanese|malagasy|malaysian|mediterranean|mexican|middle_eastern|mongolian|moroccan|nepalese|oriental|pakistani|persian|peruvian|polish|portuguese|romanian|russian|senegalese|serbian|southern|spanish|sri_lankan|swedish|swiss|syrian|taiwanese|tex-mex|thai|tibetan|turkish|ukrainian|uzbek|venezuelan|vietnamese|western|açaí|bagel|beef|beef_bowl|beef_noodle|beer|bubble_tea|burger|cake|chicken|chili|chocolate|churro|coffee_shop|coffee|couscous|crepe|curry|donut|dumpling|empanada|falafel|fish|fish_and_chips|fondue|fried_chicken|fries|frozen_yogurt|gyazo|gyros|hot_dog|hotpot|ice_cream|juice|kebab|meat|noodle|pancake|pasta|pastry|piadina|pie|pita|pizza|italian_pizza|poke|potato|pretzel|ramen|salad|sandwich|sausage|savory_pancakes|seafood|shawarma|smoothie|smørrebrød|soba|soup|souvlaki|steak|sushi|tacos|takoyaki|tea|udon|waffle|wine|wings|yakitori|bakery|bar&grill|barbecue|bistro|brasserie|breakfast|brunch|buffet|buschenschank|cafe|deli|dessert|diner|fast_food|fine_dining|fried_food|friture|fusion|grill|heuriger|international|local|lunch|pub|regional|snack|steak_house|tapas|yakiniku

The Overpass API query in case you need it:

[out:xml][timeout:90];
{{geocodeArea:Your Location}}->.searchArea;
(
  nwr["cuisine"]["cuisine"!~"^(a|b)(;(a|b))*$"](area.searchArea);
);
(._;>;);
out meta;
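
To see which values the query above would select, the same pattern can be tried locally. This is a rough Python sketch: Python regular expressions are not identical to the ones Overpass uses, but for a simple anchored alternation like this one they should behave the same. `KNOWN` is a short illustrative subset of the big list, and `needs_review` is a hypothetical helper name.

```python
import re

# Short illustrative subset of the cuisine list from the wiki.
KNOWN = ["brunch", "chinese", "pizza", "sandwich", "seafood"]

alternation = "|".join(KNOWN)
# One known value, followed by any number of ";"-separated known values.
VALID = re.compile(rf"^({alternation})(;({alternation}))*$")


def needs_review(value: str) -> bool:
    """True if the value would be downloaded by the query, i.e. it is not
    a clean semicolon-separated list of known cuisines."""
    return VALID.match(value) is None


if __name__ == "__main__":
    for value in ["pizza;seafood", "Brunch", "chinese/canadian",
                  "sandwiches;seafood", "pizza;donair"]:
        print(value, "->", "review" if needs_review(value) else "ok")
```

Note that a value like pizza;donair is still selected for review even though it needs no change, because donair is not in the list.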

Discussion

Comment from Graptemys on 8 November 2022 at 15:05

I’m not really sure what Canadian cuisine means, but cuisine=canadian is sometimes used on places serving specifically Canadian or local foods like maple or poutine or donair or seafood, and sometimes it’s used on what otherwise might be called American food. It’s also possible some mappers used it where they probably shouldn’t. (“I’ll map this as cuisine=chinese;canadian because it’s a Chinese restaurant… in Canada!”). Anyway the tag is not used much, probably because it doesn’t mean anything specific. I’ve never heard anyone actually say “Let’s get Canadian food tonight.”

I confirmed the greek pizza tags by looking at the restaurant’s web site. I included that example specifically to compare against the french tacos case, to show that it cannot be an entirely mechanical process, and that sometimes it takes research or knowledge of foods.

Comment from pnorman on 12 November 2022 at 05:20

As a Canadian, I’m not quite sure what Canadian cuisine is, but I know it’s a recognized category. See for example Yelp Canadian (New), “modern Canadian cuisine” at Brix & Mortar, or OpenTable Contemporary Canadian.

Comment from Claudius Henrichs on 13 November 2022 at 16:13

Thanks from a fellow QA Mapper for this effort, the necessary attention to detail and most of all for sharing this writeup. I’m irregularly performing these tag reviews on the cuisine, religion, denomination tags, but just a few months later there’s always enough new tags to start over 😁

Comment from Hasnep on 13 November 2022 at 20:36

Thanks for the thorough instructions, I’m thinking I’ll try following them for my local area!

A possibility with the chinese/canadian tag is that they meant to distinguish western style Chinese food from authentic Chinese food. In the UK we have British Chinese food which is what you’ll find in a Chinese takeaway and different to more authentic Chinese food in restaurants like Din Tai Fung.

I don’t think it’s likely, but it crossed my mind so I thought I’d share. 🤷

Comment from seav on 14 November 2022 at 05:26

I know that this isn’t a mechanical edit, but I’d like to point out a danger: unless you have personally visited or checked the status of a retail POI (i.e., is it still open?), cleaning up tags would update the “updated” date of the object, and some recency QA tools might interpret that as a POI that no longer needs checking.

Comment from watmildon on 15 November 2022 at 03:27

Just finished running this workflow for my local area. What a great idea. Some cuisine tags that aren’t in the list but made sense to me were: “gelato”, “pho”, “terikayi”. There are hits for all of those in taginfo, but I would love to know if folks think there are ways to improve those entries.

Comment from seav on 15 November 2022 at 04:30

@watmildon, if you want to convert those values to something more general and something that’s more recognizable to users and tools/apps, I guess gelato can map to ice_cream, pho to vietnamese, and terikayi (which should actually be teriyaki) to japanese.

Comment from Graptemys on 15 November 2022 at 16:00

@watmildon I’m glad to hear you were able to do this for your area.

pho and gelato were terms I saw a lot too, but when I redid the wiki I had to take a global perspective and make decisions based on data, not my personal experiences. Turns out the tags just aren’t used very much, only 70 or 80 times total. When I’m mapping my area I would definitely use them, and when I’m looking for places to get pho in my city I would definitely appreciate having them there. But this process was more about cleaning up the tags that other people had already put.

By the way, I peeked at your changeset and there were a couple of changes I disagree with. Mostly on deleting tags. There’s no reason to change burger;fries to burger, or vietnamese;pho to pho unless you know the places really don’t serve fries or Vietnamese food, and there’s no need to delete bundt_cakes from a place that clearly serves bundt cakes even though it’s the only use of that tag. I would always tend to leave information there unless it’s actually causing issues.

My biggest concern is the change of the tag seafood to fish. Those mean different things, and looking at the restaurant’s website, it looks like they serve more oysters and clams than they do fish. Since this is cleanup work mostly focused on fixing spelling mistakes, you should avoid making changes that affect the meaning unless you can verify that the previous tag was actually incorrect.

Comment from Graptemys on 15 November 2022 at 16:50

There’s some discussion here about converting tags to be more easily used by third party tools, or about whether something counts as a real cuisine. I see these as separate cleanup tasks, so here are some thoughts I have about that:

There’s obviously so much that could be improved about cuisine mapping generally. First of all, only 45% of restaurants even have cuisine information. Then a lot of the tags that do exist are imprecise, or list everything on the menu, or are missing critical information, or outdated, or very similar to other tags, or on non-restaurant POIs, or the restaurant has closed, or any number of other issues. I know it’s tempting to try to fix everything you see when you do this, but I urge you to exercise restraint.

This process is designed to find and fix spelling mistakes. That’s all. You might see and want to fix other issues, and it might be possible to do it, but you will probably be missing a lot of critical context, and you certainly won’t find all the instances of that issue because the search wasn’t designed for it.

That’s not to say I didn’t do any other changes as part of my big Canada changeset. I marked some restaurants as closed, researched the difference between gelato and gelati, added a couple website tags, and more. But those all take a lot of time and are tangential to the main project of just fixing spelling mistakes.

Here are some examples of other cuisine cleanup projects that could be done, but not all of them should be. Think very carefully before doing any of them because they might violate the automated edits code of conduct.

  • Add cuisine information to restaurants that don’t have it
  • Review cuisines on non restaurant/fast_food/cafe POIs
  • Shorten cuisine lists with more than five entries
  • Reorder lists to put ethnicity first and food second
  • Give all branches of a chain restaurant the same cuisine
  • Convert all cuisine=beer tags to drink:beer=yes
  • Add the tag cuisine=pho to any Vietnamese restaurant with “Pho” in its name
  • Decide Canadian cuisine is fake and then delete all cuisine=canadian tags

The point is that all of these are better done as separate projects. And also they are easier to do if the spelling mistakes are cleaned up first.

Comment from watmildon on 15 November 2022 at 20:54

I’m super new to this kind of editing work. I’ll definitely review my edits. Keeping on task (correcting typos etc) and resisting the urge to fiddle (in many cases somewhat haphazardly) is a real issue. This is all extremely well thought out and I appreciate the review and feedback from y’all.
