Wikiconference NA 2025 trip report - need your thoughts

Would love thoughts and feedback on a recent visit with the Wikimedia crowd.

I just presented “OpenHistoricalMap: A Linked, Attribution- & Integration-Friendly Map Complement to Wikimedia” (a very long-winded title) at WCNA 25.

About 40-50 Wikimedians with mild OSM and even fainter OHM familiarity attended the session. They seemed engaged and responsive and asked a few questions at the end, and the session produced a possible lead to work with the Vanderbilt University Spatial Analysis Research Lab.

The idea that current Wikimedia maps could do a better job of making underlying map data readily available to their end users, and that the data should have more clearly identifiable and traceable sources, seemed to resonate with the crowd. I assured the crowd that we were no saints when it came to tagging sources, but that we were trying hard to fix that.

The crowd also seemed to appreciate how we have integrated their content in our left-navigation inspector/sidebar to create a richer experience for our users.

As I created the slide deck, played around with OHM data, and talked with potential Wikimedian OHM mappers, a few observations emerged:

  • Wikimedia (-pedia, -data, Commons) has a massive reach. Their content is just everywhere. They have a big staff. They have $150 million in the bank. If OHM hosted the data used to create Wikimedia maps, it could greatly expand our user base and content.
  • Source tags are critical, not just to help other mappers, but to help downstream users of our data.
  • We need better standardization of our source tagging schema. We have some pretty good isolation of source attributes, but it’s not clear that there’s a clean mapping to other, more universal citation schemas (e.g., CSL). I’m not sure how we could move this forward, but adopting an actual schema of some sort, educating our community on how to use it, creating tools to make this easier, and migrating our existing tags might bring some real value (along with the effort) among academia and galleries, libraries, archives, and museums (GLAM).
  • Our data needs to be easier to download. Sending people to OHM’s Overpass Turbo or OHM Ultra works reasonably well for adept users, but why not just have a “Download GeoJSON” button for any node, way, or relation, or even for a geo-temporal bbox?
  • Map datasets need to be easier to style and share. People who aren’t mappers should be able to make maps with OHM data more easily. This is directly related to the first bullet, as well as to @Charlie_Plett’s great post Wishing for a Political Map.
  • Relations only get you so far for downloading related info. Whether you want to download all the points from a particular volume of the Green Books, or all of the facilities that were part of the 1964 World’s Fair, it’s not clear that relations are the best way to group things, although I’m not sure what the best alternatives might be.
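
To make the “Download GeoJSON” idea a bit more concrete, here is a minimal sketch of what could sit behind such a button. It assumes the site would send a plain Overpass QL query to an OHM Overpass endpoint and convert the JSON response; the function names are hypothetical, only point features are converted, and the temporal part of a geo-temporal bbox is left out.

```python
def element_query(kind: str, element_id: int) -> str:
    """Build an Overpass QL query returning one element with its geometry."""
    if kind not in {"node", "way", "relation"}:
        raise ValueError(f"unknown element type: {kind}")
    return f"[out:json];{kind}({element_id});out geom;"


def bbox_query(south: float, west: float, north: float, east: float) -> str:
    """Build an Overpass QL query for every node, way, and relation in a bbox."""
    return f"[out:json];nwr({south},{west},{north},{east});out geom;"


def nodes_to_geojson(overpass_result: dict) -> dict:
    """Convert the node elements of an Overpass JSON response to GeoJSON."""
    features = []
    for el in overpass_result.get("elements", []):
        if el.get("type") != "node":
            continue  # ways and relations need geometry assembly, omitted here
        features.append({
            "type": "Feature",
            "id": f"node/{el['id']}",
            # GeoJSON coordinate order is longitude, then latitude
            "geometry": {"type": "Point", "coordinates": [el["lon"], el["lat"]]},
            "properties": el.get("tags", {}),
        })
    return {"type": "FeatureCollection", "features": features}
```

A real implementation would also have to assemble lines and polygons from ways and relations, which is where most of the work is.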

Other action items popped up, including better maintenance or population of our relation IDs in Wikidata entries with P8424, but that’s more straightforward than the ideas above.

I’m curious if any of the ideas above resonate with our forum dwellers? :thinking:

Congratulations on the well-received talk despite everything else that must’ve been on everyone’s minds this weekend. As usual, it’s listed on the OHM bibliography. I’ve also added it to a page on Wikimedia’s coordination wiki about opportunities for closer cooperation with OHM.

As I understand it, CSL is a tool for mappers, not OHM data consumers. The idea is that every system can have its own citation style that can be described by a common markup language. Popular tools like Zotero use CSL to translate their internal data structures into citations in our preferred format. Imagine pulling up a book on HathiTrust, an archaeological journal article, a Facebook post, or a map from David Rumsey’s collection, then pressing a button from the Zotero browser extension to copy perfectly formatted OHM source tags to your clipboard.
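
As a rough illustration of that last step, here is a sketch in Python rather than an actual CSL style. It takes a CSL-JSON item of the kind Zotero can export and produces a handful of source tags. The input fields (`title`, `URL`, `issued`) are standard CSL-JSON; the output keys, in particular `source:date`, and the function name are assumptions for illustration, not a statement of what OHM has settled on.

```python
def csl_item_to_source_tags(item: dict) -> dict:
    """Map a CSL-JSON item to OHM-style source tags (key names are assumed)."""
    tags = {}
    if item.get("title"):
        tags["source:name"] = item["title"]
    if item.get("URL"):
        tags["source:url"] = item["URL"]
    # CSL-JSON stores dates as {"date-parts": [[year, month, day]]},
    # where month and day are optional and parts may be strings or numbers.
    date_parts = item.get("issued", {}).get("date-parts", [])
    if date_parts and date_parts[0]:
        parts = [int(p) for p in date_parts[0]]
        pieces = [f"{parts[0]:04d}"] + [f"{p:02d}" for p in parts[1:]]
        tags["source:date"] = "-".join(pieces)  # hypothetical key
    return tags
```

The point is only that the mapping is mechanical once the target keys are written down somewhere.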

We’ve already made quite a bit of progress in standardizing our source tags over the past couple years. Now someone needs to formally describe what we’ve settled on:

iD steers mappers toward the preferred keys, but we still need to somehow adapt JOSM to our tagging scheme. Otherwise we’re likely to see JOSM users incorrectly replace e.g. name:source=* with source:name=* because of an unadapted validator rule.

On the topic of citations: This weekend at NACIS, the World Historical Gazetteer often preempted my attempts to plug OHM. To my surprise, the feature that came up most often was the ability to select a feature and see not only a citation corroborating the feature but also a listing of all the mentions of the feature in the literature. It’s like when you pull up an article on Google Scholar or another scholarly database and see both the articles it cited and the articles that cited it. Our tagging system can’t realistically accommodate such listings explicitly, but we could potentially build similar functionality by looking up wikidata=* or other external references in an external bibliographic service.

Taken together, these points suggest that it needs to be easier to query OHM data – yes, query. We inherited the model of a search box (imperfect as it is), a click-to-identify tool, and a tool to download everything within a bounding box. These tools rely on a mix of the OHM, Nominatim, and Overpass APIs, which provide many more options and much more functionality. By comparison, relations are not an effective tool for downloading related info, such as by source. Relying on them requires us to be the curators, to predict in advance all the possible ways that people might want to slice and dice our data. When we fail to foresee someone’s use case, they’re completely on their own.

Instead, we could make basic querying more accessible and intuitive. There’s no technical reason why the site couldn’t feature more robust options for visualizing a chronology, diffing all the features within a bounding box between two dates, searching by relevance to a Wikipedia article, or downloading all the features that share a certain tag with the selection. For those who need information rather than raw data, we could provide dedicated pages for some of these operations. And of course a shortcut to jump from an element page to the corresponding Overpass query for that element.
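
For instance, diffing the features in a bounding box between two dates is not much logic once the features are in hand. The sketch below assumes each feature is a dict with an `id` and a `tags` dict, that `start_date` and `end_date` are full `YYYY-MM-DD` strings in the common era (so plain string comparison is chronological), and that `end_date` is exclusive. Those are simplifying assumptions, and the function names are made up.

```python
def exists_on(tags: dict, date: str) -> bool:
    """Return True if a feature with these tags exists on the given date."""
    start = tags.get("start_date")
    end = tags.get("end_date")
    if start is not None and date < start:
        return False  # not yet created
    if end is not None and date >= end:
        return False  # already ended (end_date treated as exclusive)
    return True


def diff_between(features: list, earlier: str, later: str) -> dict:
    """Compare which features exist on two dates."""
    before = {f["id"] for f in features if exists_on(f["tags"], earlier)}
    after = {f["id"] for f in features if exists_on(f["tags"], later)}
    return {
        "added": sorted(after - before),
        "removed": sorted(before - after),
        "unchanged": sorted(before & after),
    }
```

Real OHM dates come in mixed precision and include years before the common era, so the comparison would need a proper date key in practice.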

With the user in control of asking the questions, we don’t have to curate answers in advance in the form of easily outdated relations, and they might not even need to deal with element IDs at all.

Everyone… please check this out. :eyes: :thinking: :exploding_head:

I think having better source tagging conventions would be very nice. I currently mostly use start_date:source, end_date:source and source:geometry.

For me start_event and end_event are also very powerful; I’ve thought of creating a timelapse of a country and displaying the start_event text at the bottom as the boundaries change.

Downloading data is hard, and trying to explain it to newbies who come to this forum is also difficult. I currently only use JOSM to download map data; it’s way easier than web-based tools. I am currently making the Spanish Empire easier to query by adding empire=spanish and border_type to all the provinces, mayoralties, intendancies and so on.

What if we had a download tool that basically used Overpass but had a lot of buttons and checkboxes for users to select what they want? The Overpass queries would get written in the backend. The user would never even need to see the queries.

For ease of download, I will self-advertise the page I started, Idea: OHM-Boundaries.
I haven’t kept updating it myself: https://wiki.openstreetmap.org/wiki/OpenHistoricalMap/Countries
If someone needs programming ideas, OHM OSM-Boundaries is an obvious one!

I definitely agree that we should serve common use cases like this without forcing users to manipulate raw queries. (It might be beneficial to expose them to querying tools, so they can go further.)

International and imperial boundary relations involve so much data that we might want to offer a prepackaged download on cloud object storage. Downloading the data dynamically adds extra wasteful strain to our servers. If we provide any link that downloads anything automatically, inevitably an LLM scraper will find the download link and monopolize our resources for no good reason.

We should also consider what people do with this data once they have it. GeoJSON on its own is useful for embedding in a MapLibre map, but no one is going to do that with a GeoJSON file as large as the full British Empire chronology. Our boundary vector tileset is much more suitable for that use case. Another likely target environment would be QGIS. We could ask the QuickOSM plugin developers to build in shortcuts for the most common Overpass queries for OHM. The plugin could make it easier to filter by date using a GUI instead of figuring out the ISO 8601 sort order.
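
To show why the sort order is something users shouldn't have to figure out themselves: plain string sorting of ISO 8601-style dates breaks as soon as years before the common era are involved. Here is a small sketch of a sort key, assuming date values shaped like `YYYY`, `YYYY-MM`, or `YYYY-MM-DD` with an optional leading minus sign. The helper name is made up, and missing months and days are treated as 1.

```python
import re

# Year with optional minus sign, then optional two-digit month and day.
_DATE = re.compile(r"^(-?\d+)(?:-(\d{2}))?(?:-(\d{2}))?$")


def date_sort_key(value: str) -> tuple:
    """Return a (year, month, day) tuple that sorts chronologically."""
    match = _DATE.match(value)
    if match is None:
        raise ValueError(f"unsupported date format: {value!r}")
    year = int(match.group(1))
    month = int(match.group(2) or 1)  # missing month sorts as January
    day = int(match.group(3) or 1)    # missing day sorts as the 1st
    return (year, month, day)


dates = ["1964-04-22", "-0500", "0800-12-25", "-0100-03"]
chronological = sorted(dates, key=date_sort_key)
naive = sorted(dates)  # puts "-0100-03" before "-0500", which is wrong
```

A GUI date picker in the plugin could hide this kind of handling entirely.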

All your Wikidata talk gave me the idea of going Wikidata first. By that I mean first adding inception dates of settlements, their sources, etc. in Wikidata, and then querying that data and importing it into OHM. Here is a modest version of my idea, though I have grander ideas yet: Using Wikidata to assist in mapping settlements
