Please don't forget to use the source=* tag (Case study!)

Ok - you’ve made me realize that some (maybe just me?) OHM mappers have been iterating through trial and error toward practices that make sense to them, without validating them or checking in with the community.

So, before updating the clearly out-of-date wiki page for these tags, how about we have a discussion here?

Taginfo for source=* shows a lot of variation that could be standardized.

Here’s my proposal, with plenty of room for improvement - let’s hash it out. : )

  1. The main thing is to source your data. Any source tagging, regardless of format, is better than no source tagging. Ideally, all of our source tagging would be standardized, but there are many forms of citing sources in the non-OHM data world.

  2. The sourcing should travel with the object. So, each object needs its own source information. This may seem onerous, but a commitment to this principle helps ensure that all downloaded data, regardless of filtering, is either properly attributed or could be attributed.

  3. Within a key, each suffix describes the part before it, and not vice versa. E.g., start_date:source=* is the source for the start_date, and source:3:date=* is the date (YYYY-MM-DD, of course) of source:3.

  4. More sources are better than fewer sources. To accommodate multiple sources, source tags can (should?) be enumerated. I believe there’s a hand-wavy rule in academia about getting 3 different primary sources for validation, but I’m not sure where that comes from or if maps always count as a primary source. Regardless, it’s helpful for tracking where object characteristics are sourced. So, the base key is something like source:[#]=*. Where there’s no number, that’s fine, as later users can add numbered sources without overwriting those keys.

  5. Linking to sources is the most important way for online mappers to validate another mapper’s sourcing. So, the base source:[#]=* should point to a URL. This will speed up others’ ability to answer the questions “why is that mapped that way?” and “which version of the source did the mapper use?” It will also help clarify who is hosting that resource in the case of identical sources from different libraries, which may be in different conditions. Whether these should be Internet Archive links, I don’t know. I’d prefer not, just for readability, but that may require users to click a subsequent link / visit IA if the primary link is dead.

  6. Human readable names are helpful. So, source:[#]:name=* should be text, ideally either the name of the source as labeled by the source or a name that sufficiently identifies the source and differentiates it from other similar maps. E.g., “1869 SF map” isn’t enough to tell whether that is the 1869 US Coast Survey SF Peninsula map or the 1869 Britton and Rey City & County of SF map or an altogether different 1869 map of San Francisco.

  7. Source attribution needs a better practice. I’m not sure what the right answer is here, but it would be best if we had some sort of way to track the exact attribution a source requests. That said, it might lead to an unnecessary amount of baggage on every object (yes, beyond what we’re already adding). So maybe a URL link to the attribution statement? It would be amazing to be able to auto-generate a list of sources for any bbox as well as for the entire database with some degree of accuracy that doesn’t depend on manual efforts. Maybe source:[#]:attribution=[url]?

  8. Georectification can create different results with the exact same source. Including a link to warped tiles or another warped source (e.g. IIIF) can help explain apparent discrepancies with an identified source. This link can also be used to enable other mappers to pick up where someone has left off and to start tracing items the original mapper may have left out. Hence, source:[#]:tiles=[tms x/y/z url]. Or, similarly, source:[#]:wmts=[wmts url].

  9. Other subkeys may be valuable and could be used, but might be less important. Examples include source:[#]:license=*, source:[#]:date=*, etc. I’m not sure if source:[#]:license=* is redundant to just license=*, so more thought is needed here.

  10. Source metadata should be included where relevant. Examples include information about markings on a map source, like a map key, e.g. source:[3]:ref=a) Ye Olde Malt Shoppe. Ideally, this is metadata about the source that might be informative to an OHM mapper, but that couldn’t normally be described by the object’s geometry. Another example might be source:[3]:id=[unique source id], to help directly tie the object to its source. This could be useful when importing datasets, especially if we wanted to subsequently report OHM-based modifications to the object to the data provider.

  11. Source metadata that’s not historically relevant or helpful should be discarded. Examples include things like “last modified” or “SHAPELEN” or “SHAPEAREA” or “created…”.
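
To make principles 3 and 4 a bit more concrete, here is a minimal Python sketch of how a tool might group enumerated source tags by their number. It assumes the source:[#]:subkey layout proposed above, which is not an established convention. The function name, the "url" label for the base key, and all example.org URLs are made up for illustration.

```python
import re
from collections import defaultdict

# Matches "source", "source:3", "source:name" and "source:3:name".
# This key layout follows the proposal in this post; it is an assumption,
# not an agreed OHM standard.
SOURCE_KEY = re.compile(r"^source(?::(\d+))?(?::([a-z_]+))?$")


def group_sources(tags):
    """Group source tags by their number; unnumbered keys land under None."""
    grouped = defaultdict(dict)
    for key, value in tags.items():
        match = SOURCE_KEY.match(key)
        if match is None:
            continue  # not a source key, e.g. name=* or start_date:source=*
        number, subkey = match.groups()
        index = int(number) if number is not None else None
        # The base key (no subkey) is stored under the hypothetical label "url".
        grouped[index][subkey or "url"] = value
    return dict(grouped)


# Hypothetical tags on one object; the URLs are placeholders.
tags = {
    "name": "Ye Olde Malt Shoppe",
    "start_date": "1869",
    "start_date:source": "https://example.org/directory-1869",
    "source": "https://example.org/maps/coast-survey-1869",
    "source:name": "1869 US Coast Survey SF Peninsula map",
    "source:2": "https://example.org/maps/britton-rey-1869",
    "source:2:name": "1869 Britton and Rey City & County of SF map",
    "source:2:tiles": "https://example.org/tiles/{z}/{x}/{y}.png",
}

sources = group_sources(tags)
for index, info in sources.items():
    print(index, info.get("name"), info.get("url"))
```

Note that the unnumbered source=* and the later source:2=* end up as separate entries, so a second mapper can add a numbered source without touching the first one. Keys like start_date:source=* are skipped here because they describe a different key.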

To summarize:

Encouraged:

tag                         value
source:[#]=*                [url]
source:[#]:name=*           [text]
source:[#]:tiles=*          [tms x/y/z url]
source:[#]:attribution=*    [url or text, esp. for datasets]
source:[#]:id=*             [id text, esp. for datasets]
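
The value types listed above could be checked loosely by a script. The following Python sketch is only an illustration under the assumptions of this proposal: the function names are made up, and the notes it returns are meant as gentle hints, since any source tagging is still better than none.

```python
from urllib.parse import urlparse


def looks_like_url(value):
    """True for http(s) values with a host; TMS templates with {z}/{x}/{y} pass too."""
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def check_source_tags(tags):
    """Return a list of notes; an empty list means nothing to flag."""
    notes = []
    source_keys = [k for k in tags if k == "source" or k.startswith("source:")]
    if not source_keys:
        notes.append("no source tags at all")
    for key in source_keys:
        last = key.split(":")[-1]
        # Base keys are "source" or "source:<number>"; per the proposal they
        # should hold a URL, as should the tiles subkey.
        is_base = last == "source" or last.isdigit()
        if (is_base or last == "tiles") and not looks_like_url(tags[key]):
            notes.append(f"{key} is not a URL")
    return notes


# Hypothetical examples; the URLs are placeholders.
good = {
    "source": "https://example.org/maps/coast-survey-1869",
    "source:name": "1869 US Coast Survey SF Peninsula map",
    "source:tiles": "https://example.org/tiles/{z}/{x}/{y}.png",
}
vague = {"source:2": "1869 SF map"}

print(check_source_tags(good))
print(check_source_tags(vague))
```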

Optional:
Whatever makes sense to the mapper.
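
On the idea in point 7 of auto-generating a list of sources: if every object carries its own source tags, a rough version is not much code. The Python sketch below assumes the objects have already been selected (by bbox or otherwise) and are available as plain tag dictionaries. The function name, the shape of the result, and the example.org URLs are all hypothetical.

```python
import re

# Same assumed key layout as in the proposal: source, source:3, source:3:name.
SOURCE_KEY = re.compile(r"^source(?::(\d+))?(?::([a-z_]+))?$")


def list_sources(objects):
    """Return one entry per distinct source URL found across the objects."""
    found = {}
    for tags in objects:
        per_number = {}
        for key, value in tags.items():
            match = SOURCE_KEY.match(key)
            if match:
                number, subkey = match.groups()
                per_number.setdefault(number, {})[subkey or "url"] = value
        for info in per_number.values():
            url = info.get("url")
            if url is None:
                continue  # a name or attribution without a base source tag
            entry = found.setdefault(
                url, {"name": None, "attribution": None, "objects": 0}
            )
            # Keep the first name/attribution seen for this URL.
            entry["name"] = entry["name"] or info.get("name")
            entry["attribution"] = entry["attribution"] or info.get("attribution")
            entry["objects"] += 1
    return found


# Two hypothetical objects that share one source.
objects = [
    {
        "source": "https://example.org/maps/coast-survey-1869",
        "source:name": "1869 US Coast Survey SF Peninsula map",
    },
    {
        "source:1": "https://example.org/maps/coast-survey-1869",
        "source:2": "https://example.org/maps/britton-rey-1869",
        "source:2:attribution": "https://example.org/credits",
    },
]

report = list_sources(objects)
for url, entry in report.items():
    print(url, entry["name"], entry["attribution"], entry["objects"])
```

This only works as well as the tagging does, which is part of the argument for keeping the base source:[#]=* value a URL: it gives a script something stable to de-duplicate on.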

I’m sure there are plenty of concepts, goals, or practices I’ve missed or misstated, so please help get us to a better place. : ) (and now, I’m off to go ‘fix’ a few things…)
