Updating the "Sources" field to edit source:*= tags

Pedro_Leao · July 11, 2024, 8:00pm

Hello!
Right now, a lot of users use the source:*= tags, like source:name or source:date. The only way to edit these tags is the raw tag editor, which works, but these tags could have their own fields in the UI.

So I’m proposing a new design for the Sources field that includes a way to edit both the source= tag and the source:*= tags:

The idea is for the field to have one row per tag. The main row works the same way as the existing field and maps to the source= tag. Each of the other rows maps to one of the source:*= tags: the label on the left side identifies the tag, and the text input on the right side edits its value. This design is similar to the list view of the tag editor, but the user does not have to type the key itself, “source:url” for example.
There is a plus + button that can be clicked to add a new row, mapped to a new tag. When the feature is created, the field looks like this:

It does not have any extra rows and the plus button can be clicked to add one.
When there are extra rows, the exact tag that the user wants to edit can be selected with a combobox:

It only shows predefined tags that are supported, which are currently source:name, source:url and source:date, since they are the most used ones on taginfo. I thought about allowing users to insert whatever source:*= tag they wanted, but that would give the field the same functionality as the tag editor, in which case it would be simpler to just use the tag editor itself.
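
As a rough illustration of the row-to-tag mapping described above, here is a minimal Python sketch. It is not iD's implementation; `SUPPORTED_SUBKEYS` and `rows_to_tags` are hypothetical names, and the choice to silently skip empty rows is an assumption.

```python
# Hypothetical sketch of the proposed field; names are illustrative, not from iD.
SUPPORTED_SUBKEYS = ("name", "url", "date")


def rows_to_tags(main_value, rows):
    """Convert the field's rows into tags.

    main_value: text of the main row, which maps to source=*.
    rows: list of (subkey, value) pairs chosen through the combobox.
    """
    tags = {}
    if main_value.strip():
        tags["source"] = main_value.strip()
    for subkey, value in rows:
        # Only the predefined subkeys are offered; anything else is rejected.
        if subkey not in SUPPORTED_SUBKEYS:
            raise ValueError(f"unsupported subkey: {subkey}")
        # Assumption: an empty row does not create a tag.
        if value.strip():
            tags[f"source:{subkey}"] = value.strip()
    return tags
```

For example, `rows_to_tags("survey", [("url", "https://example.org")])` would produce a `source` tag and a `source:url` tag, without the user ever typing the key.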

I’m looking for feedback!
Do you think this change makes sense or is the Sources field functional enough the way it is currently?
Is the design understandable? What could change?
Should there be more predetermined tags? Which ones?

Pedro_Leao · July 11, 2024, 8:25pm

Also, according to taginfo, a few features are tagged with source:1:name and source:2:name tags, indicating that two different sources were cited for that feature.
The update to the Sources field could also include UI to create a secondary Sources field that could map to source:2, for example, and allow users to add multiple unrelated sources.
The question is, would that be necessary? The update proposed here provides UI for the user to expand on the source they have for the feature, and the recent update described here implements UI that allows the user to add a source to any other field that the feature has. All of these source-related input fields could be too much, and adding the option to create another field that maps to source:2 could be overwhelming.
What are your thoughts on this?
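
To make the numbered-source idea concrete, here is a small Python sketch of how tags such as source:1:name and source:2:name could be grouped into separate citations. The function name `group_sources` and the decision to file unnumbered tags under `None` are assumptions for illustration only.

```python
import re

# Matches keys like "source:2" or "source:2:name".
NUMBERED = re.compile(r"^source:(\d+)(?::(.+))?$")


def group_sources(tags):
    """Group source tags by citation number.

    Unnumbered tags (source, source:name, ...) are filed under None.
    The bare value (source=* or source:N=*) is stored under the "" subkey.
    """
    groups = {}
    for key, value in tags.items():
        if key != "source" and not key.startswith("source:"):
            continue  # not a source tag at all
        match = NUMBERED.match(key)
        if match:
            index = int(match.group(1))
            subkey = match.group(2) or ""
        elif key == "source":
            index, subkey = None, ""
        else:
            index, subkey = None, key[len("source:"):]
        groups.setdefault(index, {})[subkey] = value
    return groups
```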

jeffmeyer · July 11, 2024, 10:24pm

I’m a frequent user of the multiple sources tags & I’d suggest they are (or should be) used as separate, but all related, sources. For example, if an old building in a similar location shows up on three separate maps, or is described in three separate narratives, wouldn’t we want to know those references?

Oversourcing can certainly be a problem (one I’d love to have). But undersourcing is a problem as well. I believe the rule of thumb in academic research is to have at least 3 sources for any claim.

Kovoschiz · July 12, 2024, 12:41am

You can read through the last thread, “Please don't forget to use the source=* tag (Case study!)”

Minh_Nguyen · July 14, 2024, 11:21pm

I agree, we don’t need to reinvent the raw tag editor here. However, do you think the user might expect these subfields to behave like the raw tag editor, given the similar-looking button and two-column layout? If we don’t expect this field to contain an arbitrary number of subfields, maybe we could simplify it to show all the supported subfields at the same time? That would be consistent with the address and date fields, which remain compact by labeling each subfield using placeholder text.

source=* is currently much more popular than all of the subkeys combined. However, the user would see that they can add a name and a URL on other rows via the dropdown, so they may wonder what the topmost, full-width row is supposed to contain. Actually, I’m not sure I know the answer to that. If we formalize the use of source:name=* and source:url=*, then source=* seems like it would have no purpose in most cases. On the other hand, if source=* is supposed to contain a URL, as suggested when citing a tile server, then we shouldn’t offer a redundant subfield for source:url=*. But in that case, should the user omit source=* when citing an offline source or one that’s only accessible online via a subscription database?

Sometimes it is useful to cite multiple sources for the same fact, but I think a three-citation minimum for every fact may be an oversimplification. Maybe it depends on the field of study. In the archaeological and historical sources I’ve consulted, I tend to see only one citation per fact, sometimes more if the statement is a synthesis or an extraordinary claim. For example, the Newberry Library’s county boundary dataset typically contains only one citation for the whole feature, occasionally two.

If we ask for too much detail at once, we may risk discouraging mappers from citing their sources. When preparing an import or other bulk edit, you can take your time to perfect the one citation you plan to repeat across the entire changeset, but when you’re mapping a wider variety of features, citing a source needs to be something you can do quickly and then move on. As far as I can tell, so far, the vast majority of occurrences of source:# and source:name come from imports or other bulk edits uploaded through JOSM, not more bespoke edits in iD.

I expect that the improvements @Pedro_Leao is making will promote these subkeys to some extent, but I’m not very confident that we’ll get to 100% coverage without a more organized effort. On Wikipedia, if you add a bare URL as a citation, eventually a bot will come along and automatically fill in a title based on the page title, maybe some other details. Then another bot will make sure the Wayback Machine has archived the URL and link to the archive as a backup. If we can run a bot like this, then mappers don’t need to worry as much about getting all the details right on the first try. Even so, we’ll want iD to support these subkeys so that the filled-in details can be visible to the user.

Pedro_Leao · July 15, 2024, 10:27am

I like the idea of showing all of the supported subkeys at once! It would probably encourage users to fill them in because it would show new users that they exist. A way of doing it could be this:

What do you think?
About the general source=* key being used less often: That could happen, but I’m not sure it would be a bad thing. Using source:url=* instead of source=* for a website’s URL feels more structured, because the subkey itself only expects URLs. If we wanted to filter all URLs for a group of features, for instance, source:url=* could be used.
I’m thinking of leaving source=* available and labeled as a “General source” for people who still want to use it.
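
As a sketch of the filtering use case mentioned above, the snippet below collects every source:url=* value from a group of features. It assumes each feature is represented simply as a dict of tags; `collect_source_urls` is a hypothetical helper, not part of any existing tool.

```python
def collect_source_urls(features):
    """Return the distinct, sorted source:url values from a list of tag dicts."""
    return sorted({tags["source:url"] for tags in features if tags.get("source:url")})
```

With a dedicated subkey, no guessing is needed about whether a given source=* value happens to be a URL.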

jeffmeyer · July 15, 2024, 1:54pm

Ah, yes… certainly. If there’s a single, definitive source, such as a treaty document defining a boundary, that’s enough. I believe that’s the type of reference used for the Newberry set.

I believe the rule of three exists where there is uncertainty and room for alternative explanations.

My main point for supporting multiple sources for a single assertion is that we should be able to support it when it is needed or could be helpful, not that it should be used in all cases.

Minh_Nguyen · July 18, 2024, 11:37pm

This could dovetail with an idea I floated to use source=* for a project-wide identifier of some sort that could point back to the wiki for more details. We definitely don’t have to support that right away, but if we can steer people towards the subkeys for things that definitely aren’t these identifiers, then eventually we’ll be able to optimize around the identifiers with more confidence.

If folks agree with this approach, once the dust clears, I could do a mass edit in the database to move things that look like URLs to source:url=*, and someone could fix ohm-inspector to look at that key.
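
A per-feature version of that mass edit might look roughly like the Python sketch below. The heuristic for "looks like a URL" (an http or https scheme with a host, no spaces or semicolons) and the helper names are assumptions; a real edit would need review of edge cases such as multiple values in one tag.

```python
from urllib.parse import urlparse


def looks_like_url(value):
    """Heuristic: a single http(s) URL with a host and nothing else."""
    value = value.strip()
    # Assumption: values with spaces or semicolons are free text or lists.
    if not value or " " in value or ";" in value:
        return False
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def move_url_to_subkey(tags):
    """Return a copy of tags with a URL-like source=* moved to source:url=*."""
    new_tags = dict(tags)
    source = new_tags.get("source", "")
    # Leave the feature alone if source:url is already set, to avoid overwriting.
    if looks_like_url(source) and "source:url" not in new_tags:
        new_tags["source:url"] = new_tags.pop("source").strip()
    return new_tags
```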