Clarify the use of start_date:edtf

This seems to be a pretty important point for this project, and to me it looks like it’s overlooked.

The current situation leads to mistakes because it's not clear how approximations should be tagged.

The current documentation on Key:start_date could use an update. I would like to open a thread in the OSM forum afterwards as well, but I think it's essential to first clarify how approximations are handled in start_date:edtf.

We could, and probably should, also settle on a style guide like Date (mentioned in Support for century-format dates - #2 by Minh_Nguyen).

The examples could be greatly expanded and formatted into a clearer table for easier understanding. Below is an idea of what this could look like:

| What to describe | How to tag | Example picture / comment |
| --- | --- | --- |
| 1700 | start_date=1970 start_date:edtf=1970 | |
| Sunday, 14.07.1988 | start_date=1960-07-14 start_date:edtf=1960-07-14 | |
| Sunday May 1, 1960 1 P. M. | start_date=1960-05-01 start_date:edtf=1960-05-01T13:00 | |
| Circa 1849 | start_date=1849 start_date:edtf=1849~ | |
| Between 1906 and 1908 | start_date=1908 start_date:edtf=1906/1908 | It should be decided whether ranges are written 1906/1908 or [1906..1908]. I would pick the / separator. |
| In the 18th Century | start_date=1700 start_date:edtf=17XX | |
| Mid 18th Century | start_date=1750 start_date:edtf=1750~ | |
| Probably in the 18th Century | start_date=1700 start_date:edtf=1700~ | |
| 1890s | start_date=1895 start_date:edtf=189X | |
| In the fall of 1814 | start_date=1814 start_date:edtf=1814-23 | Why is this -23? Is it a week number? How can it be distinguished from YYYY-MM like 1814-12? How do we know whether it is a month or a week? |
| Thought to have been built in 1804 | start_date=1804 start_date:edtf=1804? | |
| Before 1800 | start_date=1800 start_date:edtf=/1800 | Same as above: it should be decided whether this is /1800 or [..1800]. In my opinion, /1800. |
| Before a month (first mentioned in June 1800) | start_date=1800-06 start_date:edtf=/1800-06 | |
| After 1800 | start_date=1800 start_date:edtf=1800/ | |

Open questions

  1. Should the separator for ranges be standardized? / vs [..]
  2. How to handle ambiguities like 1814-12 (week notation) versus 1814-12 (month notation)?
  3. Can the approximations be removed from start_date and replaced with a link to start_date:edtf?

Why does this matter?

  1. Easier for mappers: Consistent guidelines remove errors
  2. Easier for developers: ohm.org and the josm-plugin and others can implement these standards more effectively.

After this it might be possible to revise the start_date wiki page to add a note like

If approximations are needed, please use start_date:edtf=*

This would benefit both OSM and OHM in the long term.

Looking forward to your thoughts and suggestions!

Best regards

I agree that a style guide and summary would be desirable. For now, I’ve been consulting the Library of Congress specification regularly, but not everyone speaks English well enough to make sense of it. (And it’s poorly written in some places, and the most authoritative document is locked behind an expensive ISO paywall.) Moreover, some languages might have their own approximation phrases with different semantics than in English. We can cover these idiosyncrasies in translations of the documentation.

Some mappers feel strongly about using the set notation, which is more semantically correct and allows for gaps (like “from X to Y or from A to B”), but unfortunately it doesn’t allow for a mix of precisions or uncertainty, which is allowed in the interval syntax. So in practice, software will have to cope with a mix of notations.
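To make "coping with a mix of notations" a bit more concrete, here is a minimal Python sketch that reads year-only ranges written in either style from the examples above (1906/1908, /1800, 1800/, [1906..1908], [..1800]). The function name `year_bounds` is made up for illustration, and the sketch deliberately ignores finer precisions, the ~ and ? qualifiers, and the semantic difference between "an interval" and "one date out of a set". It is not a full EDTF parser.

```python
import re

# Interval style: optional year (or "..") on each side of a slash.
INTERVAL = re.compile(r"(\d{4}|\.\.)?/(\d{4}|\.\.)?")
# Set style, restricted here to a single year range inside brackets.
YEAR_SET = re.compile(r"\[(\d{4})?\.\.(\d{4})?\]")


def year_bounds(value):
    """Return (first_year, last_year) for a year-only range, or None.

    A missing or ".." bound is returned as None. Assumption: unknown and
    open bounds are treated alike, which EDTF itself distinguishes.
    """
    m = INTERVAL.fullmatch(value) or YEAR_SET.fullmatch(value)
    if m is None:
        return None

    def to_year(part):
        return int(part) if part and part != ".." else None

    start, end = m.groups()
    return to_year(start), to_year(end)
```

With this, `year_bounds("1906/1908")` and `year_bounds("[1906..1908]")` give the same bounds, so downstream code can stay agnostic about which notation a mapper preferred.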

The EDTF standard specifies 21 through 41 for seasons, seasons with the hemisphere specified, quarters, trimesters, and semesters.
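As a rough illustration of why 1814-23 is not a month, the following Python sketch classifies the two-digit suffix of a YYYY-NN value. The code ranges follow my reading of the Library of Congress specification, which names the 37 through 39 group "quadrimesters" (four-month thirds of the year). The names `SEASONS`, `OTHER_GROUPS` and `classify_suffix` are invented for this example.

```python
import re

# Codes 21-24: seasons without a hemisphere.
SEASONS = {21: "spring", 22: "summer", 23: "autumn", 24: "winter"}

# Codes 25-41, grouped by kind rather than listed one by one.
OTHER_GROUPS = [
    (range(25, 29), "season, Northern Hemisphere"),
    (range(29, 33), "season, Southern Hemisphere"),
    (range(33, 37), "quarter"),
    (range(37, 40), "quadrimester"),
    (range(40, 42), "semester"),
]


def classify_suffix(value):
    """Classify the NN part of a YYYY-NN string, or return None."""
    m = re.fullmatch(r"\d{4}-(\d{2})", value)
    if m is None:
        return None
    code = int(m.group(1))
    if 1 <= code <= 12:
        return "month"
    if code in SEASONS:
        return SEASONS[code]
    for codes, label in OTHER_GROUPS:
        if code in codes:
            return label
    return None  # 00, 13-20 and 42+ have no meaning here
```

Because months stop at 12 and the sub-year codes start at 21, the two ranges never overlap, so `classify_suffix("1814-12")` is always a month and `classify_suffix("1814-23")` is always autumn.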

I tend to use the location-independent seasons instead of specifying the hemisphere, just as I often omit the time zone when specifying the time. But this is sort of like omitting the country calling code or area code in a phone=* tag, so we might choose to view it as an error in the future.

The week syntax in ISO 8601 is 1814-W12, which is unambiguous. I don’t think EDTF introduces its own week syntax, but maybe I’ve missed something? I see that the EDTF specification also allows 1814-12-XX to express month precision, but I think 1814-12 is more intuitive for us because that’s what we allow in start_date=* too.
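A small Python sketch of how software could tell these forms apart: it treats 1814-12 and 1814-12-XX as the same month and recognizes the ISO 8601 week form 1814-W12 separately. Whether a week value should be accepted in start_date:edtf at all is still the open question here, so take the week branch as an assumption. The function name `describe` is hypothetical.

```python
import re

# Month precision, written either as YYYY-MM or as YYYY-MM-XX.
MONTH = re.compile(r"(\d{4})-(0[1-9]|1[0-2])(?:-XX)?")
# ISO 8601 week form: the literal "W" keeps it apart from months.
ISO_WEEK = re.compile(r"(\d{4})-W(0[1-9]|[1-4]\d|5[0-3])")


def describe(value):
    """Return (kind, year, number) for a month or ISO week, else None."""
    m = MONTH.fullmatch(value)
    if m:
        return ("month", int(m.group(1)), int(m.group(2)))
    m = ISO_WEEK.fullmatch(value)
    if m:
        return ("week", int(m.group(1)), int(m.group(2)))
    return None
```

Both `describe("1814-12")` and `describe("1814-12-XX")` come back as December 1814, while `describe("1814-W12")` is the twelfth week, so there is no ambiguity as long as the W is required.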

You’re probably looking at OSM’s documentation on start_date=*. We have our own documentation. The link is kind of easy to overlook. In the past, when I attempted to call attention to OHM’s EDTF approach, I got mistakenly called out (et seq.) on the OSM mailing list for unilaterally redefining a longstanding OSM key. So it’s a bit sensitive.

Actually removing OSM’s approximation format would probably be a bridge too far for now, but I like the idea in the long term. Not everyone in OSM is happy with the currently documented approximation format. Once our practices around EDTF have solidified, we could start a discussion on the OSM forum about aligning their practices with ours.

Mid 18th century is 1750~ ? That’s a bit peculiar to me.

Mid 18th century I would expect to be around 1730-1770.
1750~ I would expect to be between 1748-1752 or so. That’s quite a difference.

That might’ve been an oversight on my part as I wrote up the examples. Institutions such as the University of Houston have published style guides that agree with your suggestion, so I corrected the example.

Two suggestions to make the project a little more complex:

start_date:legal=1872-01-01

For example, the Duchy of Carinthia (Herzogtum Kärnten) introduced a network of 31 state highways on January 1st, 1872. Real road construction did not happen on that single day; it began long before this date and was finished much later. However, the legal creation of state highways (by a state legislature or administration) is an easy and important source for historical maps.

start_date:toll=1840-01-01

Historically, many highways were financed by tolls or fees, so the first day of toll collection is usually considered the opening date.

This reminds me of the different notions of when a building comes into being. Many sources don’t distinguish when the building was approved, started construction, topped out (reached its full height), was completed, or began to be occupied years later. Or a restaurant typically has a soft opening followed by a grand opening, but later on no one ever talks about the soft opening.

If we know more, then we can do things like distinguishing the building under construction as a separate feature, but otherwise, we end up mapping the dates as given rather uncritically. I like the idea of being more explicit about the reason we chose to tag a particular date. I’d still suggest that we also tag the usual start_date=* and start_date:edtf=* alongside that, so renderers and other software don’t have to juggle all the possible subkeys. It’s sort of like how we still tag name=* even when we can tag language-specific subkeys.
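From a data consumer's point of view, that could look roughly like the Python sketch below: the software only ever looks at the two general keys and never needs to know about subkeys such as start_date:legal or start_date:toll. The preference order (EDTF first, then plain start_date) and the function name `pick_start_date` are my own assumptions for illustration, not an agreed convention.

```python
def pick_start_date(tags):
    """Return (key, value) for the general start date of a feature.

    Assumption: start_date:edtf wins over start_date when both exist.
    Reason-specific subkeys are ignored on purpose.
    """
    for key in ("start_date:edtf", "start_date"):
        value = tags.get(key)
        if value:
            return key, value
    return None


# Hypothetical feature carrying a reason-specific subkey as well.
highway = {
    "start_date": "1872",
    "start_date:edtf": "1872-01-01",
    "start_date:legal": "1872-01-01",
}
print(pick_start_date(highway))
```

A feature tagged only with start_date:legal would return None here, which is the argument for always adding the usual two keys alongside it.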