Does end_date=2024-05-01 indicate that May 1st was the feature’s last full day in existence (“full”), or does it indicate that the feature went out of existence at some point during May 1st (“partial”)? Does end_date=2024 mean the feature could have already gone away this year (partial), or that it definitely won’t go away until sometime next year (full)? The end_date documentation is silent on the matter.
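To make the difference concrete, here is a minimal sketch of what each reading guarantees. The function names are my own invention for illustration, not part of any OpenHistoricalMap tool, and the sketch only handles plain YYYY, YYYY-MM, and YYYY-MM-DD values.

```python
import calendar
from datetime import date, timedelta


def first_day_covered(value: str) -> date:
    """First calendar day spanned by a YYYY, YYYY-MM, or YYYY-MM-DD value."""
    parts = [int(p) for p in value.split("-")]
    year, month, day = (parts + [1, 1])[:3]  # pad missing month/day with 1
    return date(year, month, day)


def last_day_covered(value: str) -> date:
    """Last calendar day spanned by a YYYY, YYYY-MM, or YYYY-MM-DD value."""
    parts = [int(p) for p in value.split("-")]
    if len(parts) == 1:
        return date(parts[0], 12, 31)
    if len(parts) == 2:
        days_in_month = calendar.monthrange(parts[0], parts[1])[1]
        return date(parts[0], parts[1], days_in_month)
    return date(*parts)


def certain_through(end_date: str, interpretation: str) -> date:
    """Last day on which the feature is certain to have existed all day long."""
    if interpretation == "full":
        # The whole tagged period counts as time in existence.
        return last_day_covered(end_date)
    if interpretation == "partial":
        # The feature may have vanished at any moment within the tagged period.
        return first_day_covered(end_date) - timedelta(days=1)
    raise ValueError(f"unknown interpretation: {interpretation!r}")


print(certain_through("2024-05-01", "full"))     # 2024-05-01
print(certain_through("2024-05-01", "partial"))  # 2024-04-30
print(certain_through("2024", "full"))           # 2024-12-31
print(certain_through("2024", "partial"))        # 2023-12-31
```

The less precise the tag, the wider the gap between the two readings: one day for a full date, but an entire year for a year-only value.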
I’ve encountered many examples of both interpretations. Two of the largest boundary imports to date have set full end_date tags: the city limits of San José, California, and the Newberry Library Atlas of Historic County Boundaries. In these imports, each member of a chronology relation has an end_date that is one day before the following member’s start_date. For example, Bullfrog County, Nevada, was abolished on May 3, 1989, according to the Newberry Library, but the boundary and its chronology relation have an end_date of May 2. Nothing in the database says May 3 explicitly, because the county was abolished without replacement.
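Within a chronology relation, the convention can usually be inferred from the gap between consecutive members. The sketch below assumes both tags are complete YYYY-MM-DD dates; the function name and the sample dates are hypothetical.

```python
from datetime import date


def boundary_convention(prev_end_date: str, next_start_date: str) -> str:
    """Guess which convention a pair of consecutive chronology members follows.

    Both arguments must be complete YYYY-MM-DD dates.
    """
    gap = (date.fromisoformat(next_start_date) - date.fromisoformat(prev_end_date)).days
    if gap == 1:
        return "full"      # end_date is the last full day before the successor starts
    if gap == 0:
        return "partial"   # end_date is the day of the changeover itself
    return "unclear"       # a real gap, an overlap, or a tagging mistake


# Hypothetical member dates, for illustration only.
print(boundary_convention("1900-01-31", "1900-02-01"))  # full
print(boundary_convention("1900-02-01", "1900-02-01"))  # partial
```

This only works where there is a successor to compare against. A feature like Bullfrog County, abolished without replacement, gives the check nothing to work with.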
On the other hand, many individual features set partial end_date tags, particularly when they aren’t part of chronology relations. For example, the Bermuda Railway ceased operation at some point on May 1, 1931. The end_date value therefore matches the date given in the source. I think most laypeople would come up with this interpretation when mapping or using the map. It’s also consistent with how start_date is always interpreted partially, not as the first full day in existence.
In the time zone boundary project that I recently completed, most of the changes took place at a specific time of day (2 o’clock in the morning local time). Since start_date and end_date don’t accept times of day, I set start_date:edtf and end_date:edtf to a matching value that includes the time. This is the partial interpretation. I didn’t consider the full approach at the time, but I think it would’ve resulted in some weird tagging. The end_date:edtf of one boundary relation would have to come some fraction of a second before the next boundary relation’s start_date:edtf. Both tags would fall on the same day; however, the former relation’s end_date would specify the day that comes before the same relation’s end_date:edtf. In some cases, a boundary lasted merely an hour, so end_date would have to fall one day before start_date – both on a single element.
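The one-hour case is easy to reproduce. The following sketch derives end_date from the moment a boundary ceased to exist under each interpretation; the timestamps are made up for illustration and don’t correspond to any particular time zone change.

```python
from datetime import date, datetime, timedelta


def partial_end_date(ended_at: datetime) -> date:
    """Partial reading: end_date is the day on which the feature ceased to exist."""
    return ended_at.date()


def full_end_date(ended_at: datetime) -> date:
    """Full reading: end_date is the last day the feature existed from start to finish."""
    return ended_at.date() - timedelta(days=1)


# Hypothetical boundary that lasted one hour.
started_at = datetime(2000, 4, 2, 2, 0)
ended_at = datetime(2000, 4, 2, 3, 0)

start_date = started_at.date()
print(start_date)                    # 2000-04-02
print(partial_end_date(ended_at))    # 2000-04-02
print(full_end_date(ended_at))       # 2000-04-01, a day before start_date
```

Under the full reading, any feature that begins and ends on the same calendar day ends up with an end_date earlier than its own start_date.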
If we prefer the full date interpretation, then much of the software stack will need to be modified. In iD, if you set the date filter to May 1, 2024, it includes features tagged end_date=2024-05-01 or 2024-05 or 2024 but excludes features tagged end_date=2024-05-02 or 2024-06 or 2025. The validator also allows two features to overlap spatially if they have matching end_date and start_date values. For its part, the Leaflet time slider plugin interprets every start_date or end_date value as falling at the stroke of noon on that day. Regardless, if you set the time slider to May 1, 2024, it shows features with start_date or end_date set to 2024-05-01 or 2024-05 or 2024.
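The time slider behavior described above amounts to asking whether a tag value, at whatever precision it was written, covers the selected day. Here is a rough sketch of that check. It is not the plugin’s actual code, and the function name is my own.

```python
from datetime import date


def value_covers_day(tag_value: str, day: date) -> bool:
    """True if a YYYY, YYYY-MM, or YYYY-MM-DD tag value spans the given day."""
    if len(tag_value) not in (4, 7, 10):
        return False  # not one of the three supported precisions
    # A less precise value is a prefix of every full date it spans.
    return day.isoformat().startswith(tag_value)


slider = date(2024, 5, 1)
for value in ["2024-05-01", "2024-05", "2024", "2024-05-02", "2024-06", "2025"]:
    print(value, value_covers_day(value, slider))
```

Only the first three values cover the slider date. Note that a check like this treats the tagged period as a whole, so it can’t tell the full and partial readings apart on the boundary day itself.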
If we prefer the full date interpretation, at least 967 chronology relation members and 84 chronology relations would need to be mass-retagged for consistency, according to QLever. If we prefer the partial date interpretation, at least 13,451 chronology relation members and 830 chronology relations would need to be retagged, the vast majority of them from the San José and Newberry imports. Either way, we would need to manually review every element that isn’t part of a chronology relation, because there’s no way to know what the mapper intended just by looking at the tags.