UTC is not enough, 24 hours is a long time / Epoch timestamps vs RFC 3339 strings
Posted by mnvrth@reddit | programming | 53 comments
aristotlesfancypants@reddit
Idk about you guys but the problem of dealing with time is so simple in my mind.
Database: store everything in UTC.
Backend: manipulate everything in UTC, convert to some timezone when needed.
Frontend: convert UTC to user's timezone.
End of story. Am I missing something?
pihkal@reddit
Depends on your use case. For a lot of computer-only purposes, that’s fine.
But be aware there are human-oriented cases where ignoring TZ fails. To cite the example I mention elsewhere in the comments, say you’ve recorded an appt for 9am in 6 months. If you recorded it without DST then it may be an hour off when you display it later under DST.
If you tried to offset it based on the TZ when initially recording the appt, you can be off if the appt is in another TZ. (Or if the country’s govt decides to abandon DST.)
FluidBreath4819@reddit
why not just record using UTC and display back using the user's time zone (which they gave in their profile or elsewhere)?
pihkal@reddit
Scrolling thru old comments, and saw I never responded.
If you only record UTC in my example above, you don't have enough information to know whether to adjust the appointment for daylight saving time (DST) or not. Just knowing the user's TZ isn't enough.
E.g.: I live in a country with DST, and make 2 appts, one during DST, one not. I made a dental appt in June for November at 9am. Last November, I also scheduled my yearly medical checkup for another day in November at 9am.
If they were both recorded in UTC, one appt would have to be adjusted after DST ended, but not the other. And UTC alone doesn't give you enough info to know which one to fix.
You can either:
(Note that #2 is a more complicated way of arriving at #1.)
UTC-only is fine as long as the only things you compare are computer logs. It's great for merging distributed cloud logs to get a global timeline of network events.
But UTC-only is effectively lossy, and that has consequences given how humans use time.
And we haven't even gotten into things like countries not syncing their DST dates, or legal changes.
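A minimal Python sketch of the two-appointment example above. It assumes a system that converts to UTC using the offset in effect when the booking was made, which is only one way such a bug can arise. The zone (America/New_York), the dates, and the helper name `naive_to_utc` are made up for illustration.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

TZ = ZoneInfo("America/New_York")  # assumed user time zone, illustration only

def naive_to_utc(wall: datetime, booked_at: datetime) -> datetime:
    """Buggy on purpose: applies the offset in effect when the booking was made."""
    offset = booked_at.astimezone(TZ).utcoffset()
    return (wall - offset).replace(tzinfo=timezone.utc)

# Dental appt: booked in June (DST in effect) for 9am in November.
dental_utc = naive_to_utc(datetime(2024, 11, 12, 9, 0),
                          datetime(2024, 6, 10, 12, 0, tzinfo=TZ))
# Checkup: booked the previous November (no DST) for 9am in November.
checkup_utc = naive_to_utc(datetime(2024, 11, 14, 9, 0),
                           datetime(2023, 11, 20, 12, 0, tzinfo=TZ))

# Displayed back in the user's zone, only one of them is wrong.
print(dental_utc.astimezone(TZ).strftime("%H:%M"))   # 08:00
print(checkup_utc.astimezone(TZ).strftime("%H:%M"))  # 09:00
```

Both rows are plain UTC instants once stored, so nothing in the data says which one needs the one-hour fix.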
CalmLake999@reddit
Exactly how most people are doing it. Not really sure what they’re trying to fix here. Also good to store user device locations for time based alerts
jayerp@reddit
This is what I do too. I don’t see what the big deal is. SQL Server has a DateTimeOffset type which supports DateTime and offset but why would I want to persist offset? I am happy with keeping just UTC and letting the client convert to user desired local time (as set by a user preference or using the device timezone).
Seems like we’re over thinking this.
aristotlesfancypants@reddit
Yep, also the backend API should require clients to send & receive timestamps in UTC only, and converting to some timezone is the client's problem.
jayerp@reddit
I’m fine telling my consumers that we send/receive in UTC. If we have an unknown date time kind, we will assume UTC. Format or don’t at your own risk.
Worth_Trust_3825@reddit
Opinion invalid.
synrgii@reddit
Or written as "Opinion = 0"
rco8786@reddit
Wanting human readable data doesn't invalidate any opinions. That's an entirely reasonable thing to build into a system.
CalmLake999@reddit
You are completely right; nothing wrong with string timestamps. This person's using an upvote and downvote bot.
Worth_Trust_3825@reddit
Machine-to-machine communication is not a reasonable place to build it into.
Dean_Roddey@reddit
Agreed. It's easy to show it to humans in human digestible form when needed, and vastly easier to manipulate and verify in binary form everywhen else.
toabear@reddit
I think all software developers should go on strike until we move to a five day week, 73 weeks a year. Nice and even. No more DST, and anyone who talks about DST gets shipped to the South Pole.
SheriffRoscoe@reddit
Now if only the year was a whole number of days long.
u0xee@reddit
In Futurama they pushed the earth's orbit slightly farther from the sun, making the year longer. This is clearly an adjustable parameter
Sability@reddit
If we total the amount of headaches caused by the earth year not being an exact whole number, it might be overall easier for us to do this
That's not even considering that years are getting shorter over time: a few million years ago years were ~400 days long
synrgii@reddit
When counting those few million years, did they also adjust the years for the longer years, thus making the number of total years less again... wait my head hurts now.
SheriffRoscoe@reddit
"Brilliant!"
aristotlesfancypants@reddit
Just get rid of DSTs. 99% of problems solved.
lamp-town-guy@reddit
Don't forget about leap second!
toabear@reddit
We need a worldwide initiative to slow the earth just a bit.
azhder@reddit
If we all jump at the same time...
Dean_Roddey@reddit
Where is Speed Tugman when you need him?
funkachunk@reddit
RFC 3339 has recently been updated by RFC 9557, which standardizes how to append a time zone ID to an RFC 3339 string, using the same format as java.time, Noda Time, and other libraries including the coming-soon Temporal API in JavaScript.
So if you need to store both a UTC timestamp as well as a time zone, use an RFC 9557 string! Examples:
yodacola@reddit
So use the IANA tz info with your timestamps with some kind of versioning? Probably would be wise to store some locale info somewhere, too.
Depends on how much you care about time for users.
matthieum@reddit
Except... RFC 3339 is only good for past events.
Imagine that you have a meeting in 3 months time at 08:00 in the morning. You could store it in RFC 3339, but... this leaves you open to the issue that should the folks in charge decide that this year they're done with Summer Time, that's it, then you'll present yourself at 09:00 (assuming Northern Hemisphere) and your boss will be furious at you for missing the meeting!
If you want to store future events, you'll want NOT the projected UTC offset, but the location instead, so that you can recompute the UTC offset on the fly with up-to-date information.
(I did not learn this the hard way, don't know what you're talking about...)
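A rough Python sketch of "store the location, recompute the offset on the fly". The record layout and the name `resolve_utc` are hypothetical; the only real dependency is the standard `zoneinfo` module, which reads whatever tz database is installed.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Hypothetical stored record: local wall time plus IANA zone ID, no offset.
meeting = {"local": "2025-07-01T08:00:00", "zone": "Europe/Paris"}

def resolve_utc(record: dict) -> datetime:
    """Compute the UTC instant using the tz rules installed right now."""
    local = datetime.fromisoformat(record["local"])
    return local.replace(tzinfo=ZoneInfo(record["zone"])).astimezone(timezone.utc)

# With the tz rules current at the time of writing, 08:00 in Paris in July is 06:00 UTC.
print(resolve_utc(meeting).isoformat())
```

If Summer Time were abolished and the tz database updated, the same record would resolve to a different UTC instant, while the stored 08:00 stays untouched.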
i8beef@reddit
Having this discussion right now trying to point out that storing in UTC loses the original intent of the user forever, that the only long term good answer is to store the original timestamp + TZ info. It's a little painful trying to nail this home to dedicated data engineers who have tunnel vision on the benefits of a UTC timestamp for their needs (event processing / ordering mostly, where they need to avoid the TZ conversion at runtime for performance reasons). We'll end up storing BOTH for two different purposes.
fuhglarix@reddit
Exactly this. When a time is concerning a local event, the TZ is critical. If someone tells you “the game starts at 15:00” you need to know in what timezone?
An even more fun example is date and time of birth. If you don’t know where someone was born, you don’t actually know what point in time they were born. And in that case, it tends not to matter anyway. The local date is what matters.
I think it was Google or someone that tried to store dates of birth converted to UTC and then they were showing up incorrectly in calendars.
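One way a bug like that can arise, sketched in Python. This is an illustration only, not a description of what any particular company did; the birthday and the viewer's zone are made up.

```python
from datetime import date, datetime, timezone
from zoneinfo import ZoneInfo

birthday = date(1990, 5, 17)

# Lossy: pretend the calendar date is an instant at midnight UTC.
as_instant = datetime(birthday.year, birthday.month, birthday.day, tzinfo=timezone.utc)
shown = as_instant.astimezone(ZoneInfo("America/Los_Angeles")).date()
print(shown)  # 1990-05-16, a day early for anyone west of UTC

# Safer: keep it as a plain calendar date with no zone at all.
print(birthday.isoformat())  # 1990-05-17
```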
matthieum@reddit
Well... then again, DST regularly makes local time ambiguous.
Specifically, once a year, it's XX:15 twice during the day. In France, for example, on the last Sunday of October DST ends at 03:00 AM, and it's 02:00 AM again. So there's twice 02:15 on that day.
At least, in UTC it wouldn't be ambiguous...
fuhglarix@reddit
Haha, that’s a great little edge case to think about. And when switching to summer time, 02:15 that day won’t exist. Even 02:00 won’t exist. The clocks tick from 01:59:59 to 03:00:00.
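Both edge cases can be reproduced with Python's standard `zoneinfo` module. A small sketch, assuming Europe/Paris and the 2024 transition dates:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

PARIS = ZoneInfo("Europe/Paris")

# Autumn: 02:15 happens twice; `fold` picks the first (0) or second (1) occurrence.
first = datetime(2024, 10, 27, 2, 15, tzinfo=PARIS, fold=0)
second = datetime(2024, 10, 27, 2, 15, tzinfo=PARIS, fold=1)
print(first.astimezone(timezone.utc).time())   # 00:15:00
print(second.astimezone(timezone.utc).time())  # 01:15:00

# Spring: 02:15 never happens. Round-tripping through UTC exposes the gap.
ghost = datetime(2024, 3, 31, 2, 15, tzinfo=PARIS)
roundtrip = ghost.astimezone(timezone.utc).astimezone(PARIS)
print(roundtrip.time())  # 03:15:00
```

Note that Python does not raise an error for the nonexistent time; it silently maps it, so a round-trip check like the one above is one way to detect it.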
I got excited when the EU voted to end daylight savings time and here we are many years later with no clear implementation timeline…
matthieum@reddit
I can't remember how many times I had to explain that to folks using an API I was maintaining:
Every year I'd have the same conversation again... I hate DST.
matthieum@reddit
I used to work on an "alarm clock" application, where client apps would schedule events (millions per day, up to 2 years ahead of time) and we'd notify them at the appointed time. Due to business constraints, events were scheduled in local time.
For our usecase, the best solution would have been to store local time immediately -- it's the master time -- and compute UTC time close to notification time only.
That is, at fixed intervals, close to the time of notification, one would query the time-offset table to get a view of all (local-zone, time-offset, next-change). Then, the UTC time of events still to be enqueued and scheduled prior to the end of the next period would be computed, and those events enqueued.
Delaying the computation of UTC time until nearly the last moment allows one to disregard updates to future offsets entirely. Notifications of changes usually occur months in advance, though sometimes only a few days. They have to give some advance notice.
If UTC has been pre-computed, then one has to locate all the records that need updating, and update them... which in our case would be millions at a time. It's painful for the database. By delaying, there's no problem: records for which it's been computed are not affected, and records for which it's not been computed are not affected either.
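A simplified Python sketch of the "local time is the master copy, resolve UTC late" idea. The record layout and the name `due_within` are hypothetical, and it uses `zoneinfo` in place of the time-offset table described above.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

# Hypothetical event records: local wall time plus zone is what gets stored.
events = [
    {"id": 1, "local": datetime(2025, 3, 30, 9, 0), "zone": "Europe/Paris"},
    {"id": 2, "local": datetime(2025, 3, 30, 9, 0), "zone": "America/Phoenix"},
    {"id": 3, "local": datetime(2026, 3, 30, 9, 0), "zone": "Europe/Paris"},
]

def due_within(events, now_utc, horizon):
    """Resolve UTC with current tz rules and keep events firing in [now, now + horizon)."""
    due = []
    for ev in events:
        local = ev["local"].replace(tzinfo=ZoneInfo(ev["zone"]))
        fire_utc = local.astimezone(timezone.utc)
        if now_utc <= fire_utc < now_utc + horizon:
            due.append((fire_utc, ev["id"]))
    return sorted(due)

now = datetime(2025, 3, 30, 0, 0, tzinfo=timezone.utc)
due = due_within(events, now, timedelta(hours=24))
print([(t.strftime("%H:%M"), i) for t, i in due])  # [('07:00', 1), ('16:00', 2)]
```

This sketch scans every record for brevity; at the scale described above one would first narrow the candidates by local time before resolving anything.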
Lvl999Noob@reddit
I think it is better to store the representation that the user provided. If they said "8am on 3rd next month" then that's different from "8am after 31 days" and that's different from "4 hours before noon on 3rd of Apr". Now the question becomes, how to store it and how to use it.
(I actually don't have any experience in this and don't really know what I am talking about)
matthieum@reddit
You're correct in absolute terms.
In practice, though, most events are scheduled for a date and time, and don't need any further shenanigans.
Recurring events may require this logic. If a user needs to do something the 25th of each month -- paying the rent, for example -- then it's a kind of event that indeed requires "of each month" logic.
I never had to code this, though.
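For what it's worth, a minimal Python sketch of the "25th of each month" case, assuming a fixed local hour and a day that exists in every month. The function name `monthly_utc` and the zone are illustrative.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def monthly_utc(year: int, day: int, hour: int, zone: str) -> list:
    """Expand 'day of each month at hour local time' into UTC instants for one year."""
    tz = ZoneInfo(zone)
    return [datetime(year, month, day, hour, tzinfo=tz).astimezone(timezone.utc)
            for month in range(1, 13)]

occurrences = monthly_utc(2025, 25, 9, "Europe/Paris")
# The local time is always 09:00, but the UTC hour moves with DST.
print(sorted({o.hour for o in occurrences}))  # [7, 8]
```

Days 29 to 31 would need a clamping rule for short months, which this sketch does not attempt.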
FeetPicsNull@reddit
Solvable by specifying future date with UTC offset. Then you just have to deal with whatever changes happen (but UTC offset is not ambiguous).
pihkal@reddit
UTC offset isn’t ambiguous, but if that’s all you stored, you can’t disambiguate the situation where some countries in that offset go on Daylight Saving Time, and others don’t. You won’t know which datetimes to update without the physical location.
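A quick Python check of that point, using two real zones that share an offset for part of the year. The helper name `offset_hours` is made up.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

def offset_hours(zone: str, month: int) -> float:
    """UTC offset in hours at 09:00 local on the 15th of the given month in 2025."""
    wall = datetime(2025, month, 15, 9, 0, tzinfo=ZoneInfo(zone))
    return wall.utcoffset().total_seconds() / 3600

# January: same offset. July: one place observes DST, the other does not.
print(offset_hours("America/Phoenix", 1), offset_hours("America/Denver", 1))  # -7.0 -7.0
print(offset_hours("America/Phoenix", 7), offset_hours("America/Denver", 7))  # -7.0 -6.0
```

A record that stored only "-07:00" in January cannot tell you which of the two rows applies in July.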
FeetPicsNull@reddit
I live in California. We have PST and PDT depending on if Daylight Saving Time is in effect. Those two timezones are different offsets (UTC-08:00 and UTC-07:00). DST is one of the most braindead ideas of today, but in this case neither the offset nor the timezone is ambiguous.
Do you have an example of a timezone that is ambiguous (genuinely curious). For example we have Arizona which does not do DST, they are always MST. When DST occurs for everyone else in Mountain Time, their offsets are now MDT but Arizona remains in MST.
What is ambiguous is if you say you want to meet in "Arizona" at a certain time on a certain day, because not all of Arizona remains on MST (I know, ffs people!). However, it is not ambiguous to say meet in Arizona at a certain time on a certain day AND include a valid timezone; it is more future proof to instead use the UTC TZ or a numerical offset since UTC doesn't change and the meaning of numbers doesn't change.
I hate timezones so, so, much. I really wouldn't mind starting work at "midnight" (because UTC-8 and such), if it meant we never had to discuss timezones again. People would adjust.
pihkal@reddit
The shorthand codes PST and PDT aren't ambiguous, but Pacific Time is also considered a time zone itself.
But if a government decides to adopt/abandon/change DST, the numeric offsets do change. Computers won't care, but nobody expects their next dental appt to change from 10am to 9am just because local DST laws changed.
FeetPicsNull@reddit
Oh I see what you're saying now. Thanks for giving me a solid example.
pihkal@reddit
No problem. The devil's in the details.
If you've never seen them, check out the classic datetime falsehood lists:
herrakonna@reddit
I restrict my timestamp string representations to RFC 3339 using the 'Z' format only (no offsets), and only whole integer seconds (except in rare cases when required for higher precision, but in such cases try to use a fixed number of decimal places with zero padding/fill).
This ensures that any given timestamp has a single lexical representation and the lexical ordering matches temporal ordering.
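A small Python sketch of that canonical form, assuming whole seconds and the 'Z' suffix. The function name `canonical` is made up.

```python
from datetime import datetime, timedelta, timezone

def canonical(dt: datetime) -> str:
    """Format an aware datetime as RFC 3339 in UTC with 'Z' and whole seconds."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

plus2 = timezone(timedelta(hours=2))
stamps = [
    datetime(2024, 1, 1, 1, 30, tzinfo=plus2),            # 2023-12-31T23:30:00Z
    datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc),      # 2024-01-01T00:00:00Z
    datetime(2023, 12, 31, 23, 0, tzinfo=timezone.utc),   # 2023-12-31T23:00:00Z
]

by_time = [canonical(dt) for dt in sorted(stamps)]
by_text = sorted(canonical(dt) for dt in stamps)
print(by_time == by_text)  # True: string order matches time order

# With offsets left in the strings, text order no longer matches time order.
by_raw_text = sorted(stamps, key=lambda dt: dt.isoformat())
print(by_raw_text == sorted(stamps))  # False
```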
pihkal@reddit
That’s great for distributed logging and comparisons, but falls apart when you need to store info about human events at particular places.
E.g., I make an appointment for 9am in 6 months. Without knowing the time zone, you don’t know if the appt needs to be adjusted for Daylight Saving Time.
herrakonna@reddit
Well, yes and no. True, I would expect calendaring and similar scheduling apps to employ richer datetime information than RFC 3339 'Z', but that is a rather specialized case, and I would argue that using explicit offsets for scheduling rather than a timezone neutral UTC timestamp plus explicit location / locale would introduce its own problems, since we can't know the future and whether daylight savings will still be in use, or political decisions to move a locale to a different timezone, etc.
So even in use cases such as calendaring, storing the timestamp itself in a canonical UTC format (along with additional essential information such as location/locale independently from the timestamp) is a reasonable approach.
pihkal@reddit
It seems like a niche case only at small scales. Once your organization or software hits a certain geographical scale, you'll be forced to deal with it. I was exactly like you until my first international project.
First, I'm pointing out the value in storing time zones, not offsets, just so we're on the same page. Storing UTC offsets doesn't give you nearly enough information.
You don't need to know the future, you just need to store enough information at the time to be able to compensate.
I actually agree with you that locale info (e.g., "America/New_York") is superior to TZs... except that's not actually standardized as part of ISO8601/RFC3339, and is thus harder to work with in practice. You'll need to keep multiple columns in DBs in sync for it, langs won't have officials libs for it, etc. Next best option is time zones, which will enable you to compensate for a lot of future scenarios.
herrakonna@reddit
I understand and appreciate your perspective, but I do in fact deal with international services and still find that it is best to store timestamps in a normalized UTC form, either string or epoch, and consider timezones to be a localization issue, including scheduling, since for any given event, you may have participants in multiple timezones so a single record for the event with timezone encoded timestamp is not sufficiently general. Better to record the UTC timestamp that is the same for everyone and do what is needed for each individual user per their location / locale. The database record for the event itself can/should be agnostic as to timezone/locale.
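A minimal Python sketch of that approach: one UTC instant per event, localized per participant at display time. The participant names and zones are invented for the example.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# One timezone-agnostic record for the event itself.
event_utc = datetime(2025, 1, 15, 15, 0, tzinfo=timezone.utc)

# Hypothetical participants, each with their own zone from a profile.
participants = {
    "alice": "America/New_York",
    "bob": "Europe/Helsinki",
    "chika": "Asia/Tokyo",
}

local_views = {
    name: event_utc.astimezone(ZoneInfo(zone)).strftime("%Y-%m-%d %H:%M")
    for name, zone in participants.items()
}
print(local_views)
# {'alice': '2025-01-15 10:00', 'bob': '2025-01-15 17:00', 'chika': '2025-01-16 00:00'}
```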
azhder@reddit
Have had issues with the Z. Not like they had a problem in the past, as long as they had the server and clients in the same zone.
Then I came on board and I knew it was bad, I was just hoping it would not come up right away.
But once a customer asked for support in two locations, different zones, even one of those zones not observing DST…
Let’s just say there is one rule of thumb: nothing can replace missing data well enough.
You may try to be smart and most of the time get it working OK, but it’s far easier to have had the data from the start and not have used it instead of having to invent/reconstruct what happened after the fact.
herrakonna@reddit
The point of having all timestamps in UTC with 'Z' format is that you never have to deal with timezone offsets, timezone differences, daylight savings time changes, etc.
If you need to compare timestamps with string representations that employ timezone offsets or other variations, you're going to have to use a library to parse the string representations into precise DateTime objects or epoch values or some other normalized form for comparison.
But if you have full control of a particular environment and want human-readable timestamp representations, adopting a canonical serialization as I detailed is in my experience a good way to go.
azhder@reddit
That's the problem we're trying to solve. We're trying to make sure that whenever we need to deal with them, they're there.
6502zx81@reddit
I always suggest using the TAI time zone - the base of UTC without leap seconds and other surprises.
azhder@reddit
A very long way to still not end up saying it, so let me give you the missing summary:
A Unix/epoch timestamp signifies a point in time, but an ISO / RFC formatted string with the time zone in it signifies an area in spacetime.
bert8128@reddit
Why store the UTC offset? Why not just give the value in UTC and save yourself a couple of characters. And for future events why not store the zone? Yes, time zones are complicated but removing the information does not make life simpler. Now the situation is complicated and you don't have enough information.