Date and Time Manipulation

Why would anyone write a blog post about time manipulation? We all understand time… 24 hours a day, 7 days a week, right?

LIES! The truth is, time is a mess, an arbitrary cyclic cascading relative political mess, and it just gets worse as time passes and more exceptions are made to previously established rules.



Brief History of Time and Necessity of Standardization

Before clocks were invented, the way we all kept time in sync was with the sun, moon, and other celestial objects. After all, clocks are just calibrated loops, basically, and rotation of the earth is an obvious loop that we all have in common.

Let’s go back in time, not far…

Prior to the early 1800s, synchronizing time was never really a problem we had to face, since it wasn’t feasible for people or information to travel from place to place fast enough for it to matter. Are you trekking a couple hundred miles? It probably won’t matter if your clock drifts a bit between where you left and where you’re going; just adjust when you get there.

In the early-to-mid 1800s, mechanized rail infrastructure was just beginning to expand. It was difficult to keep time in sync relative to others when traveling quickly, which, when one really considers it, is a very recent possibility for humans, relative to the rest of history.

Time-zones were created to manage that complexity. When you cross a threshold for a time-zone, you either move forward or backwards by some arbitrary amount of time, depending on your direction of travel, and where you’re moving to/from. We all understand that, and agree upon it, mostly.

Most of the time-zones on land are offset from Coordinated Universal Time (UTC) by a whole number of hours (UTC−12 to UTC+14), but a few zones are offset by 30 or 45 minutes (for example Newfoundland Standard Time is UTC−03:30, Nepal Standard Time is UTC+05:45, and Indian Standard Time is UTC+05:30). Some higher latitude and temperate zone countries use daylight saving time for part of the year, typically by adjusting local clock time by an hour.

[Figure: Time-Zone / UTC Offset Map]

[Figure: Example of various local times prior to establishment of UTC]
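As a rough illustration of what those offsets mean, here is a minimal sketch using only the standard library's fixed-offset `timezone` class (Python 3.2 or later). The instant is arbitrary, and fixed offsets deliberately ignore daylight saving time, so this only shows the standard-time offsets quoted above.

```python
from datetime import datetime, timedelta, timezone

# One arbitrary instant, expressed in UTC.
instant = datetime(2016, 1, 1, 12, 0, tzinfo=timezone.utc)

# Standard-time offsets quoted in the text (no DST rules involved).
offsets = {
    "Newfoundland Standard Time": timedelta(hours=-3, minutes=-30),
    "Indian Standard Time": timedelta(hours=5, minutes=30),
    "Nepal Standard Time": timedelta(hours=5, minutes=45),
}

# The same instant, rendered with each fixed offset.
local_times = {
    name: instant.astimezone(timezone(offset, name)).isoformat()
    for name, offset in offsets.items()
}

for name, stamp in local_times.items():
    print(f"{name}: {stamp}")
```

All three strings describe the same moment; only the wall-clock rendering differs.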

So, we can have either local time everywhere, with a mess of complexity and arbitrary offsets between one local time and the next, or roughly local time everywhere, with known offsets from a single standard. I’ll take the latter, please, thanks.


Standards

As programmers, we are all logically minded, and can understand the necessity to agree upon standards.

Epoch

Epoch just means “reference date”.

Coordinated Universal Time <- Important: USE IT!

The most important standard we’re going to discuss is Coordinated Universal Time aka UTC, which is nice because it doesn’t have time-zones, daylight savings time, or other localized variations of time.

Note that there are several variants of Universal Time, and UTC is not necessarily the same as GMT, though they are similar. Technically, UTC is based on UT1, and UT0 is not commonly used these days.

UT0

UT0 is Universal Time determined at an observatory by observing the diurnal motion of stars or extragalactic radio sources, and also from ranging observations of the Moon and artificial Earth satellites. The location of the observatory is considered to have fixed coordinates in a terrestrial reference frame (such as the International Terrestrial Reference Frame) but the position of the rotational axis of the Earth wanders over the surface of the Earth; this is known as polar motion. UT0 does not contain any correction for polar motion. The difference between UT0 and UT1 is on the order of a few tens of milliseconds. The designation UT0 is no longer in common use.

UT1

UT1 is the principal form of Universal Time. While conceptually it is mean solar time at 0° longitude, precise measurements of the Sun are difficult. Hence, it is computed from observations of distant quasars using long baseline interferometry, laser ranging of the Moon and artificial satellites, as well as the determination of GPS satellite orbits. UT1 is the same everywhere on Earth, and is proportional to the rotation angle of the Earth with respect to distant quasars, specifically, the International Celestial Reference Frame (ICRF), neglecting some small adjustments. The observations allow the determination of a measure of the Earth’s angle with respect to the ICRF, called the Earth Rotation Angle (ERA, which serves as a modern replacement for Greenwich Mean Sidereal Time).

UTC

UTC (Coordinated Universal Time) is an atomic time scale that approximates UT1. It is the international standard on which civil time is based. It ticks SI seconds, in step with TAI. It usually has 86,400 SI seconds per day but is kept within 0.9 seconds of UT1 by the introduction of occasional intercalary leap seconds. As of 2015, these leaps have always been positive (the days which contained a leap second were 86,401 seconds long). Whenever a level of accuracy better than one second is not required, UTC can be used as an approximation of UT1.

TZ Database

Time-zones are represented as an offset from UTC, which are documented in the tz database. Thanks to Arthur David Olson, Paul Eggert, and many others from the community, we have this thing. The tz database is actually mildly interesting if you start digging into it. I will put a few humorous examples at the end of this article.

Unix philosophy

Let’s talk a bit about the Unix philosophy, why the datetime module in Python is the way it is, and why the tz database exists independently.

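The Unix philosophy, as summarized by Doug McIlroy, is this:

Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.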

The emphasis here is: do one thing, and do it well, and write programs to work together. Universal interfaces, like UTC, are cool, too!

RTFM

Yes, I said it. Read, the, freaking, manual! It’s astounding how often many developers fail to do this simple first step before undertaking a new task…

Python’s datetime module

The datetime module supplies classes for manipulating dates and times in both simple and complex ways. While date and time arithmetic is supported, the focus of the implementation is on efficient attribute extraction for output formatting and manipulation. For related functionality, see also the time and calendar modules.

There are two kinds of date and time objects: “naive” and “aware”.

An aware object has sufficient knowledge of applicable algorithmic and political time adjustments, such as time-zone and daylight saving time information, to locate itself relative to other aware objects. An aware object is used to represent a specific moment in time that is not open to interpretation [1].

A naive object does not contain enough information to unambiguously locate itself relative to other date/time objects. Whether a naive object represents Coordinated Universal Time (UTC), local time, or time in some other time-zone is purely up to the program, just like it’s up to the program whether a particular number represents metres, miles, or mass. Naive objects are easy to understand and to work with, at the cost of ignoring some aspects of reality.

For applications requiring aware objects, datetime and time objects have an optional time-zone information attribute, tzinfo, that can be set to an instance of a subclass of the abstract tzinfo class. These tzinfo objects capture information about the offset from UTC time, the time-zone name, and whether Daylight Saving Time is in effect. Note that no concrete tzinfo classes are supplied by the datetime module. Supporting time-zones at whatever level of detail is required is up to the application. The rules for time adjustment across the world are more political than rational, and there is no standard suitable for every application.

[1] If, that is, we ignore the effects of Relativity
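To make the naive/aware distinction concrete, here is a small sketch. It assumes Python 3.2 or later, where the standard library does ship one concrete tzinfo class, `datetime.timezone`, for fixed offsets (the documentation quoted above predates it).

```python
from datetime import datetime, timezone

naive = datetime(2013, 3, 31, 1, 30)                       # no tzinfo attached
aware = datetime(2013, 3, 31, 1, 30, tzinfo=timezone.utc)  # pinned to UTC

print(naive.tzinfo)       # nothing says which zone this wall time belongs to
print(aware.utcoffset())  # a known offset from UTC

# Ordering a naive value against an aware one is refused rather than guessed at.
try:
    naive < aware
    comparable = True
except TypeError:
    comparable = False

print(comparable)
```

The refusal is the useful part: the interpreter will not silently decide which zone the naive value was supposed to be in.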

Caveats:
Unlike the time module, the datetime module does not support leap seconds.

Leap Seconds (not going there)

Oh, yes, speaking of leap seconds, we’re not even going to go there in this blog post; just as complex as large time scales are, so are small time scales.

It’s all relative… What if you’re tracking time for a satellite, relative to a ground station? Maybe we’ll address these in another blog post, but for now, we’re going to focus on Daylight Savings Time transitions, time-zone conversions, overflows, and off-by-one errors, because these are the most common mistakes I see.


Programming Problems

Backstory

So, primed with this knowledge, let’s look at some actual programming problems. Our target language here is Python, but these patterns are language agnostic.

Recently, I got into a long discussion with the author of a Python time manipulation library… That’s what encouraged me to write this blog post.

Common misconception

You can’t handle time-zones properly with the standard library.

>>> d = delorean.Delorean(datetime(2013, 3, 31, 1, 30), 'Europe/Paris')
>>> d.datetime.isoformat()
'2013-03-31T01:30:00+01:00'
>>> d = d + timedelta(hours=1)
>>> d.datetime.isoformat()
'2013-03-31T02:30:00+01:00'

>>> d = pendulum.create(2013, 3, 31, 1, 30, tz='Europe/Paris')
>>> d.isoformat()
'2013-03-31T01:30:00+01:00'
>>> d = d + timedelta(hours=1)
>>> d.isoformat()
'2013-03-31T03:30:00+02:00'

Disregard that he said "standard library" and then proceeded to show two examples of third party modules. For any seasoned Python developers, the flaw here is likely obvious… The Python standard library does indeed support time-zones, but it just doesn't include those time-zone definitions by default, which isn't a bad thing, at all.

Always use UTC when manipulating time.

That's it! It’s that simple. All time manipulation operations should be done in a common time-zone, preferably UTC, and then time should be localized, if desired, when displaying to end-users. This is a notoriously common pain point for many programmers, both experienced, and new. It is very difficult to manage the state of time-zone transitions when doing manipulations in local time, and this simple pattern will save you much grief, I promise!
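Here is a minimal sketch of that pattern using only the standard library. It assumes Python 3.9 or later with the `zoneinfo` module and an installed tz database; `add_in_utc` is a hypothetical helper name, not part of any library.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

PARIS = ZoneInfo("Europe/Paris")


def add_in_utc(local_dt, delta):
    """Shift an aware datetime by delta, doing the arithmetic in UTC."""
    as_utc = local_dt.astimezone(timezone.utc)
    return (as_utc + delta).astimezone(local_dt.tzinfo)


# 01:30 local time, half an hour before Paris switches to summer time.
start = datetime(2013, 3, 31, 1, 30, tzinfo=PARIS)

# Wall-clock arithmetic lands on 02:30, a local time that was skipped that night.
wall_clock = start + timedelta(hours=1)

# Arithmetic in UTC, localized afterwards, lands on a real moment.
later = add_in_utc(start, timedelta(hours=1))

print(start.isoformat())
print(wall_clock.hour, wall_clock.minute)
print(later.isoformat())
```

The conversion to UTC and back is the whole trick; the time-zone rules are only consulted at the edges.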

UTC isn't a silver bullet, and doesn't solve all of the problems, but it's an important foundation; you don’t just have to take my word for it either…

Caveat: After receiving some feedback from concerned readers, I would like to point out, since it wasn't obvious, that using UTC doesn't always make sense, for example with recurring appointments across different time-zones. If you aren't careful, your appointments can be incorrect after a DST transition. This happens pretty often, as I will point out below, with some examples of companies that have had this exact problem. So, I'd like to clarify... Don't use UTC blindly. In this particular example, yes, you would still use UTC to do the time-zone translations, but you wouldn't want to store your appointment times in UTC. Be mindful! It's not about using UTC or not, it's about how you use it.
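A sketch of the recurring-appointment case, under the assumption that the appointment is stored as a wall time plus a zone name and only translated to UTC per occurrence. `occurrence_utc` is a hypothetical helper, and the example needs Python 3.9 or later with `zoneinfo`.

```python
from datetime import date, datetime, time, timezone
from zoneinfo import ZoneInfo


def occurrence_utc(day, wall_time, zone_name):
    """Translate one occurrence of a stored local appointment into UTC."""
    local = datetime.combine(day, wall_time, tzinfo=ZoneInfo(zone_name))
    return local.astimezone(timezone.utc)


# A weekly 09:00 meeting in Paris, one week before and one week after the DST switch.
before_utc = occurrence_utc(date(2013, 3, 25), time(9, 0), "Europe/Paris")
after_utc = occurrence_utc(date(2013, 4, 1), time(9, 0), "Europe/Paris")

print(before_utc.isoformat())
print(after_utc.isoformat())
```

The meeting stays at 09:00 for the people attending it, while its UTC instant moves by an hour. Storing the first UTC instant and repeating it weekly would get the second occurrence wrong.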

Development Community Guidelines

Using UTC is pretty much standard practice, and for good reason!

The W3C recommends using UTC whenever possible:

Date and time values based on incremental time are time-zone-independent, since at any given moment it is the same time in UTC everywhere: the values can be transformed for display for any particular time-zone offset, but the value itself is not tied to a specific location.

It is good practice to use an explicit zone offset wherever possible. If one is not available, best practice is to use UTC as the implicit zone offset for conversions of this nature. This is because the values are exactly centered in the range of possibilities and because representation internally (as computer time) is usually based on UTC. Since a single reference point has been used it may be possible to unwind the change later even if erroneous conversion takes place. When working with multiple documents from various sources, the “implicit” offset of the document may vary widely from that of the implementation doing the processing. If UTC is widely used, the chances of error are reduced.

and Microsoft:

Coordinated Universal Time (UTC) is a high-precision, atomic time standard. The world’s time-zones are expressed as positive or negative offsets from UTC. Thus, UTC provides a kind of time-zone free or time-zone neutral time. The use of UTC time is recommended when a date and time’s portability across computers is important. […] Converting individual time-zones to UTC makes time comparisons easy.

and Django:

Even if your website is available in only one time-zone, it’s still good practice to store data in UTC in your database. The main reason is Daylight Saving Time (DST). Many countries have a system of DST, where clocks are moved forward in spring and backward in autumn. If you’re working in local time, you’re likely to encounter errors twice a year, when the transitions happen. […] This probably doesn’t matter for your blog, but it’s a problem if you over-bill or under-bill your customers by one hour, twice a year, every year. The solution to this problem is to use UTC in the code and use local time only when interacting with end users.

Proof that it works

I went on to explain this to the author of Pendulum, with examples:

>>> paris_time = Delorean(datetime(2013, 3, 31, 1, 30), 'Europe/Paris')
>>> paris_time_utc = paris_time.shift('utc')
>>> paris_time_utc_delta = paris_time_utc.datetime + timedelta(hours=1)
>>> tz_aware_delta = Delorean(paris_time_utc_delta).shift('Europe/Paris')
>>>
>>> paris_time
Delorean(datetime=2013-03-31 01:30:00+01:00, timezone=Europe/Paris)
>>>
>>> tz_aware_delta
Delorean(datetime=2013-03-31 03:30:00+02:00, timezone=Europe/Paris)

To which he responded:

I just don’t agree, even though it simplifies a lot datetime operations.

Yep, friends, it makes it simple because that’s the way you’re supposed to do it! Why fight it? Embrace it. Let it wash over you in its wonderful simplicity.

Delorean

Delorean is a library that is good at one thing, and does it well… Time manipulation. It works with the standard library datetime module, it doesn’t replace it or change default behaviors. It extends datetime with the tzinfo data so you don’t have to manage it, and saves you many lines of repetitive bootstrap code each time you want to manipulate a non-naive datetime.

Pendulum

Pendulum aims to fully replace datetime, timedelta, pytz, dateutil, and Delorean in one fell swoop. Never mind the blatant disregard for Unix philosophy; it seems the author thinks all the other solutions are broken. That seems a little far-fetched, doesn’t it? Is it likely that so many other developers who use other libraries are wrong, and nobody noticed? Or is it likely that an individual is misunderstanding something? Let’s not forget how hard time is… and how easy it is to mess up.

The Pendulum FAQ states this at the bottom:

Even though both Arrow and Delorean seem faster, they do not handle properly timezones, if not at all. Delorean's epoch() assumes UTC and does not support specifying a timezone, Arrow's get() does not support a timezone and Arrow's fromtimestamp() does not behave properly (See Why not Arrow?).

Huh? Why would anyone want to change the epoch time-zone? The epoch is UTC, it doesn’t change depending on locality. It’s the same time, everywhere, always, in UTC.

He elaborated:

there is no way in delorean to get a datetime corresponding to a timestamp in a specific time-zone like the stdlib fromtimestamp

Well, first of all, why use Delorean for something the standard library can do on its own?

And secondly, this assertion is patently false, as shown below. Delorean works just fine for this task:
>>> delorean.epoch(1477459754).shift('US/Pacific')
Delorean(datetime=2016-10-25 22:29:14-07:00, timezone=US/Pacific)
>>> delorean.epoch(1477459754).shift('US/Pacific').datetime
datetime.datetime(2016, 10, 25, 22, 29, 14, tzinfo=<DstTzInfo 'US/Pacific' PDT-1 day, 17:00:00 DST>)
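For comparison, the standard library can do the same conversion on its own. This sketch assumes Python 3.9 or later with `zoneinfo`, and uses the canonical zone name `America/Los_Angeles` rather than the `US/Pacific` alias, since aliases are not present in every tz database install.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

stamp = 1477459754  # seconds since the Unix epoch, which is defined in UTC

# Interpret the timestamp as an instant, then render it in Pacific time.
pacific = datetime.fromtimestamp(stamp, tz=ZoneInfo("America/Los_Angeles"))

print(pacific.isoformat())
```

The timestamp itself never changes zone; only its rendering does.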

Finally, the author admitted that he did not understand the purpose of Delorean until now, and assumed it was supposed to replace the stdlibs, like his library.

However, he still insisted that Delorean was wrong, and his library was correct, and went on to try to show me how:

>>> d = delorean.Delorean(datetime(1940, 7, 1), 'Europe/Amsterdam')
>>> d.datetime.dst() // timedelta(minutes=1)
100

This is wrong, it should be 120. And using UTC will not fix that.

>>> d = delorean.Delorean(datetime(1940, 7, 1), 'UTC').shift('Europe/Amsterdam')
>>> d.datetime.dst() // timedelta(minutes=1)
100

In this case pendulum behaves properly:

>>> d = pendulum.create(1940, 7, 1, tz='Europe/Amsterdam')
>>> d.dst() // timedelta(minutes=1)
120

These case are not widespread thankfully but I thought I’d mention it.

Something is definitely wonky, here.

This is a perfect example of how weird time is! Yes!

First, let’s figure out what we should expect here. We can reference the Amsterdam DST schedule for the 1940s.

May 16, 1940 - Daylight Saving Time Started

When local standard time was about to reach Thursday, May 16, 1940, 12:00:00 Midnight, clocks were turned forward 1:40 hour to Thursday, May 16, 1940, 1:40:00 AM local daylight time instead.

Year    DST Start                           DST End
1940    Thursday, May 16, 12:00 Midnight    No DST End
1941    DST observed all year
1942    No DST Start                        Monday, November 2, 3:00 AM

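Hey, wait a second… The clocks were turned forward by 1:40, which is 100 minutes, not 120. That is exactly the value Delorean reported.

We just found a bug in Pendulum!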

DST Transition Handled Flawlessly

Let’s try out some more code… Using our rule of thumb to always use UTC when doing time manipulation.

>>> amsterdam = Delorean(datetime(1940, 5, 15), 'Europe/Amsterdam')
>>> amsterdam
Delorean(datetime=datetime.datetime(1940, 5, 15, 0, 0), timezone='Europe/Amsterdam')
>>>
>>> amsterdam_utc = amsterdam.shift('utc')
>>> delta = amsterdam_utc + timedelta(days=1)
>>>
>>> delta
Delorean(datetime=datetime.datetime(1940, 5, 15, 23, 40), timezone='UTC')
>>>
>>> delta.shift('Europe/Amsterdam')
Delorean(datetime=datetime.datetime(1940, 5, 16, 1, 40), timezone='Europe/Amsterdam')
>>>
>>> delta = amsterdam_utc + timedelta(days=901)
>>> delta.shift('Europe/Amsterdam')
Delorean(datetime=datetime.datetime(1942, 11, 2, 1, 40), timezone='Europe/Amsterdam')

Now approaching the DST transition…

>>> delta = amsterdam_utc + timedelta(days=902)
>>> delta.shift('Europe/Amsterdam')
Delorean(datetime=datetime.datetime(1942, 11, 3, 0, 40), timezone='Europe/Amsterdam')

It works just fine! The results when using Delorean are in fact correct and as expected. Thankfully, in the end, I was able to lead the author to realize his mistake, and he is going to fix this DST bug.


Summary

Please stop using non-UTC datetimes for your time manipulations. Be explicit. Use UTC.

If you want to build a library that handles this transparently, sweet! Please read the other libraries that already exist, understand them, and if they are so bad that you can’t fix them and must create your own, please, read PEPs 431, 495, and 500 before you go any further. PEP 495 is actually implemented, as of recently. Finally though, leverage the time-zone database, and UTC, that’s why they exist. Simple is better than complex.

Of glass houses and stone throwing

The worst part of all this is, the author of Pendulum seems to be on a crusade against competing libraries.

Don’t tell the community that a competing library is doing it wrong, when in fact those libraries are doing it correctly, and your own library didn’t even support your example until a few days prior, and has other critical bugs like the DST transition example above.

Innovation is critical, but be careful

My intent with this piece is not to discourage innovation, it’s great that someone wants to make things and share them with the community, but when the drive and desire to create said library is derived from fundamental misunderstandings, one cannot help but worry.

To summarize, as should be evident, time is complicated… it’s easy to make mistakes. This isn’t a slight against the author of Pendulum, but a warning to all developers. This, my friends, is why you should avoid re-inventing the wheel if you can help it.

Easy mistakes mean common problems

This isn’t the first time we’ve come across libraries with time bugs, and definitely won’t be the last.

greenclock is a greenlet-based task timer. While testing I was able to reproduce the following exception: ValueError: hour must be in 0..23 when using start_at='once' between 2300-2400 hrs.

Luckily the 11PM bug in greenclock was an easy fix. But still further datetime issues remain. If you run that on the last day of the month… ValueError: day is out of range for month Oops!
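Both exceptions are typical of bumping a single datetime field by hand. The following is a hypothetical reconstruction of that class of bug, not greenclock's actual code; `next_hour_buggy` and `next_hour_safe` are illustrative names.

```python
from datetime import datetime, timedelta


def next_hour_buggy(now):
    """Top of the next hour, computed by bumping the hour field directly."""
    return now.replace(hour=now.hour + 1, minute=0, second=0, microsecond=0)


def next_hour_safe(now):
    """Top of the next hour, letting timedelta carry into day, month, and year."""
    return now.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)


# 23:15 on the last day of a month exercises both rollovers at once.
late = datetime(2016, 10, 31, 23, 15)

try:
    next_hour_buggy(late)
    buggy_failed = False
except ValueError as exc:
    buggy_failed = True
    print("buggy version:", exc)

print("safe version:", next_hour_safe(late))
```

Adding a `timedelta` avoids the whole family of field-overflow errors, because the carry is handled by the library instead of by hand.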

Unfortunately, it’s not uncommon for programmers to have major time problems at some point or another, if not repeatedly, depending on the problem domain. Yep, even companies with dedicated QA teams and multi-million dollar budgets.

Notorious Time Bugs

Microsoft

For example, back in 2007 Microsoft had some Outlook bugs that resulted in events moving to a different day during daylight savings time!

Microsoft had more bugs again in 2012, again in 2013, and, yep, more UTC bugs in 2014.

Apple

Apple, too, has had time problems: in 2010, alarm times did not update during DST transitions, and the same happened again in 2013. Even recently, with iOS 8, many users have had issues with their calendar app.

Banks

In 2010 Bank of Queensland had a problem due to an incorrect hexadecimal number conversion routine causing terminals to decline customers’ cards as expired because the terminals jumped from 2010 to 2016. This actually happened with banks across the world in 2010, for the same reason, flawed Hex/Decimal conversion. Over 20 million bank card verification chips failed as a result.
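One plausible reading of how 2010 turns into 2016 is a two-digit year stored as binary-coded decimal and then read back as a plain binary number. That mechanism is an assumption made for illustration; the text above only says the hex/decimal conversion was flawed. `bcd_to_int` is a hypothetical helper.

```python
def bcd_to_int(byte):
    """Decode one packed binary-coded-decimal byte (two decimal digits)."""
    high, low = byte >> 4, byte & 0x0F
    return high * 10 + low


year_byte = 0x10  # the two-digit year "10" stored as BCD (assumed encoding)

correct = 2000 + bcd_to_int(year_byte)  # decode the digits, then add the century
misread = 2000 + year_byte              # treat the byte as plain binary instead

print(correct)
print(misread)
```

Under this assumption the bug stays invisible for years 2000 through 2009, since single-digit values look identical in both encodings.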

NASA?!?!

This is a serious problem, folks... NASA even lost a spacecraft in 2013, a modern one, even, that was launched in 2005, due to a time overflow bug. Yes, even a $267 million spacecraft had a simple bug like this. We are truly doomed!

Air Traffic Control Systems

Time problems happen in hardware, too… In 2004, LAX air traffic control lost radio communications with pilots due to a time bug. They have microcontrollers that tick down in milliseconds to keep sync between ATC and the planes, and the counter runs out after roughly 50 days. FAA guidelines suggest resetting the clocks every 30 days, giving about 3 weeks of margin before they would have run out. However, apparently nobody built in safeguards to prevent that threshold from being met, and chaos ensued (or they probably just switched to air band and did it the old fashioned way, but I digress).

Avionics Systems

Boeing 787s have an overflow bug that is horrifying. If the system isn't power cycled every 248 days, the plane will lose all AC power because the generators will go into fail-safe mode. Yes, that's the same Boeing who manufactured the dead space probe that was mentioned above.
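The roughly 50-day and 248-day figures both look like 32-bit counters running out. The sketch below checks that arithmetic. The counter widths and tick sizes are assumptions (an unsigned 32-bit millisecond counter and a signed 32-bit counter of 10 ms ticks); the text does not state them.

```python
from datetime import timedelta

# Assumed counter widths and tick sizes; neither is stated in the text.
ms_counter_wrap = timedelta(milliseconds=2**32)       # unsigned 32-bit, 1 ms ticks
cs_counter_wrap = timedelta(milliseconds=10 * 2**31)  # signed 32-bit, 10 ms ticks

# Express each wrap-around period in days.
ms_days = round(ms_counter_wrap / timedelta(days=1), 2)
cs_days = round(cs_counter_wrap / timedelta(days=1), 2)

print(ms_days)
print(cs_days)
```

Under those assumptions the counters wrap after about 49.71 and 248.55 days, which lines up with the figures reported for the two incidents.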

Crazy Enterprise Time Hacks

Here’s an example of some of the craziness that people come up with as work-arounds for weird time disparities on real-time operating systems (or real-ish time, think ERP software, etc…):

OpenVMS is a multi-user, multiprocessing virtual memory-based operating system (OS) designed for use in time sharing, batch processing, and transaction processing. When process priorities are suitably adjusted, it may approach real-time operating system characteristics. The system offers high availability through clustering and the ability to distribute the system over multiple physical machines. This allows the system to be tolerant against disasters that may disable individual data-processing facilities.

OpenVMS represents system time as the 64-bit number of 100 nanosecond intervals (that is, ten million units per second; also known as a ‘clunk’) since the epoch. The epoch of OpenVMS is midnight preceding November 17, 1858, which is the start of Modified Julian Day numbering. The clock is not necessarily updated every 100 ns; for example, systems with a 100 Hz interval timer simply add 100000 to the value every hundredth of a second. The operating system includes a mechanism to adjust for hardware timekeeping drift; when calibrated against a known time standard, it easily achieves an accuracy better than 0.01%. All OpenVMS hardware platforms derive timekeeping from an internal clock not associated with the AC supply power frequency.
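As a sketch of what that representation means, the tick count can be converted to a Python datetime directly from the definition above. `vms_ticks_to_datetime` is a hypothetical helper, not an OpenVMS API, and it drops the sub-microsecond part because `timedelta` only resolves microseconds.

```python
from datetime import datetime, timedelta

VMS_EPOCH = datetime(1858, 11, 17)  # midnight at the start of Modified Julian Day 0
TICKS_PER_SECOND = 10_000_000       # one tick is 100 nanoseconds


def vms_ticks_to_datetime(ticks):
    """Convert a count of 100 ns ticks since the OpenVMS epoch to a naive datetime."""
    # One microsecond is 10 ticks; anything finer than that is truncated.
    return VMS_EPOCH + timedelta(microseconds=ticks // 10)


# The Unix epoch, expressed as an OpenVMS tick count.
unix_epoch_days = (datetime(1970, 1, 1) - VMS_EPOCH).days
unix_epoch_ticks = unix_epoch_days * 86_400 * TICKS_PER_SECOND

print(unix_epoch_days)
print(vms_ticks_to_datetime(unix_epoch_ticks))
```

The day count printed here is also the Modified Julian Day number of 1970-01-01, which is a handy cross-check on the epoch.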

The way they compensate? Speed up or slow down the clock.

For example by 25% for 2 hours before and after the daylight savings time transition, or by 50% during the double hour. This is pretty standard practice, actually, in the ERP world. It does make sense, if you think about it, simply speed up or slow down the hardware clock for a nice smooth transition. The end result is the same, and the transition is fluid. Ahh, relativity.
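The two rates quoted above absorb the same one-hour shift. This quick check assumes "2 hours before and after" means a four-hour window in total, and `slew_total` is a hypothetical helper.

```python
from datetime import timedelta


def slew_total(window, rate):
    """Clock time gained or lost by running fast or slow by `rate` over `window`."""
    return window * rate


gentle = slew_total(timedelta(hours=4), 0.25)  # 2 hours before plus 2 hours after
steep = slew_total(timedelta(hours=2), 0.50)   # the doubled hour only

print(gentle)
print(steep)
```

Either way the clock ends up exactly one hour off from where it started, without ever jumping.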

They should just set their system clock to UTC, right?! Unfortunately, this isn’t always possible, depending on the platform.

The Future: 2038

So, can we learn from the past? I sure hope so, because we know there are upcoming problems in the not-so-distant future: in 2038, the signed 32-bit POSIX timestamp will overflow. Most systems have updated to use 64-bit integers already, but many low-level systems that nobody ever thinks about, such as file system drivers, data formats, and 32-bit embedded systems, have yet to be updated. I'm not aware of any mass efforts to fix these things (yet), but I suggest we all get busy trying to fix these things now while we can, and doing our best to prevent further issues in the future as much as possible. If the past is any indication of the future, then we haven't learned our lesson yet. You know the saying, "History repeats itself". It is our flaw. Things will go wrong over and over until we learn our lesson. We are human, and we will always make mistakes, and there are always inherent limitations, whether they are known constraints, or low-level mundane intricacies that nobody thinks of until it's too late.
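To put a date on it, this small sketch computes the last second a signed 32-bit count of seconds since 1970-01-01 UTC can represent, and where the value lands if it wraps to the most negative number. It uses timedelta arithmetic rather than `fromtimestamp` so that it does not depend on the platform's own time_t.

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)

# The last representable second, and the moment a wrapped counter would claim it is.
last_second = UNIX_EPOCH + timedelta(seconds=INT32_MAX)
wrapped = UNIX_EPOCH + timedelta(seconds=INT32_MIN)

print(last_second.isoformat())
print(wrapped.isoformat())
```

One second after the first value, an unpatched 32-bit system believes it is the second one.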

Embedded systems will be the biggest problem for the 2038 bug. You think "IOT" is bad now with all the malware and denial-of-service attacks? Wait until 2038 comes, maybe that will be what takes down skynet, haha! We have work to do, friends! Let's go!

Here are some examples of things you can do now to prevent pain later:

TZ Database Nerd Humor

I leave you with some notes from the maintainers of the time-zone database:

Shanks writes that Michigan started using standard time on 1885-09-18, but Howse writes (pp 124-125, referring to Popular Astronomy, 1901-01) that Detroit kept local time until 1900 when the City Council decreed that clocks should be put back twenty-eight minutes to Central Standard Time. Half the city obeyed, half refused. After considerable debate, the decision was rescinded and the city reverted to Sun time. A derisive offer to erect a sundial in front of the city hall was referred to the Committee on Sewers. Then, in 1905, Central time was adopted by city vote.

Pam Belluck reported in the New York Times (2001-01-31) that the Indiana Legislature is considering a bill to adopt DST statewide. Her article mentioned Vevay, whose post office observes a different time-zone from Danner’s Hardware across the street.

The most interesting region I have found consists of three towns on the southern coast of Australia, population 10 at last report, along with 50,000 sheep, about 100 kilometers long and 40 kilometers into the continent. The primary town is Madura, with the other towns being Mundrabilla and Eucla. According to the sheriff of Madura, the residents got tired of having to change the time so often, as they are located in a strip overlapping the border of South Australia and Western Australia. South Australia observes daylight saving time; Western Australia does not. The two states are one and a half hours apart. The residents decided to forget about this nonsense of changing the clock so much and set the local time 20 hours and 45 minutes from the international date line, or right in the middle of the time of South Australia and Western Australia.