pytz

Bug #1319939
Comment #12

Stuart Bishop (stub) wrote on 2014-11-27:

This is not a regression. Pytz has always behaved this way. If the recent changes to the data broke things, it indicates incorrect usage and a lurking bug was triggered.

These lurking bugs are why I have always been nervous about changing the behaviour. In most timezones it was obvious that something was wrong when your timestamps reported a timezone definition from 100 years ago.

I can change the default. It does however mean that bugs like the following are less likely to be noticed, silently corrupting data:

>>> from datetime import datetime
>>> import pytz
>>> tz = pytz.timezone('Australia/Melbourne')
>>> dt = datetime(2014, 4, 10, 13, 59, tzinfo=tz)

In this case, I have constructed a seven-month-old timestamp using today's timezone definition, exactly the sort of thing you would do when writing a log parser. With 'now' as the default, it incorrectly reports daylight saving time:

>>> dt.strftime('%Y-%m-%d %H:%M:%S %Z%z')
'2014-04-10 13:59:00 AEDT+1100'

If we normalize, we get the correct timezone displayed but the time has been wound back one hour:

>>> tz.normalize(dt).strftime('%Y-%m-%d %H:%M:%S %Z%z')
'2014-04-10 12:59:00 AEST+1000'
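For comparison, the route the pytz documentation describes for attaching a zone to a naive datetime is localize(), which picks the definition in force at that wall-clock time. A minimal sketch, assuming pytz is installed with reasonably current zone data (the variable names are only illustrative):

```python
from datetime import datetime

import pytz

tz = pytz.timezone('Australia/Melbourne')

# Build the wall-clock time without tzinfo, then let pytz choose the
# definition that applied on that date.
naive = datetime(2014, 4, 10, 13, 59)
dt = tz.localize(naive)

# 10 April 2014 falls after the end of daylight saving in Melbourne,
# so the offset should be +1000 and the wall-clock time is unchanged.
formatted = dt.strftime('%Y-%m-%d %H:%M:%S %Z%z')
print(formatted)
```

Unlike the normalize() call above, the hour stays at 13:59 because the zone is chosen for the given wall-clock time instead of being corrected afterwards.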

Even if you are dealing with 'now' rather than older timestamps, there is still a race condition. The timezone instance is generated before the datetime instance, so even "datetime.now(pytz.timezone('Australia/Melbourne'))" could give you a timestamp out by one hour if you were unlucky and the DST transition occurred while that statement was being run.
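One way to sidestep that kind of ordering problem is to capture the instant in UTC first and convert to local time afterwards, so no zone definition has to be guessed before the timestamp exists. This is a sketch of that pattern under the assumption that pytz is installed; it is not the change under discussion in this bug:

```python
from datetime import datetime

import pytz

tz = pytz.timezone('Australia/Melbourne')

# Take the instant in UTC, which has no DST transitions.
utc_now = datetime.now(pytz.utc)

# Convert afterwards; the offset is looked up for that exact instant.
local_now = utc_now.astimezone(tz)

print(local_now.strftime('%Y-%m-%d %H:%M:%S %Z%z'))
```

Both values refer to the same instant, so comparing or storing either is safe; only the displayed wall-clock time differs.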

So, yes, using 'now' for the default timezone information is less of a surprise, but it might not be smarter as a surprise could lead you to the problem. It has already led many people to the mailing lists or bug trackers, and certainly many more to search results.

All that said, I'm still open to changing this 10-year-old behaviour. The rationale of helping people notice bugs seems dubious now, and I can't stop people from shooting themselves in the foot when they think the issues don't affect their use case.

If I have time, I may investigate making datetime(..., tzinfo=foo) work as expected, removing the need for localize and normalize entirely. The last time I looked, I decided it was impossible with the current Python datetime implementation, but that was several years ago. If not, all this weirdness will be sorted when pytz is integrated with Python 3 (first step, find a volunteer to modify some C code).
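To make concrete what normalize currently has to do, here is a small sketch of arithmetic across the April 2014 end of daylight saving in Melbourne. It assumes pytz is installed; the dates and variable names are chosen purely for illustration:

```python
from datetime import datetime, timedelta

import pytz

tz = pytz.timezone('Australia/Melbourne')

# Noon on 5 April 2014 is still in daylight saving time (+1100).
before = tz.localize(datetime(2014, 4, 5, 12, 0))

# Plain datetime arithmetic keeps the old tzinfo, so the result still
# claims +1100 even though DST ended early on 6 April.
shifted = before + timedelta(days=1)

# normalize() re-derives the offset for that instant: same moment in
# time, but expressed with the +1000 definition.
fixed = tz.normalize(shifted)

print(shifted.strftime('%Y-%m-%d %H:%M:%S %Z%z'))
print(fixed.strftime('%Y-%m-%d %H:%M:%S %Z%z'))
```

The shifted and fixed values compare equal as instants; normalize only changes which offset and wall-clock time are used to express them.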
