fuzzy parsing fails when string contains 'a' without hour
Bug #1428895 reported by Minqi Jiang
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
dateutil | New | Undecided | Unassigned |
Bug Description
Fuzzy parsing fails when string contains 'a' without hour, as in
'A monkey on 10/08/1990'
'I have a meeting on March 1'
These sentences will raise a ValueError when parsed using the method
dateutil.
because the private _parse method in parser.py raises a ValueError if 'a' is detected (for AM) without a corresponding hour. This is not an issue in python-dateutil versions < 2.4.
I have migrated this issue over to the github issue tracker: https://github.com/dateutil/dateutil/issues/56
You can see my response there, but I'll copy it over here:
This problem was almost certainly introduced in PR #22. It's easy enough to patch over (just don't check whether or not there are hours if the fuzzy flag is enabled), but it does raise a bit of a problem which is that the AM/PM flag is being erroneously set anytime an "a" shows up in a fuzzy parsing situation.
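To make both halves of that concrete, here is a toy model of the check. It is not dateutil's actual parser code; `AMPM` and `scan_ampm` are made-up names, and the token handling is deliberately simplistic. It only shows that skipping the hour check under the fuzzy flag stops the ValueError while the AM/PM flag still gets set by a stray "a".

```python
# Toy model only: hypothetical names, not dateutil's real implementation.
AMPM = {"a": 0, "am": 0, "p": 1, "pm": 1}


def scan_ampm(tokens, fuzzy=False):
    """Return the AM/PM flag (0 for AM, 1 for PM, None if never set)."""
    hour = None
    ampm = None
    for tok in tokens:
        if tok.isdigit() and hour is None:
            # Assumption for the sketch: the first bare number is the hour.
            hour = int(tok)
        elif tok.lower() in AMPM:
            # The strict check: an AM/PM marker with no hour seen so far.
            # The band-aid is simply to skip it when fuzzy is enabled.
            if hour is None and not fuzzy:
                raise ValueError("No hour specified with AM or PM flag.")
            # The flag is set either way, which is the remaining problem.
            ampm = AMPM[tok.lower()]
    return ampm


tokens = "I have a meeting on March 1".split()
print(scan_ampm(tokens, fuzzy=True))  # no exception, but the flag is set to AM
```

With `fuzzy=False` the same tokens raise the ValueError, which is the behaviour reported in this bug.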
Personally, I think a lot of these parser issues are coming from the over-broad / under-specified nature of the parsing. It's very hard to make any changes without removing support for some weird edge case. The main reason I haven't closed issue #26 yet is that I'd like to add an enumeration of the supported formats so that we can explicitly detect those patterns. The fact of the matter is that no well-formatted date uses "a" to mean "AM" in any context except immediately following a number, unless you are parsing some slang-y fuzzy date string like "At 10 in the a".
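As a rough sketch of what "explicitly detect those patterns" could look like for this one case, the snippet below only accepts an AM/PM marker when it immediately follows a number. The pattern and the `find_ampm` helper are illustrative assumptions, not anything that exists in dateutil.

```python
import re

# Hypothetical pattern: a 1-2 digit hour, optional :MM, at most one space,
# then a/p/am/pm as a whole word.
AMPM_AFTER_NUMBER = re.compile(
    r"\b(\d{1,2})(?::(\d{2}))?\s?(a|p|am|pm)\b", re.IGNORECASE
)


def find_ampm(text):
    """Return (hour, marker) if a marker directly follows a number, else None."""
    match = AMPM_AFTER_NUMBER.search(text)
    if match is None:
        return None
    return int(match.group(1)), match.group(3).lower()


for sample in ["A monkey on 10/08/1990", "Meet at 10a", "At 10 in the a"]:
    print(sample, "->", find_ampm(sample))
```

The two sentences from this report produce no match, so the article "a" is never treated as AM. The slang-y "At 10 in the a" is also rejected, which is the trade-off of being explicit, and a string like "10 a monkey" would still be misread, so this is not a complete answer.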
For now I'll just add in the kludge-y patch since even though it's just putting a band-aid over this particular problem, it's not doing any long-term harm.