customize dateutil.parser century inference logic

2024/10/3 17:15:03

I am working on old text files with 2-digit years where the default century logic in dateutil.parser doesn't seem to work well. For example, the attack on Pearl Harbor was not on dparser.parse("12/7/41") (which returns 2041-12-7).

The built-in century "threshold" for rolling back into the 1900s seems to be 66:

import dateutil.parser as dparser
print(dparser.parse("12/31/65")) # goes forward to 2065-12-31 00:00:00
print(dparser.parse("1/1/66")) # goes back to 1966-01-01 00:00:00

For my purposes I would like to set this "threshold" at 17, so that:

  • "12/31/16" parses to 2016-12-31 (yyyy-mm-dd)
  • "1/1/17" parses to 1917-01-01

But I would like to continue to use this module as its fuzzy match seems to be working well.

The documentation doesn't identify a parameter for doing this... is there an argument I'm overlooking?

Answer

This isn't particularly well documented, but you can override this behavior. The second argument to dateutil.parser.parse is a parserinfo object, and the method you'll be concerned with is convertyear. Its default implementation is what's causing your problem: it interprets the century relative to the current year, plus or minus fifty years. That's why you're seeing the transition at 1966. Next year it will be 1967. :)
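To make the "current year, plus or minus fifty years" rule concrete, here is a minimal standalone sketch of that sliding window. The function name sliding_window_year and its current_year parameter are made up for illustration; this paraphrases the behavior described above and is not dateutil's actual source.

```python
def sliding_window_year(two_digit_year, current_year):
    """Map a two-digit year into the window [current_year - 50, current_year + 49].

    Hypothetical helper for illustration only; it is not part of dateutil.
    """
    century = current_year // 100 * 100   # e.g. 2016 -> 2000
    year = two_digit_year + century       # start in the current century
    if year >= current_year + 50:         # too far in the future: go back a century
        year -= 100
    elif year < current_year - 50:        # too far in the past: go forward a century
        year += 100
    return year


# With 2016 as the current year, the switch happens between 65 and 66.
print(sliding_window_year(65, 2016))  # 2065
print(sliding_window_year(66, 2016))  # 1966
# One year later the switch moves up by one.
print(sliding_window_year(66, 2017))  # 2066
print(sliding_window_year(67, 2017))  # 1967
```

Because the window is tied to the current year, the cut-off drifts forward every year, which is why a fixed threshold needs a custom convertyear.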

Since you are using this personally and may have very specific needs, you don't have to be super-generic. You could do something as simple as this if it works for you:

from dateutil.parser import parse, parserinfo

class MyParserInfo(parserinfo):
    def convertyear(self, year, *args, **kwargs):
        if year < 100:
            year += 1900
        return year

parse('1/21/47', MyParserInfo())
# datetime.datetime(1947, 1, 21, 0, 0)
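Note that the snippet above sends every two-digit year to the 1900s, so "12/31/16" would come out as 1916. If you want the fixed threshold of 17 from the question, a variant along these lines might work. The class name PivotParserInfo and its pivot argument are invented for this sketch, and the century_specified parameter assumes a reasonably recent dateutil release.

```python
from dateutil.parser import parse, parserinfo


class PivotParserInfo(parserinfo):
    """Hypothetical parserinfo subclass with a fixed two-digit-year pivot."""

    def __init__(self, pivot=17, **kwargs):
        super().__init__(**kwargs)
        self.pivot = pivot  # two-digit years below this go to the 2000s

    def convertyear(self, year, century_specified=False):
        # Only touch genuine two-digit years; leave 4-digit years alone.
        if year < 100 and not century_specified:
            year += 2000 if year < self.pivot else 1900
        return year


info = PivotParserInfo(pivot=17)
print(parse("12/31/16", info))  # 2016-12-31 00:00:00
print(parse("1/1/17", info))    # 1917-01-01 00:00:00
print(parse("12/7/41", info))   # 1941-12-07 00:00:00
```

Fuzzy matching is unaffected, since only the year conversion is overridden; you can still pass fuzzy=True to parse as before.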
