I am trying to convert a column in string format to DateTime format. However, I am getting the following error. Could somebody please help?
The error: time data '42:53.700' does not match format '%H:%M:%S.%f' (match)
Code:
Merge_df['Time'] = pd.to_datetime(Merge_df['Time'], format='%H:%M:%S.%f')
You'll need to clean the data into a common format before you can parse it to the 'datetime' data type. For example, you can remove the colons, pad with leading zeros, and then parse with the matching format directive:
import pandas as pd
df = pd.DataFrame({'time': ["1:45.333", "45:22.394", "4:55:23.444", "23:44:01.004"]})
df['time'] = pd.to_datetime(df['time'].str.replace(':', '').str.zfill(10), format="%H%M%S.%f")
df['time']
0 1900-01-01 00:01:45.333
1 1900-01-01 00:45:22.394
2 1900-01-01 04:55:23.444
3 1900-01-01 23:44:01.004
Name: time, dtype: datetime64[ns]
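If it's unclear which rows break the original format in the first place, one option is to parse with errors="coerce" and look at what turned into NaT. This is only a minimal sketch; the sample values are made up and stand in for the real column (the second one mimics the value from the error message):

```python
import pandas as pd

# hypothetical sample data, not taken from the asker's DataFrame
s = pd.Series(["12:42:53.700", "42:53.700", "1:45.333"])

# strings that do not match the format become NaT instead of raising
parsed = pd.to_datetime(s, format="%H:%M:%S.%f", errors="coerce")

# keep the original strings for the rows that failed to parse
bad_rows = s[parsed.isna()]
print(bad_rows.tolist())
```

Inspecting `bad_rows` should show which layouts (for example, values missing the hours field) need cleaning before parsing.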
Since the data looks more like a duration to me, here's how to convert it to the 'timedelta' data type instead. You'll need to ensure HH:MM:SS.fff format, which is a bit more work:
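# start again from the original strings (df['time'] was converted to datetime above)
df = pd.DataFrame({'time': ["1:45.333", "45:22.394", "4:55:23.444", "23:44:01.004"]})
# ensure common string length
df['time'] = df['time'].str.zfill(12)
# ensure HH:MM:SS.fff format
df['time'] = df['time'].str[:2] + ":" + df['time'].str[3:5] + ":" + df['time'].str[6:]
df['timedelta'] = pd.to_timedelta(df['time'])
df['timedelta']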
0 0 days 00:01:45.333000
1 0 days 00:45:22.394000
2 0 days 04:55:23.444000
3 0 days 23:44:01.004000
Name: timedelta, dtype: timedelta64[ns]
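The fixed-width slicing relies on every string being at most 12 characters with a 3-digit fraction. If that isn't guaranteed, an alternative is to split on the colons and pad each field explicitly. The helper name `to_hms` below is made up for illustration, and the sketch assumes each value has one or two colons:

```python
import pandas as pd

def to_hms(value: str) -> str:
    """Pad 'M:SS.fff' or 'H:MM:SS.fff' to 'HH:MM:SS.fff' (hypothetical helper)."""
    parts = value.split(":")
    # prepend zeros for missing leading fields (the hours, if absent)
    parts = ["0"] * (3 - len(parts)) + parts
    hours, minutes, seconds = parts
    # pad the whole-second part separately so the fraction is left untouched
    whole, dot, frac = seconds.partition(".")
    return f"{int(hours):02d}:{int(minutes):02d}:{int(whole):02d}{dot}{frac}"

raw = pd.Series(["1:45.333", "45:22.394", "4:55:23.444", "23:44:01.004"])
normalized = raw.map(to_hms)
durations = pd.to_timedelta(normalized)
```

This is slower than the vectorized string methods on large columns, but it doesn't depend on the string length.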
The advantage of using timedelta is that you can now also handle hours greater than 23.
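A quick way to see the difference, using a made-up value with an hour field above 23 (a sketch, not taken from the original data):

```python
import pandas as pd

# hypothetical value: 25 hours, 10 minutes, 0.5 seconds
s = pd.Series(["25:10:00.500"])

# %H only accepts 0-23, so this cannot be parsed as a time of day
as_datetime = pd.to_datetime(s, format="%H:%M:%S.%f", errors="coerce")

# as a duration, the extra hours roll over into days
as_timedelta = pd.to_timedelta(s)

print(as_datetime.isna().iloc[0])
print(as_timedelta.iloc[0])
```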