I think this must be a bug in pandas. With a pandas Series (versions 0.18.1 and 0.19 as well), if I assign a date to a new label, the first time it is stored as an int (wrong), and the second time it is stored as a datetime (correct). I cannot understand why.
For instance, with this code:
import datetime as dt
import pandas as pd
series = pd.Series(list('abc'))
date = dt.datetime(2016, 10, 30, 0, 0)
series["Date_column"] = date
print("The date is {} and the type is {}".format(series["Date_column"], type(series["Date_column"])))
series["Date_column"] = date
print("The date is {} and the type is {}".format(series["Date_column"], type(series["Date_column"])))
The output is:
The date is 1477785600000000000 and the type is <class 'int'>
The date is 2016-10-30 00:00:00 and the type is <class 'datetime.datetime'>
As you can see, the first time it always sets the value as int instead of datetime.
Could someone help me?
Thank you very much in advance,
Javi.
The reason for this is that series has the 'object' dtype, and each column of a pandas DataFrame (or a Series) has a single, homogeneous dtype. You can inspect this with Series.dtype (or DataFrame.dtypes):
series = pd.Series(list('abc'))
series
Out[3]:
0    a
1    b
2    c
dtype: object

In [15]: date = dt.datetime(2016, 10, 30, 0, 0)
date
Out[15]: datetime.datetime(2016, 10, 30, 0, 0)

In [18]: print(date)
2016-10-30 00:00:00

In [17]: type(date)
Out[17]: datetime.datetime

In [19]: series["Date_column"] = date

In [20]: series
Out[20]:
0                                a
1                                b
2                                c
Date_column    1477785600000000000
dtype: object

In [22]: series.dtype
Out[22]: dtype('O')
Only the generic 'object' dtype can hold any Python object (in your case, a datetime.datetime object inserted into the Series).
Moreover, pandas Series are backed by NumPy arrays, which are not designed for mixed types. Mixing types in one Series defeats the computational benefit of using pandas DataFrames and Series, or NumPy.
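To illustrate the difference between a homogeneous array and the generic 'object' dtype, here is a minimal sketch. It assumes a reasonably recent pandas and NumPy, and it builds the mixed Series in one step rather than enlarging it by assignment, so it does not reproduce the behaviour from the question. The names dates and mixed are only illustrative.

```python
import datetime as dt

import numpy as np
import pandas as pd

date = dt.datetime(2016, 10, 30, 0, 0)

# A homogeneous NumPy array: every element shares one datetime64 dtype.
dates = np.array([date, date], dtype="datetime64[ns]")
print(dates.dtype)

# A mixed Series built in one step, with the object dtype requested explicitly.
# Each element is kept as the original Python object.
mixed = pd.Series(
    ["a", "b", "c", date],
    index=[0, 1, 2, "Date_column"],
    dtype=object,
)
print(mixed.dtype)
print(type(mixed["Date_column"]))
```

The object Series works, but operations on it go through Python objects one by one, which is the performance cost mentioned above.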
Could you use a Python list instead, or a DataFrame?
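As a rough sketch of both alternatives (assuming a recent pandas; the column name letter and the variable items are made up for illustration): a DataFrame keeps the strings and the date in separate columns, each with its own dtype, while a plain list simply stores the objects as they are.

```python
import datetime as dt

import pandas as pd

date = dt.datetime(2016, 10, 30, 0, 0)

# DataFrame: the letters and the date live in separate columns,
# so the date column can have its own datetime64 dtype.
df = pd.DataFrame({"letter": list("abc")})
df["Date_column"] = date  # the scalar is broadcast to every row
print(df.dtypes)
print(df["Date_column"].iloc[0])

# Plain list: no dtype at all, the datetime object is stored unchanged.
items = list("abc") + [date]
print(type(items[-1]))
```

The DataFrame variant is probably the better fit if you want to do date arithmetic later, since the column is a proper datetime column rather than a collection of generic objects.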