Pandas to parquet file

2024/10/8 19:49:12

I am trying to save a pandas DataFrame to a Parquet file with the following code:

from datetime import datetime

LABL = datetime.now().strftime("%Y%m%d_%H%M%S")
df.to_parquet("/data/TargetData_Raw_{}.parquet".format(LABL))

This gives me the error:

ArrowTypeError: ("Expected bytes, got a 'float' object", 'Conversion failed for column Pre-Rumour_Date with type object')

The pandas dtypes are as follows:

0
Announced_Date                  object
Completed_Date                  object
Pre-Rumour_Date                 object
Lapsed_Date                     object
Target_Company                  object
Bidder_Company                  object
Seller_Company                  object
Deal_Value_USD(_m)              object
Exit_Type                       object
Buy_Type                        object
Sell_Stake_(%)                  object
Buy_Stake_(%)                   object
Months_Held                     object
Private_Equity_House            object
ADATE                   datetime64[ns]
dtype: object
Answer

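Try: df.astype(str).to_parquet("/data/TargetData_Raw_{}.parquet".format(LABL))

The error suggests that Pre-Rumour_Date has dtype object but mixes strings with float values (most likely NaN for missing dates), which Arrow cannot store in a single string column. Casting with astype(str) makes every value text, so the write succeeds, but it also turns ADATE and any numeric values into strings, so they have to be parsed again after the file is read back.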
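If you would rather keep real dtypes than cast everything to text, one option is to find the object columns that mix Python types and convert only those. The following is a minimal sketch, not a definitive fix: the sample DataFrame and the helper name mixed_type_columns are made up for illustration, and it assumes the offending values are NaN mixed in with date strings, which the question does not confirm.

```python
import numpy as np
import pandas as pd

# Hypothetical sample that mimics the question: an object column holding
# date strings plus a float NaN for a missing date.
df = pd.DataFrame({
    "Pre-Rumour_Date": pd.Series(["2016-09-22", np.nan, "2016-09-27"], dtype=object),
    "Target_Company": pd.Series(["A", "B", "C"], dtype=object),
})


def mixed_type_columns(frame):
    """Return the names of object columns that hold more than one Python type."""
    mixed = []
    for col in frame.columns:
        if frame[col].dtype != object:
            continue  # only object columns can mix arbitrary Python types
        if len({type(value) for value in frame[col]}) > 1:
            mixed.append(col)
    return mixed


print(mixed_type_columns(df))

# Parse the date column explicitly; values that cannot be parsed
# (including the NaN) become NaT instead of raising.
dates = pd.to_datetime(df["Pre-Rumour_Date"], errors="coerce")
cleaned = df.assign(**{"Pre-Rumour_Date": dates})

# With no mixed-type object columns left, a call such as
# cleaned.to_parquet(path) is expected to work, provided a Parquet
# engine such as pyarrow is installed.
```

The same idea applies to the numeric-looking columns such as Deal_Value_USD(_m): pd.to_numeric with errors="coerce" converts them to floats, assuming the values really are numbers stored as text.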

https://en.xdnf.cn/q/70102.html
