Why does Pandas coerce my numpy float32 to float64?

2024/10/11 16:22:56

Why does Pandas coerce my numpy float32 to float64 in this piece of code:

>>> import pandas as pd
>>> import numpy as np
>>> df = pd.DataFrame([[1, 2, 'a'], [3, 4, 'b']], dtype=np.float32)
>>> A = df.ix[:, 0:1].values
>>> df.ix[:, 0:1] = A
>>> df[0].dtype
dtype('float64')

The behavior seems so odd to me that wonder if it is a bug. I am on Pandas version 0.17.1 (updated PyPI version) and I note there has been coercing bugs recently addressed, see https://github.com/pydata/pandas/issues/11847 . I haven't tried the piece of code with an updated GitHub master.

Is it a bug or do I misunderstand some "feature" in Pandas? If it is a feature, then how do I get around it?

(The coercing problem relates to a question I recently asked about the performance of Pandas assignments: Assignment of Pandas DataFrame with float32 and float64 slow)

Answer

I think it is worth posting this as a GitHub issue. The behavior is certainly inconsistent.

The code takes a different branch based on whether the DataFrame is mixed-type or not (source).

  • In the mixed-type case the ndarray is converted to a Python list of float64 numbers and then converted back into float64 ndarray disregarding the DataFrame's dtypes information (function maybe_convert_objects()).

  • In the non-mixed-type case the DataFrame content is updated pretty much directly (source) and the DataFrame keeps its float32 dtypes.

https://en.xdnf.cn/q/69752.html

Related Q&A

Conda and Python Modules

Sadly, I do not understand how to install random python modules for use within iPython Notebooks with my Anaconda distribution. The issue is compounded by the fact that I need to be able to do these t…

WeakValueDictionary retaining reference to object with no more strong references

>>> from weakref import WeakValueDictionary >>> class Foo(object): ... pass >>> foo = Foo() >>> db = WeakValueDictionary() >>> db[foo-id] = foo >>…

Using pretrained glove word embedding with scikit-learn

I have used keras to use pre-trained word embeddings but I am not quite sure how to do it on scikit-learn model.I need to do this in sklearn as well because I am using vecstack to ensemble both keras s…

Is there an easy way to tell how much time is spent waiting for the Python GIL?

I have a long-running Python service and Id like to know how much cumulative wall clock time has been spent by any runnable threads (i.e., threads that werent blocked for some other reason) waiting for…

Inverse filtering using Python

Given an impulse response h and output y (both one-dimensional arrays), Im trying to find a way to compute the inverse filter x such that h * x = y, where * denotes the convolution product.For example,…

Quadruple Precision Eigenvalues, Eigenvectors and Matrix Logarithms

I am attempting to diagonalize matrices in quadruple precision, and to take their logarithms. Is there a language in which I can accomplish this using built-in functions?Note, the languages/packages i…

How to use pyinstaller with pipenv / pyenv

I am trying to ship an executable from my python script which lives inside a virtual environment using pipenv which again relies on pyenv for python versioning. For that, I want to us pyinstaller. Wha…

Sending DHCP Discover using python scapy

I am new to python and learning some network programming, I wish to send an DHCP Packet through my tap interface to my DHCP server and expecting some response from it. I tried with several packet build…

cnf argument for tkinter widgets

So, Im digging through the code here and in every class (almost) I see an argument cnf={} to the constructor, but unless Ive missed it, it is not explicitly stated what cnf is / expected to contain. Ca…

python exceptions.UnicodeDecodeError: ascii codec cant decode byte 0xa7 in

I am using scrapy with python and I have this code in a python item piplinedef process_item(self, item, spider):import pdb; pdb.set_trace()ID = str(uuid.uuid5(uuid.NAMESPACE_DNS, item[link]))I got this…