polars dataframe TypeError: must be real number, not str

2024/9/22 10:40:01

So basically I changed the pandas DataFrame to a Polars DataFrame for better speed in YOLOv5, but when I run the code, it works fine up to some point (I don't know exactly when the error occurs) and then it gives me TypeError: must be real number, not str. Running it with pandas works without any errors; the error only happens with Polars. I know it must be using the wrong data type somewhere, but I don't really know where I should look, since I've just started with Python. I would really appreciate it if someone could help me with this! Thanks for reading and have a nice day!

Traceback (most recent call last):
  File "C:\yolov5\test.py", line 61, in <module>
    boxes = results.polars().xywh[0]
  File "c:\yolov5\.\models\common.py", line 684, in polars
    setattr(new, k, [pl.DataFrame(x, columns=c) for x in a])
  File "c:\yolov5\.\models\common.py", line 684, in <listcomp>
    setattr(new, k, [pl.DataFrame(x, columns=c) for x in a])
  File "C:\Users\jojow\AppData\Local\Programs\Python\Python39\lib\site-packages\polars\internals\frame.py", line 311, in __init__
    self._df = sequence_to_pydf(data, columns=columns, orient=orient)
  File packages\polars\internals\construction.py", line 495, in
    data_series = [
  File "C:\Users\jojow\AppData\Local\Programs\Python\Python39\lib\site-packages\polars\internals\construction.py", line 496, in <listcomp>
    pli.Series(columns[i], data[i], dtypes.get(columns[i])).inner()
  File "C:\Users\jojow\AppData\Local\Programs\Python\Python39\lib\site-packages\polars\internals\series.py", line 227, in __init__
    self._s = sequence_to_pyseries(name, values, dtype=dtype, strict=strict)
  File "C:\Users\jojow\AppData\Local\Programs\Python\Python39\lib\site-packages\polars\internals\construction.py", line 239, in
    return constructor(name, values, strict)
TypeError: must be real number, not str

Here's my code (edited):

import polars as pl
import pandas as pd

class new:
    xyxy = 0

a = [[[370.01605224609375, 346.4305114746094, 398.3968811035156, 384.5684814453125, 0.9011853933334351, 0, 'corn'],
      [415.436767578125, 279.4227294921875, 433.930419921875, 305.5151672363281, 0.8829901814460754, 0, 'corn'],
      [383.8118896484375, 268.781494140625, 402.35479736328125, 292.4585266113281, 0.8579609394073486, 0, 'corn'],
      [431.42791748046875, 570.9154663085938, 476.672119140625, 600.0, 0.810459554195404, 0, 'corn'],
      [414.912841796875, 257.7676086425781, 427.7708740234375, 274.69635009765625, 0.7384995818138123, 0, 'corn'],
      [391.22821044921875, 250.48876953125, 403.9199523925781, 268.1374816894531, 0.6828912496566772, 0, 'corn'],
      [414.2362060546875, 250.18174743652344, 423.82537841796875, 264.02667236328125, 0.517136812210083, 0, 'corn']]]

ca = 'xmin', 'ymin', 'xmax', 'ymax', 'confidence', 'class', 'name'  # xyxy columns
cb = 'xcenter', 'ycenter', 'width', 'height', 'confidence', 'class', 'name'  # xywh columns

for k, c in zip(['xyxy', 'xyxyn', 'xywh', 'xywhn'], [ca, ca, cb, cb]):
    setattr(new, k, [pl.DataFrame(x, columns=c) for x in a])

print(new.xyxy[0])

From the information you provided, I can only provide a hint as to where to look.

Near the end of your code, you are creating a list of new DataFrames:

setattr(new, k, [polars.DataFrame(x, columns=c) for x in a])

And the error is caused by this call:

polars.DataFrame(x, columns=c)

What is happening is that one of the lists being passed in (x) to these DataFrames contains a mix of numbers and strings. More specifically, that list starts with one or more numbers but contains a string somewhere after that. This causes an error when Polars tries to build a column of numbers from the list.
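As a quick way to see this on a single list, here is a minimal sketch in plain Python (no Polars needed). The helper name first_non_number is made up for illustration, and the sample list is the first inner list from the code in the question.

```python
from numbers import Real


def first_non_number(values):
    """Return (index, value) of the first element that is not a real number.

    Returns None when every element is a real number.
    Hypothetical helper for illustration; not part of Polars.
    """
    for index, value in enumerate(values):
        if not isinstance(value, Real):
            return index, value
    return None


# First inner list from the question: numbers followed by a string.
row = [370.01605224609375, 346.4305114746094, 398.3968811035156,
       384.5684814453125, 0.9011853933334351, 0, 'corn']

print(first_non_number(row))              # (6, 'corn')
print(first_non_number([1.0, 2.0, 3.0]))  # None
```

A list like row begins with numbers, so a numeric column is a natural guess for it, and the trailing 'corn' is the kind of value the error message complains about.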

An Example

Let's take a closer look. Here is an example of creating a DataFrame:

import polars as pl
pl.DataFrame([["one", "two", "three"], [1.0, 2.0, 3.0]], columns=["col1", "col2"])

Notice that ["one", "two", "three"] are all strings. And [1.0, 2.0, 3.0] are all numbers. So, in each column, we have data of only one type. And we get no errors...

shape: (3, 2)
┌───────┬──────┐
│ col1  ┆ col2 │
│ ---   ┆ ---  │
│ str   ┆ f64  │
╞═══════╪══════╡
│ one   ┆ 1.0  │
│ two   ┆ 2.0  │
│ three ┆ 3.0  │
└───────┴──────┘

Now let's see what happens when we accidentally mix a string in with the column of numbers:

pl.DataFrame([["one", "two", "three"], [1.0, 2.0, "Oops, this is a string mixed in with numbers"]], columns=["col1", "col2"])

We get an error...

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/xxxx/.virtualenvs/PolarsTesting3.10/lib/python3.10/site-packages/polars/internals/frame.py", line 311, in __init__
    self._df = sequence_to_pydf(data, columns=columns, orient=orient)
  File "/home/xxxx/.virtualenvs/PolarsTesting3.10/lib/python3.10/site-packages/polars/internals/construction.py", line 495, in sequence_to_pydf
    data_series = [
  File "/home/xxxx/.virtualenvs/PolarsTesting3.10/lib/python3.10/site-packages/polars/internals/construction.py", line 496, in <listcomp>
    pli.Series(columns[i], data[i], dtypes.get(columns[i])).inner()
  File "/home/xxxx/.virtualenvs/PolarsTesting3.10/lib/python3.10/site-packages/polars/internals/series.py", line 227, in __init__
    self._s = sequence_to_pyseries(name, values, dtype=dtype, strict=strict)
  File "/home/xxxx/.virtualenvs/PolarsTesting3.10/lib/python3.10/site-packages/polars/internals/construction.py", line 239, in sequence_to_pyseries
    return constructor(name, values, strict)
TypeError: must be real number, not str

Compare this error message with the one you received. They match closely (except for the directories, which are specific to each computer).

So, you need to look for a list that starts with one or more numbers, but contains a string. Polars tries to make a column of numbers with this list, and throws an error.

Perhaps one or more elements in a list that is meant to be numbers contain a string such as "Error" or "NULL" or "#N/A" or something similar.

You'll have to debug this to find out.
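One place to start, given the code in the question: each inner list there looks like one detection (a row) that ends with the name 'corn', while the examples above pass one list per column. The sketch below is plain Python and only rearranges the data so you can check the types column by column; the helper names rows_to_columns and mixed_columns are made up for illustration, and the two rows are copied from the question. The traceback also shows an orient argument being passed along inside Polars, so the constructor in your version may let you declare that the input is a list of rows. I have not verified that for your version, so check the documentation for it.

```python
def rows_to_columns(rows, names):
    """Turn a list of rows into a dict mapping column name -> list of values."""
    # zip(*rows) groups the i-th element of every row together.
    return {name: list(values) for name, values in zip(names, zip(*rows))}


def mixed_columns(columns):
    """Return the keys of columns whose values are not all of one Python type."""
    return [key for key, values in columns.items()
            if len({type(v) for v in values}) > 1]


# First two rows from the question.
rows = [
    [370.01605224609375, 346.4305114746094, 398.3968811035156,
     384.5684814453125, 0.9011853933334351, 0, 'corn'],
    [415.436767578125, 279.4227294921875, 433.930419921875,
     305.5151672363281, 0.8829901814460754, 0, 'corn'],
]
names = ('xmin', 'ymin', 'xmax', 'ymax', 'confidence', 'class', 'name')

# Treating each row as if it were a column: every list mixes types.
as_given = dict(enumerate(rows))
print(mixed_columns(as_given))   # [0, 1]

# After rearranging into real columns: each column holds a single type.
columns = rows_to_columns(rows, names)
print(mixed_columns(columns))    # []
print(columns['name'])           # ['corn', 'corn']
```

If mixed_columns still reports a column after the data is arranged by column, that column is where a stray string such as "NULL" would be hiding.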


