Error when plotting DataFrame containing NaN with Pandas 0.12.0 and Matplotlib 1.3.1 on Python 3.3.2

2024/7/27 9:24:53

First of all, this question is not the same as this one.

The problem I'm having is that when I try to plot a DataFrame which contains a numpy NaN in one cell, I get an error:

C:\>\Python33x86\python.exe
Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:03:43) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>>
>>> dates = pd.date_range('20131201', periods=5, freq='H')
>>> data = [[1, 2], [4, 5], [9, np.nan], [16, 17], [25, 26]]
>>> df = pd.DataFrame(data, index=dates,
...                       columns=list('AB'))
>>>
>>> print(df.to_string())
                      A   B
2013-12-01 00:00:00   1   2
2013-12-01 01:00:00   4   5
2013-12-01 02:00:00   9 NaN
2013-12-01 03:00:00  16  17
2013-12-01 04:00:00  25  26
>>> df.plot()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1636, in plot_frame
    plot_obj.generate()
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 856, in generate
    self._make_plot()
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1240, in _make_plot
    self._make_ts_plot(data, **self.kwds)
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1321, in _make_ts_plot
    _plot(data[col], i, ax, label, style, **kwds)
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1295, in _plot
    style=style, **kwds)
  File "C:\Python33x86\lib\site-packages\pandas\tseries\plotting.py", line 77, in tsplot
    lines = plotf(ax, *args, **kwargs)
  File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 4139, in plot
    for line in self._get_lines(*args, **kwargs):
  File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 319, in _grab_next_args
    for seg in self._plot_args(remaining, kwargs):
  File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 297, in _plot_args
    x, y = self._xy_from_xy(x, y)
  File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 216, in _xy_from_xy
    by = self.axes.yaxis.update_units(y)
  File "C:\Python33x86\lib\site-packages\matplotlib\axis.py", line 1337, in update_units
    converter = munits.registry.get_converter(data)
  File "C:\Python33x86\lib\site-packages\matplotlib\units.py", line 137, in get_converter
    xravel = x.ravel()
  File "C:\Python33x86\lib\site-packages\numpy\ma\core.py", line 3969, in ravel
    r._mask = ndarray.ravel(self._mask).reshape(r.shape)
  File "C:\Python33x86\lib\site-packages\pandas\core\series.py", line 981, in reshape
    return ndarray.reshape(self, newshape, order)
TypeError: an integer is required

The above code works if I replace the np.NaN with a number, such as "2.3".

Plotting as two separate Series does not work either (it fails when I add the Series containing the NaN to the plot):

C:\>\Python33x86\python.exe
Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:03:43) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>>
>>> dates = pd.date_range('20131201', periods=5, freq='H')
>>> data = [[1, 2], [4, 5], [9, np.nan], [16, 17], [25, 26]]
>>> df = pd.DataFrame(data, index=dates,
...                       columns=list('AB'))
>>>
>>> print(df.to_string())
                      A   B
2013-12-01 00:00:00   1   2
2013-12-01 01:00:00   4   5
2013-12-01 02:00:00   9 NaN
2013-12-01 03:00:00  16  17
2013-12-01 04:00:00  25  26
>>> df['A'].plot(label='This is A', style='k')
<matplotlib.axes.AxesSubplot object at 0x02ACFF90>
>>> df['B'].plot(label='This is B', style='g')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1730, in plot_series
    plot_obj.generate()
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 856, in generate
    self._make_plot()
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1240, in _make_plot
    self._make_ts_plot(data, **self.kwds)
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1311, in _make_ts_plot
    _plot(data, 0, ax, label, self.style, **kwds)
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1295, in _plot
    style=style, **kwds)
  File "C:\Python33x86\lib\site-packages\pandas\tseries\plotting.py", line 77, in tsplot
    lines = plotf(ax, *args, **kwargs)
  File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 4139, in plot
    for line in self._get_lines(*args, **kwargs):
  File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 319, in _grab_next_args
    for seg in self._plot_args(remaining, kwargs):
  File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 297, in _plot_args
    x, y = self._xy_from_xy(x, y)
  File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 216, in _xy_from_xy
    by = self.axes.yaxis.update_units(y)
  File "C:\Python33x86\lib\site-packages\matplotlib\axis.py", line 1337, in update_units
    converter = munits.registry.get_converter(data)
  File "C:\Python33x86\lib\site-packages\matplotlib\units.py", line 137, in get_converter
    xravel = x.ravel()
  File "C:\Python33x86\lib\site-packages\numpy\ma\core.py", line 3969, in ravel
    r._mask = ndarray.ravel(self._mask).reshape(r.shape)
  File "C:\Python33x86\lib\site-packages\pandas\core\series.py", line 981, in reshape
    return ndarray.reshape(self, newshape, order)
TypeError: an integer is required

However, if I do this directly with Matplotlib's Pyplot plot(), instead of using Pandas' plot() function, it works:

C:\>\Python33x86\python.exe
Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:03:43) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> dates = pd.date_range('20131201', periods=5, freq='H')
>>> plt.plot(dates, [1, 4, 9, 16, 25], 'k', dates, [2, 5, np.NAN, 17, 26], 'g')
[<matplotlib.lines.Line2D object at 0x03E98650>, <matplotlib.lines.Line2D object at 0x040929B0>]
>>> plt.show()
>>>

So it seems that I have a workaround, but as I plot large DataFrames, I would prefer to use Pandas' plot() method, which is more convenient. I've tried to follow the stack trace, but after a while it gets complicated (I'm not familiar with Pandas, Numpy and Matplotlib source code). Am I doing something wrong, or is this a possible bug in Pandas' plot()?
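
For larger DataFrames, the pyplot workaround could be wrapped in a small loop over the columns. The following is only a sketch: the helper name plot_columns is hypothetical, it assumes the index is a DatetimeIndex, it selects the Agg backend purely so that it runs without a display, and it has not been verified against the exact Pandas 0.12.0 / Matplotlib 1.3.1 combination from the question. The idea is to hand matplotlib plain arrays rather than Series objects, since the traceback ends in pandas' Series.reshape.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_columns(df, ax=None):
    """Plot every column of df as a line by calling matplotlib directly.

    Hypothetical helper: assumes df.index is a DatetimeIndex.
    """
    if ax is None:
        _, ax = plt.subplots()
    # Plain datetime objects, which matplotlib converts on its own
    x = df.index.to_pydatetime()
    for col in df.columns:
        # np.asarray drops the Series wrapper, so matplotlib sees a plain ndarray
        ax.plot(x, np.asarray(df[col], dtype=float), label=str(col))
    ax.legend(loc="best")
    return ax


# Same data as in the question; start/end/periods yields the same hourly index
dates = pd.date_range("2013-12-01 00:00", "2013-12-01 04:00", periods=5)
data = [[1, 2], [4, 5], [9, np.nan], [16, 17], [25, 26]]
df = pd.DataFrame(data, index=dates, columns=list("AB"))

ax = plot_columns(df)
```

The NaN in column B is passed through unchanged, so matplotlib simply leaves a gap in that line, as it does in the pyplot session above.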

Thank you for your help!

I tried both on Windows x86 and on Linux AMD64 with the same results with these versions:

  • Python 3.3.2
  • Pandas 0.12.0
  • Matplotlib 1.3.1
  • Numpy 1.7.1

Answer

This appears to be an integration bug between matplotlib 1.3.1 and pandas 0.12.

The workaround is to downgrade to matplotlib 1.3.0. Note, however, that matplotlib 1.3.0 has a bug on systems with fonts that have non-ASCII font names, so you may have to pick your problem. The downgrade will also trigger a downgrade to numpy 1.7.1, so you should then upgrade to numpy 1.8.0 again.

This error should be fixed in the upcoming Pandas 0.13. However, Pandas 0.13 may break some existing code (because pandas.Series is no longer a subclass of numpy.ndarray), so again, some hard choices may be required, at least in the short term.
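
As an illustration of the kind of code that the Series change can affect, here is a small sketch, not taken from the Pandas release notes. It shows one way to avoid depending on whether pandas.Series subclasses numpy.ndarray: convert explicitly with np.asarray before passing data to array-only code. The variable names are made up for the example.

```python
import numpy as np
import pandas as pd

s = pd.Series([2.0, 5.0, np.nan, 17.0, 26.0])

# The result of this check depends on the installed Pandas version
# (per the answer, it changes with Pandas 0.13), so code should not rely on it.
series_is_ndarray = isinstance(s, np.ndarray)

# An explicit conversion gives a plain ndarray regardless of that result.
values = np.asarray(s, dtype=float)
```

Code that calls ndarray methods directly on a Series is the part most likely to need a conversion like this.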

Just checked, code works fine with matplotlib 1.3.0:

>>> import matplotlib
>>> matplotlib.__version__
'1.3.0'
>>> df.plot()
<matplotlib.axes.AxesSubplot object at 0x04E8B4F0>
>>> plt.show()
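
The same kind of version check can be done in a script. The sketch below uses two hypothetical helpers, version_tuple and is_affected_combination. It treats only the pair named in this answer (pandas 0.12.x with matplotlib 1.3.1) as affected, which is an assumption: other version pairs were not tested here.

```python
def version_tuple(version):
    """Turn '1.3.1' into (1, 3, 1), ignoring a non-numeric suffix such as 'rc1'."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def is_affected_combination(pandas_version, matplotlib_version):
    """Return True only for the pair reported in this thread (an assumption)."""
    return (version_tuple(pandas_version)[:2] == (0, 12)
            and version_tuple(matplotlib_version) == (1, 3, 1))
```

In a real session the arguments would be pd.__version__ and matplotlib.__version__.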

