How do I group by, count, and then plot a bar chart in Pandas?

2024/10/5 23:27:59

I have a Pandas dataframe that looks like the following.

year  month  class
----  -----  -----
2015  1      1
2015  1      1
2015  1      2
2015  1      2
...

I want to create two bar chart series from this data on one plot. If I can do a groupby and count and end up with a DataFrame, I think I can just call dataframe.plot.barh.

What I have tried is the following code.

x = df.groupby(['year', 'month', 'class'])['class'].count()

What x ends up being is a Series. So then I do the following to get a DataFrame.

df = pd.DataFrame(x)

Which gets me pretty close. The data ends up looking like the following.

                  clazz
year month clazz
2015 1     1          2
     2     1         15
     2     2         45

But when I do a bar plot with df.plot.bar(), I only see one series. The desired output is one series showing how many times class 1 occurred per month from 2015-01 to 2019-12, and a second series showing how many times class 2 occurred per month over the same range.

Any ideas on how to manipulate the data to be in this way?

Answer

A groupby-unstack should do the trick:

Data

import pandas as pd

df = pd.DataFrame([[2015, 1, 1], [2015, 1, 1], [2015, 1, 2], [2015, 1, 2], [2015, 1, 2], [2015, 2, 1], [2015, 2, 1], [2015, 2, 1], [2015, 2, 2], [2015, 2, 2]], columns=['year', 'month', 'class'])

Solution

df_gb = df.groupby(['year', 'month', 'class']).size().unstack(level=2)
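To see why this gives two series, it may help to inspect the intermediate result. `size()` returns a Series indexed by (year, month, class), and `unstack` moves the class level into the columns, so each class becomes its own column and is drawn as its own series. A minimal sketch using the sample data above; the `fill_value=0` argument and the names `counts` and `wide` are additions for illustration, not part of the original answer:

```python
import pandas as pd

rows = [[2015, 1, 1], [2015, 1, 1], [2015, 1, 2], [2015, 1, 2], [2015, 1, 2],
        [2015, 2, 1], [2015, 2, 1], [2015, 2, 1], [2015, 2, 2], [2015, 2, 2]]
df = pd.DataFrame(rows, columns=['year', 'month', 'class'])

# One count per (year, month, class) combination; the result is a Series.
counts = df.groupby(['year', 'month', 'class']).size()

# Move the 'class' index level into columns: one column per class value.
# fill_value=0 (an assumption, not in the original answer) replaces the NaN
# that would appear when a class never occurs in a given month.
wide = counts.unstack(level='class', fill_value=0)

print(wide)
```

With one column per class, `plot` has two columns to draw instead of the single count column produced by `pd.DataFrame(x)`.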

Output

df_gb.plot(kind = 'bar')
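Over a range such as 2015-01 to 2019-12, the default (year, month) tuple labels on the x axis can be hard to read. One possible variation, which is not part of the original answer, is to flatten the index into strings like 2015-01 before plotting. In this sketch the helper name `plot_class_counts` is hypothetical, and the Agg backend is assumed so that the code runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless environment, no GUI window
import pandas as pd


def plot_class_counts(df):
    """Hypothetical helper: one bar series per class, one tick per year-month."""
    wide = (
        df.groupby(['year', 'month', 'class'])
          .size()
          .unstack(level='class', fill_value=0)
    )
    # Replace the (year, month) MultiIndex with flat labels such as '2015-01'.
    wide.index = [f"{int(year)}-{int(month):02d}" for year, month in wide.index]
    ax = wide.plot(kind='bar')
    ax.set_xlabel('year-month')
    ax.set_ylabel('count')
    return wide, ax


rows = [[2015, 1, 1], [2015, 1, 1], [2015, 1, 2], [2015, 1, 2], [2015, 1, 2],
        [2015, 2, 1], [2015, 2, 1], [2015, 2, 1], [2015, 2, 2], [2015, 2, 2]]
df = pd.DataFrame(rows, columns=['year', 'month', 'class'])

wide, ax = plot_class_counts(df)
```

Calling `ax.figure.savefig('class_counts.png')` afterwards would write the chart to disk; the file name is only an example.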

