How to sum all amounts by date in pandas dataframe?

2024/10/4 13:21:33

I have dataframe with fields last_payout and amount. I need to sum all amount for each month and plot the output.

df[['last_payout','amount']].dtypeslast_payout    datetime64[ns]
amount           float64
dtype: object

-

df[['last_payout','amount']].head<bound method NDFrame.head of                last_payout  amount
0      2017-02-14 11:00:06          23401.0
1      2017-02-14 11:00:06          1444.0
2      2017-02-14 11:00:06          0.0
3      2017-02-14 11:00:06          0.0
4      2017-02-14 11:00:06          290083.0

I used the code from jezrael's answer to plot the number of transactions per month.

(df.loc[df['last_payout'].dt.year.between(2016, 2017), 'last_payout'].dt.to_period('M').value_counts().sort_index().plot(kind="bar")
)

Number of transactions per month:

Number of transactions per month

How do I sum all amount for each month and plot the output? How should I extend the code above for doing this?

I tried to implement .sum but didn't succeed.

Answer

PeriodIndex solution:

groupby by month period by to_period and aggregate sum:

df['amount'].groupby(df['last_payout'].dt.to_period('M')).sum().plot(kind='bar')

DatetimeIndex solutions:

Use resample by months (M) or starts of months (MS) with aggregate sum:

s = df.resample('M', on='last_payout')['amount'].sum()
#alternative
#s = df.groupby(pd.Grouper(freq='M', key='last_payout'))['amount'].sum()
print (s)
last_payout
2017-02-28     23401.0
2017-03-31      1444.0
2017-04-30    290083.0
Freq: M, Name: amount, dtype: float64

Or:

s = df.resample('MS', on='last_payout')['amount'].sum()
#s = df.groupby(pd.Grouper(freq='MS', key='last_payout'))['amount'].sum()
print (s)
last_payout
2017-02-01     23401.0
2017-03-01      1444.0
2017-04-01    290083.0
Freq: MS, Name: amount, dtype: float64

Then is necessary format x labels:

ax = s.plot(kind='bar')
ax.set_xticklabels(s.index.strftime('%Y-%m'))

graph

Setup:

import pandas as pdtemp=u"""last_payout,amount
2017-02-14 11:00:06,23401.0
2017-03-14 11:00:06,1444.0
2017-03-14 11:00:06,0.0
2017-04-14 11:00:06,0.0
2017-04-14 11:00:06,290083.0"""
#after testing replace 'pd.compat.StringIO(temp)' to 'filename.csv'
df = pd.read_csv(pd.compat.StringIO(temp), parse_dates=[0])
print (df)last_payout    amount
0 2017-02-14 11:00:06   23401.0
1 2017-03-14 11:00:06    1444.0
2 2017-03-14 11:00:06       0.0
3 2017-04-14 11:00:06       0.0
4 2017-04-14 11:00:06  290083.0
https://en.xdnf.cn/q/70602.html

Related Q&A

Unable to import decimal in Python 2.7 or Python 3.3 [duplicate]

This question already has answers here:Importing a library from (or near) a script with the same name raises "AttributeError: module has no attribute" or an ImportError or NameError(4 answers…

I Get ImportError: No module named pathlib, even after installing pathlib with pip

This is my first time asking on this site, so sorry if my question is not layed out correctlyy@DESKTOP-MQJ3NCT:~/Real-Time-Voice-Cloning$ python demo_toolbox.py Traceback (most recent call last):File &…

Python regex separate space-delimited words into a list

If I have a string = "hello world sample text"I want to be able to convert it to a list = ["hello", "world", "sample", "text"]How can I do that with re…

Naive install of PySpark to also support S3 access

I would like to read Parquet data stored on S3 from PySpark.Ive downloaded spark from here:http://www.apache.org/dist/spark/spark-2.1.0/spark-2.1.0-bin-hadoop2.7.tgzAnd installed it to Python naivelycd…

Multiprocessing Pool hangs if child process killed

I launched a pool of worker processes and submitted a bunch of tasks. The system ran low on memory and the oomkiller killed one of the worker processes. The parent process just hung there waiting for t…

What does sys.maxunicode mean?

CPython stores unicode strings as either utf-16 or utf-32 internally depending on compile options. In utf-16 builds of Python string slicing, iteration, and len seem to work on code units, not code po…

How to detect dialogs close event?

Hi everyone.I am making a GUI application using python3.4, PyQt5 in windows 7. Application is very sample. User clicks a main windows button, information dialog pops up. And when a user clicks informat…

How to Make a Portable Jupyter Slideshow

How do I make a Jupyter slide show portable? I can serve the slideshow locally, but I cant send that to anyone and have it work with all the images, slide animation functionality, etc. I am using jupy…

How to animate a bar char being updated in Python

I want to create an animated, stacked bar chart.There is a great tutorial, which shows how to animate line graphs.However, for animating bar charts, the BarContainer object, does not have any attribute…

Add text to end of line without loading file

I need to store information into a very big file, in form of many dictionaries. Thats not so important, is just to say that I tried to first get all the data into these dictionaries and I run out of me…