How to compute the time difference within each group in pandas (Python)

2024/10/13 7:25:11

Here's some sample data:

df = pd.DataFrame({'email': ['u1','u1','u1','u2','u2','u2'],'timestamp': [3, 1, 5, 11, 15, 9]})

I want the time difference between consecutive rows within each email group. After sorting by timestamp in descending order within each group, the data should be:

u1  5
u1  3
u1  1
u2  15
u2  11
u2  9

The result should be:

u1  2  # 5-3
u1  2  # 3-1
u2  4  # 15-11
u2  2  # 11-9

Could anyone tell me what I should do next? Many thanks.

Answer
df = pd.DataFrame({'email': ['u1','u1','u1','u2','u2','u2'],'timestamp': [3, 1, 5, 11, 15, 9]})
df.sort_values(['email', 'timestamp'], ascending=[True, False]).groupby('email')['timestamp'].diff(-1).dropna()
Out: 
2    2.0
0    2.0
4    4.0
3    2.0
Name: timestamp, dtype: float64
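
To see why this works: diff(-1) subtracts the next row from the current row inside each group, so with a descending sort every value is positive and the last row of each group becomes NaN (which dropna() then removes). Here is a minimal sketch, assuming the same sample data, that compares it with the equivalent shift(-1) form. The names ordered, grouped, via_diff and via_shift are just illustrative:

```python
import pandas as pd

df = pd.DataFrame({'email': ['u1', 'u1', 'u1', 'u2', 'u2', 'u2'],
                   'timestamp': [3, 1, 5, 11, 15, 9]})

# Sort by email ascending, then timestamp descending within each email.
ordered = df.sort_values(['email', 'timestamp'], ascending=[True, False])
grouped = ordered.groupby('email')['timestamp']

# diff(-1): current row minus the following row, computed per group.
via_diff = grouped.diff(-1)

# The same thing written out explicitly with shift(-1).
via_shift = ordered['timestamp'] - grouped.shift(-1)

print(via_diff.dropna())
print(via_shift.dropna())
```

Both series should hold the same values; the shift form is only there to make the subtraction visible.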

To keep the email column:

df.sort_values(['email', 'timestamp'], ascending=[True, False], inplace=True)
df.assign(diff=df.groupby('email')['timestamp'].diff(-1)).dropna()
Out:
  email  timestamp  diff
2    u1          5   2.0
0    u1          3   2.0
4    u2         15   4.0
3    u2         11   2.0

If you don't want the timestamp column you can add .drop('timestamp', axis=1) at the end.
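
Putting the pieces together, this is one possible end-to-end sketch, assuming the sample data from the question. It sorts into a new variable instead of using inplace=True, so the original df is left untouched; the names ordered and result are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({'email': ['u1', 'u1', 'u1', 'u2', 'u2', 'u2'],
                   'timestamp': [3, 1, 5, 11, 15, 9]})

# Sort a copy: email ascending, timestamp descending within each email.
ordered = df.sort_values(['email', 'timestamp'], ascending=[True, False])

result = (
    ordered
    # Per-group difference between each row and the one after it.
    .assign(diff=ordered.groupby('email')['timestamp'].diff(-1))
    # The last row of each group has no following row, so its diff is NaN.
    .dropna()
    # Keep only the email and diff columns.
    .drop('timestamp', axis=1)
)

print(result)
```

If a clean 0..n-1 index is preferred over the original row labels, .reset_index(drop=True) can be chained at the end.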
