Combine rows and add up values in a DataFrame

2024/11/15 15:33:43

I have a DataFrame (named table) with 6 columns labeled [price1, price2, price3, time, type, volume].

The type column contains 'Q' and 'T', arranged like:

Q

T

Q

T

T

Q

Now I want to combine the rows with consecutive Ts and add up their volume. The prices and time are the same for consecutive Ts.

i.e. I want

Price...  Time     Type  Volume
10000     2012.05  Q     10
10000     2012.05  T     20
10000     2012.05  Q     10
10000     2012.06  T     20
10000     2012.06  T     30
10000     2012.07  Q     10

to be:

Price...  Time     Type  Volume
10000     2012.05  Q     10
10000     2012.05  T     20
10000     2012.05  Q     10
10000     2012.06  T     20+30=50
10000     2012.07  Q     10

Here is my code, but it does not return the desired result. Can someone please help me figure out my mistake?

    def combine(df):
        combined = []  # Init empty list
        length = len(df.iloc[:, 0])  # Get the number of rows in DataFrame
        i = 0
        while i < length:
            num_elements = num_elements_equal(df, i, 0, 'T')  # Get the number of consecutive 'T's
            if num_elements <= 1:
                # If there are 1 or less T's, append only that element to combined, with the same type
                combined.append([df.iloc[i, 0], df.iloc[i, 1], df.iloc[i, 2],
                                 df.iloc[i, 3], df.iloc[i, 4], df.iloc[i, 5]])
            else:
                # Otherwise, append the sum of all the elements to combined, with 'T' type
                combined.append(['T', sum_elements(df, i, i + num_elements, 5)])
            i += max(num_elements, 1)  # Increment i by the number of elements combined, with a min increment of 1
        return pd.DataFrame(combined, columns=df.columns)  # Return as DataFrame

    def num_elements_equal(df, start, column, value):  # Counts the number of consecutive elements
        i = start
        num = 0
        while i < len(df.iloc[:, column]):
            if df.iloc[i, column] == value:
                num += 1
                i += 1
            else:
                return num
        return num

    def sum_elements(df, start, end, column):  # Sums the elements from start to end
        return sum(df.iloc[start:end, column])

    tableT = combine(table)
    tableT


Answer

IIUC:

Input dataframe, df:

   Price     Time Type  Volume
0  10000  2012.05    Q      10
1  10000  2012.05    T      20
2  10000  2012.05    Q      10
3  10000  2012.06    T      20
4  10000  2012.06    T      30
5  10000  2012.07    Q      10
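
To reproduce this, the input frame can be rebuilt roughly as follows. The values are copied from the table above; the dtypes (integer Price and Volume, float Time) are an assumption, since the question does not state them.

```python
import pandas as pd

# Sample rows copied from the printed input frame; dtypes are assumed.
df = pd.DataFrame(
    {
        "Price": [10000, 10000, 10000, 10000, 10000, 10000],
        "Time": [2012.05, 2012.05, 2012.05, 2012.06, 2012.06, 2012.07],
        "Type": ["Q", "T", "Q", "T", "T", "Q"],
        "Volume": [10, 20, 10, 20, 30, 10],
    }
)

print(df)
```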

Combine T records and sum volume:

    df.groupby(by=[df.Type.ne('T').cumsum(),'Price','Time','Type'], as_index=False)['Volume'].sum()

Output:

   Price     Time Type  Volume
0  10000  2012.05    Q      10
1  10000  2012.05    T      20
2  10000  2012.05    Q      10
3  10000  2012.06    T      50
4  10000  2012.07    Q      10
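
A sketch of why the grouping key works, and one way it might be wrapped for a table with more columns. `df.Type.ne('T').cumsum()` increments on every non-'T' row, so each Q row opens a new block and any Ts directly after it share that block number. The helper name `combine_consecutive_t` and the temporary level name `block` are hypothetical, and the sketch assumes the frame has no column already called `block` and no missing values in the grouped columns.

```python
import pandas as pd

# Sample rows copied from the input frame shown in the answer.
df = pd.DataFrame(
    {
        "Price": [10000] * 6,
        "Time": [2012.05, 2012.05, 2012.05, 2012.06, 2012.06, 2012.07],
        "Type": ["Q", "T", "Q", "T", "T", "Q"],
        "Volume": [10, 20, 10, 20, 30, 10],
    }
)

# Every non-'T' row bumps the counter; 'T' rows keep the current value.
block = df["Type"].ne("T").cumsum().rename("block")  # "block" is a hypothetical name
print(block.tolist())  # [1, 1, 2, 2, 2, 3]


def combine_consecutive_t(frame, type_col="Type", value_col="Volume"):
    """Sum value_col over runs of consecutive 'T' rows (hypothetical helper)."""
    key = frame[type_col].ne("T").cumsum().rename("block")
    # Group on the block number plus every column except the one being summed,
    # so rows merge only when their remaining columns agree as well.
    other_cols = [c for c in frame.columns if c != value_col]
    return (
        frame.groupby([key] + other_cols, sort=False)[value_col]
        .sum()
        .reset_index()
        .drop(columns="block")
    )


result = combine_consecutive_t(df)
print(result)
```

For the six-column table in the question, something like `combine_consecutive_t(table, type_col="type", value_col="volume")` should apply, assuming the column names are as listed there. As for the loop in the question, it appears to test column 0 (a price) against 'T' instead of column 4 (type), so no run of Ts is ever detected and the table comes back unchanged; the `else` branch would also append a two-element row to a six-column frame.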