I got a dataframe(named table) with 6 columns labeled as [price1,price2,price3,time,type,volume]
for type, I got 'Q' and 'T', arranged like:
Q
T
Q
T
T
Q
Now I want to combine the rows with consecutive T and add up the value of volume. The value of prices and time are the same for consecutive Ts
i.e. I want
Price...: Time: Type: Volume:
10000 2012.05 Q 10
10000 2012.05 T 20
10000 2012.05 Q 10
10000 2012.06 T 20
10000 2012.06 T 30
10000 2012.07 Q 10
to be:
10000 2012.05 Q 10
10000 2012.05 T 20
10000 2012.05 Q 10
10000 2012.06 T 20+30=50
10000 2012.07 Q 10
here is my code but does not return the desired result, so can someone please help me to figure out my mistake?
def combine(df):combined = [] # Init empty listlength = len(df.iloc[:,0]) # Get the number of rows in DataFramei = 0while i < length:num_elements = num_elements_equal(df, i, 0, 'T') # Get the number of consecutive 'T'sif num_elements <= 1: # If there are 1 or less T's, append only that element to combined, with the same typecombined.append([df.iloc[i,0],df.iloc[i,1],df.iloc[i,2],df.iloc[i,3],df.iloc[i,4],df.iloc[i,5]])else: # Otherwise, append the sum of all the elements to combined, with 'T' typecombined.append(['T', sum_elements(df, i, i+num_elements, 5)])i += max(num_elements, 1) # Increment i by the number of elements combined, with a min increment of 1return pd.DataFrame(combined, columns=df.columns) # Return as DataFramedef num_elements_equal(df, start, column, value): # Counts the number of consecutive elementsi = startnum = 0while i < len(df.iloc[:,column]):if df.iloc[i,column] == value:num += 1i += 1else:return numreturn numdef sum_elements(df, start, end, column): # Sums the elements from start to endreturn sum(df.iloc[start:end, column])tableT = combine(table)
tableT