How to sequence row based on another row?

2024/7/7 5:52:24

I am trying to convert a formula from excel to pandas.

The DataFrame looks like this:

Column A    Column B 
H  
H  
H  
J  
J  
J  
J  
K  
K  

I want to fill column B to increment while the value in column A remains the same. In the example above, this would be:

Column A     Column B
H            1
H            2
H            3
J            1
J            2
J            3
J            4
K            1
K            2

In excel, the formula would be =IF(A2<>A1,1,B1+1)

How can I apply this formula in pandas?

Answer

This can be done using the following vectorised method:

Code:

>>> df = pd.DataFrame({'A':['H', 'H', 'H', 'J', 'J', 'J', 'J', 'K', 'K']})
>>> df['B'] = df.groupby((df['A'].shift(1) != df['A']).cumsum()).cumcount() + 1

Output:

>>> dfA  B
0  H  1
1  H  2
2  H  3
3  J  1
4  J  2
5  J  3
6  J  4
7  K  1
8  K  2

Explanation:

First, we use df['A'].shift(1) != df['A'] to compare column A with column A shifted by 1. This yields:

>>> df['A'] != df['A'].shift(1)
0     True
1    False
2    False
3     True
4    False
5    False
6    False
7     True
8    False
Name: A, dtype: bool

Next, we use cumsum() to return the cumulative sum over that column. This gives us:

>>> (df['A'] != df['A'].shift(1)).cumsum()
0    1
1    1
2    1
3    2
4    2
5    2
6    2
7    3
8    3
Name: A, dtype: int32

Now, we can use GroupBy.cumcount() as usual to enumerate each item in ascending order, adding 1 to start the index at 1. Note that we can't just use

df.groupby('A').cumcount()

Because if, for example, we had:

>>> dfA
0  H
1  H
2  H
3  J
4  J
5  J
6  J
7  K
8  K
9  H

This would give us:

>>> df.groupby('A').cumcount() + 1
0    1
1    2
2    3
3    1
4    2
5    3
6    4
7    1
8    2
9    4
dtype: int64

Note that the final row is 4 and not 1 as expected.

https://en.xdnf.cn/q/120735.html

Related Q&A

Multiclassification task using keras [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.Want to improve this question? Update the question so it focuses on one problem only by editing this post.Closed 3…

Clarification needed regarding immutability of strings in Python [closed]

Closed. This question is seeking recommendations for software libraries, tutorials, tools, books, or other off-site resources. It does not meet Stack Overflow guidelines. It is not currently accepting …

Please help me in solving Fractional Knapsack problem (Maximum Value of the Loot)

Maximum Value of the LootProblem Introduction: A thief finds much more loot than his bag can fit. Help him to find the most valuable combination of items assuming that any fraction of a loot item can b…

Python countdown clock with GUI [duplicate]

This question already has an answer here:Making a countdown timer with Python and Tkinter?(1 answer)Closed 8 years ago.Im having problems with a countdown clock that I was making in Python for a Raspb…

Confidence calculation in association rule [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, argum…

How to write the names that start with A - L to one file and the rest to another?

Hello my assignment is :Create a system that allows the user to enter their name, title, surname, Dob, email and phone number. Once details are submitted, they should be written to a file. Surnames tha…

How is it possible to use a while loop to print even numbers 2 through 100?

I am a beginner and I am stuck on this problem, "Write a python code that uses a while loop to print even numbers from 2 through 100. Hint ConsecutiveEven differ by 2."Here is what I came up …

Issue with buttons not functioning after start of program

I am new and learning python 3.6 and Ive almost completed my first code project. After doing an exhaustive search to resolve my problem I have not been able to find the answer to what I am sure is a si…

explanation of C implementation pythons len function [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.Want to improve this question? Update the question so it focuses on one problem only by editing this post.Closed 5…

How to compare the attributes start with $ in 2 functions and display match or mismatch

My input file contain attributes if(match($OPTION_EnableDetails, "1") or match($OPTION_EnableDetails_juniper, "1")) {details($juniFileXferStatus,$juniFileXferTimeStamp,$juniFileXfer…