Question 1

I have a dataframe df_in like so:

import pandas as pd
dic_in = {'A':['aa','aa','bb','cc','cc','cc','cc','dd','dd','dd','ee'],'B':['200','200','200','400','400','500','700','700','900','900','200'],'C':['da','cs','fr','fs','se','at','yu','j5','31','ds','sz']}
df_in = pd.DataFrame(dic_in)

I would like to investigate the 2 columns A and B in the following way. I 2 consecutive rows[['A','B']] are equal then they are assigned a new value (according to a specific rule which i am about to describe). I will give an example to be more clear: If the first row[['A','B']] is equal to the following one, then I set 1; if the second one is equal to the third one then I will set 1. Every time two consecutive rows are different, then I increase the value to set by 1.

The result should look like this:

     A    B   C  value
0   aa  200  da      1
1   aa  200  cs      1
2   bb  200  fr      2
3   cc  400  fs      3
4   cc  400  se      3
5   cc  500  at      4
6   cc  700  yu      5
7   dd  700  j5      6
8   dd  900  31      7
9   dd  900  ds      7
10  ee  200  sz      8

Can you suggest me a smart one to achieve this goal?

Question 2

Use shift and any to compare consecutive rows, using True to indicate where the value should change. Then take the cumulative sum with cumsum to get the increasing value:

df_in['value'] = (df_in[['A', 'B']] != df_in[['A', 'B']].shift()).any(axis=1)
df_in['value'] = df_in['value'].cumsum()

Alternatively, condensing it to one line:

df_in['value'] = (df_in[['A', 'B']] != df_in[['A', 'B']].shift()).any(axis=1).cumsum()

The resulting output:

     A    B   C  value
0   aa  200  da      1
1   aa  200  cs      1
2   bb  200  fr      2
3   cc  400  fs      3
4   cc  400  se      3
5   cc  500  at      4
6   cc  700  yu      5
7   dd  700  j5      6
8   dd  900  31      7
9   dd  900  ds      7
10  ee  200  sz      8

Compare 2 consecutive rows and assign increasing value if different (using Pandas)

Related Q&A

searching for k nearest points

Why does del (x) with parentheses around the variable name work?

How to concisely represent if/else to specify CSS classes in Django templates

LabelEncoder: How to keep a dictionary that shows original and converted variable

How to find hidden files inside image files (Jpg/Gif/Png) [closed]

How to open a simple image using streams in Pillow-Python

SyntaxError: Non-UTF-8 code starting with \x82 [duplicate]

How to identify the CPU core ID of a process on Python multiprocessing?

Finding highest values in each row in a data frame for python

Using pytest_addoptions in a non-root conftest.py