How to count IDs that have a given combination of flags?

2024/11/20 11:34:12

I have a dataframe like the one below. I need to select and count all distinct users who have both the title "banner_click" and the title "order". I don't understand how to do this in pandas; in SQL you would use something like UniqExactIf. Here I need to find all users who had both of these titles.

     user         title
0  user_0  banner_click  # has both "banner_click" and "order"
1  user_0         order  #
2  user_1   banner_show
3  user_1         order
4  user_2         order  # also has both "banner_click" and "order"
5  user_2  banner_click  #

I have tried "in", but I guess it doesn't work correctly:

main = df.query("title in ('banner_click','order')").agg({'user':'nunique'})

Expected output: 2 (user_0 and user_2 match)

Reproducible example:

import pandas as pd
df = pd.DataFrame({'user': ['user_0', 'user_0', 'user_1', 'user_1', 'user_2', 'user_2'], 'title': ['banner_click', 'order', 'banner_show', 'order', 'order', 'banner_click']})
Answer

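Filtering with isin() alone keeps rows that match either title, so counting unique users on that result gives 3 here (user_1 has only "order"). After filtering, group by user and count only the users that have both distinct titles:

df.loc[df['title'].isin(['banner_click', 'order'])].groupby('user')['title'].nunique().eq(2).sum()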
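Another way to express "has both titles" is a set intersection. The following is only a minimal sketch built on the reproducible example from the question; the variable names clicked, ordered, both and count are illustrative and do not come from the original post.

```python
import pandas as pd

df = pd.DataFrame({
    'user': ['user_0', 'user_0', 'user_1', 'user_1', 'user_2', 'user_2'],
    'title': ['banner_click', 'order', 'banner_show', 'order', 'order', 'banner_click'],
})

# Users that have at least one row with each title (names are illustrative)
clicked = set(df.loc[df['title'] == 'banner_click', 'user'])
ordered = set(df.loc[df['title'] == 'order', 'user'])

# The intersection keeps only users that appear in both sets
both = clicked & ordered
count = len(both)
print(sorted(both), count)
```

This trades a groupby for two boolean filters, which can be easier to read when there are only two titles to check.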

