Pandas loc dynamic conditional list

2024/10/10 6:20:07

I have a Pandas DataFrame and I want to find all rows where the value in the i-th column is more than 10 times the value in every other column of that row. Here is an example of my DataFrame:

Sample data

For example, looking at column i=0, row B: its value (0.344) is more than 10x greater than the values in the same row in the other columns (0.001, 0, 0.009, 0). So I would like:

my_list_0=[False,True,False,False,False,False,False,False,False,False,False]

The number of columns might change, so I don't want a solution like:

#This is good only for a DataFrame with 4 columns.
my_list_i = data.loc[(data.iloc[:,i]>10*data.iloc[:,(i+1)%num_cols]) &(data.iloc[:,i]>10*data.iloc[:,(i+2)%num_cols]) &(data.iloc[:,i]>10*data.iloc[:,(i+3)%num_cols])]

Any idea? thanks.

Answer

Given the df:

import pandas as pd

df = pd.DataFrame({
    'cell1': [0.006209, 0.344955, 0.004521, 0, 0.018931, 0.439725, 0.013195, 0.009045, 0, 0.02614, 0],
    'cell2': [0.048043, 0.001077, 0, 0.010393, 0.031546, 0.287264, 0.016732, 0.030291, 0.016236, 0.310639, 0],
    'cell3': [0, 0, 0.020238, 0, 0.03811, 0.579348, 0.005906, 0, 0, 0.068352, 0.030165],
    'cell4': [0.016139, 0.009359, 0, 0, 0.025449, 0.47779, 0, 0.01282, 0.005107, 0.004846, 0],
    'cell5': [0, 0, 0, 0.012075, 0.031668, 0.520258, 0, 0, 0, 2.728218, 0.013418],
})
i = 0

You can use

(10 * df.drop(df.columns[i], axis=1)).lt(df.iloc[:, i], axis=0).all(axis=1)

To get

0     False
1      True
2     False
3     False
4     False
5     False
6     False
7     False
8     False
9     False
10    False
dtype: bool

This works for any number of columns. It drops column i, multiplies the remaining columns by 10, and checks row-wise whether each product is less than the value in column i; the final step returns True only if every comparison in the row is True. The result is a boolean Series that is True for each row where the condition holds and False for the others.
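To get the per-column lists the question asks for, the same expression can be wrapped in a small helper and applied to each column position in turn. This is only a sketch: the names `dominant_rows`, `factor`, and `masks` are made up for illustration, and it assumes the column labels are unique (dropping by label would otherwise remove every column sharing that label).

```python
import pandas as pd


def dominant_rows(df: pd.DataFrame, i: int, factor: float = 10) -> pd.Series:
    """True where column i is greater than `factor` times every other column in the row."""
    # Hypothetical helper; assumes unique column labels.
    others = df.drop(columns=df.columns[i])
    return (factor * others).lt(df.iloc[:, i], axis=0).all(axis=1)


df = pd.DataFrame({
    'cell1': [0.006209, 0.344955, 0.004521, 0, 0.018931, 0.439725, 0.013195, 0.009045, 0, 0.02614, 0],
    'cell2': [0.048043, 0.001077, 0, 0.010393, 0.031546, 0.287264, 0.016732, 0.030291, 0.016236, 0.310639, 0],
    'cell3': [0, 0, 0.020238, 0, 0.03811, 0.579348, 0.005906, 0, 0, 0.068352, 0.030165],
    'cell4': [0.016139, 0.009359, 0, 0, 0.025449, 0.47779, 0, 0.01282, 0.005107, 0.004846, 0],
    'cell5': [0, 0, 0, 0.012075, 0.031668, 0.520258, 0, 0, 0, 2.728218, 0.013418],
})

# One boolean mask per column, keyed by column label.
masks = {col: dominant_rows(df, pos) for pos, col in enumerate(df.columns)}

# Plain Python list for column i=0, as requested in the question.
my_list_0 = masks['cell1'].tolist()

# The mask can also select the matching rows directly.
rows_0 = df.loc[masks['cell1']]
```

Because the helper takes a column position rather than a fixed set of comparisons, nothing in it depends on how many columns the DataFrame has.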

If you only need column i to dominate a given fraction of the other columns, you can sum the True values in each row, divide by the number of other columns (df.shape[1] - 1), and compare the result with your threshold:

thresh = 0.5  # or whatever you want
(10 * df.drop(df.columns[i], axis=1)).lt(df.iloc[:, i], axis=0).sum(axis=1) / (df.shape[1] - 1) > thresh

To get

0     False
1      True
2      True
3     False
4     False
5     False
6     False
7     False
8     False
9     False
10    False
dtype: bool
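Since the comparison yields a boolean frame with one column per "other" column, the sum-and-divide step is the same as taking the row-wise mean. A minimal sketch, using only the first three rows of the df above to keep it short; the names `beaten`, `fraction`, and `mask` are illustrative:

```python
import pandas as pd

# First three rows of the example df from the answer.
df = pd.DataFrame({
    'cell1': [0.006209, 0.344955, 0.004521],
    'cell2': [0.048043, 0.001077, 0],
    'cell3': [0, 0, 0.020238],
    'cell4': [0.016139, 0.009359, 0],
    'cell5': [0, 0, 0],
})
i = 0
thresh = 0.5

# True where 10x the other column is still below column i.
beaten = (10 * df.drop(columns=df.columns[i])).lt(df.iloc[:, i], axis=0)

# Mean of booleans = share of other columns that column i dominates.
fraction = beaten.mean(axis=1)
mask = fraction > thresh
```

Keeping `fraction` as a separate Series makes it easy to inspect how close each row is to the threshold. Note that the comparison is strict, so a row sitting exactly at the threshold is excluded.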
