Hi, I am trying to get the name of the DataFrame column that contains a specific word.
For example, I have this DataFrame:
NA             good   employee
Not available  best   employer
not required   well   manager
not eligible   super  reportee
my_word=["well"]
How can I check whether "well" exists in df, and get the name of the column that contains it?
Thanks in advance!
Use DataFrame.isin to test every cell against the list, then DataFrame.any to check for at least one True per column:
m = df.isin(my_word).any()
print (m)
0 False
1 True
2 False
dtype: bool
Then get the column names by filtering the index with the boolean mask:
cols = m.index[m].tolist()
print(cols)
[1]
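Put together, the two steps above can be sketched as one runnable script. The DataFrame construction is an assumption rebuilt from the sample in the question (default integer column labels, NaN for the "NA" cell):

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (assumed layout: three unnamed columns).
df = pd.DataFrame([
    [np.nan, "good", "employee"],
    ["Not available", "best", "employer"],
    ["not required", "well", "manager"],
    ["not eligible", "super", "reportee"],
])
my_word = ["well"]

# isin marks cells whose value equals an entry of my_word (exact match only);
# any() then reduces each column to a single boolean.
m = df.isin(my_word).any()

# Keep only the labels of the columns where the mask is True.
cols = m.index[m].tolist()
print(cols)
```

With named columns, the same code would return the column names instead of integer positions.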
Data:
print (df)
               0      1         2
0            NaN   good  employee
1  Not available   best  employer
2   not required   well   manager
3   not eligible  super  reportee
Detail:
print (df.isin(my_word))
       0      1      2
0  False  False  False
1  False  False  False
2  False   True  False
3  False  False  False
print (df.isin(my_word).any())
0 False
1 True
2 False
dtype: bool
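Note that isin only matches whole cell values, so a word inside a longer string such as "Not available" would not be found. If substring matching is what is needed, one possible sketch is to apply Series.str.contains to each column. The search term "not" and the variable names here are hypothetical, and the code assumes every column holds strings or NaN:

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (assumed layout).
df = pd.DataFrame([
    [np.nan, "good", "employee"],
    ["Not available", "best", "employer"],
    ["not required", "well", "manager"],
    ["not eligible", "super", "reportee"],
])
word = "not"  # hypothetical search term, not from the original question

# Check each column for the substring; na=False counts NaN as "no match",
# case=False ignores case, regex=False treats the word as literal text.
mask = df.apply(
    lambda col: col.str.contains(word, case=False, regex=False, na=False)
)

# Reduce per column and keep the labels of matching columns.
hit = mask.any()
sub_cols = hit.index[hit].tolist()
print(sub_cols)
```

This is a different technique from the isin approach in the answer, and it would raise an error on non-string columns, so those would need to be filtered out or converted first.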
EDIT: Converting the selected columns to a list gives nested lists, so flattening is necessary:
my_word=["well","manager"]
m = df.isin(my_word).any()
print (m)
0 False
1 True
2 True
dtype: bool
nested = df.loc[:,m].values.tolist()
flat_list = [item for sublist in nested for item in sublist]
print (flat_list)
['good', 'employee', 'best', 'employer', 'well', 'manager', 'super', 'reportee']
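As an alternative to the list comprehension, the flattening could also be done on the NumPy side. This is a minimal sketch, assuming the same sample data as above; ravel() walks the array row by row, so the order matches the output shown:

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (assumed layout).
df = pd.DataFrame([
    [np.nan, "good", "employee"],
    ["Not available", "best", "employer"],
    ["not required", "well", "manager"],
    ["not eligible", "super", "reportee"],
])
my_word = ["well", "manager"]

# Boolean mask of columns that contain at least one of the words.
m = df.isin(my_word).any()

# Select the matching columns, convert to a 2-D array, flatten row by row.
flat_list = df.loc[:, m].to_numpy().ravel().tolist()
print(flat_list)
```

Both versions return every value from the matching columns, not just the cells equal to the search words.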