I want to capture the field or column name that is filtered in the excel file when reading through python. I saw that we can also capture only the filtered rows by using openpyxl and using hidden == False (How to import filtered excel table into python?). In my project it is important to identify which field/column is filtered in the excel file. Is it possible? and how to achieve?
Adding an example.
pip install openpyxl
from openpyxl import load_workbookwb = load_workbook('test_filter_column.xlsx')ws = wb['data']
This is non hidden data
while if the gender column is filtered below
So what I am expecting is my output should be giving gender as that is filtered. If more than one field is filtered then expecting to provide all the filtered column names.
Inspired by this post, but adapted to your case, and also accounting for when there are multiple filters:
from openpyxl import load_workbook
from openpyxl.utils import get_column_letter# Load workbook
wb = load_workbook('/path/to/xlsx')
# Extract sheet
ws = wb['data']# Create a dict to store relevant info
filters = {}# Get the ID of the columns that are filtered
filters['col_id'] = [col.col_id for col in ws.auto_filter.filterColumn]# Get the letter of the columns that are filtered
# This appears to be one-indexed, but get_column_letter
# is zero indexed
filters['col_letter'] = [get_column_letter(col + 1) for col in filters['col_id']]# Extract the column name - assuming that it is
# given in row 1 of the column
filters['col_name'] = [ws[f'{col}1'].value for col in filters['col_letter']]# Get the values of the filters being used
filters['filter_values'] = [col.filters.filter for col in ws.auto_filter.filterColumn]print(filters)
Output:
{'col_id': [3], 'col_letter': ['D'], 'col_name': ['gender'], 'filter_values': [['F']]}
I think this covers the examples you gave, and hopefully shows some other info you might want too.