Beautiful Soup find elements having hidden style

2024/10/10 2:22:57

My need is simple: how do I find elements that are currently not visible on the webpage? I am guessing that style="visibility:hidden" and style="display:none" are the simple ways to hide an element, but BeautifulSoup doesn't know whether an element is hidden or not.

For example, HTML is:

Textbox_Invisible1: <input id="tbi1" type="text" style="visibility:hidden">
Textbox_Invisible2: <input id="tbi2" type="text" class="hidden_elements">
Textbox1: <input id="tb1" type="text">

So my first concern is that BeautifulSoup cannot find out if any of the above textboxes are hidden:

# Python 2.7, assuming Beautiful Soup 4 (the bs4 package)
>>> from bs4 import BeautifulSoup
>>> source = """Textbox_Invisible1: <input id="tbi1" type="text" style="visibility:hidden">
...  Textbox_Invisible2: <input id="tbi2" type="text" class="hidden_elements">
...  Textbox1: <input id="tb1" type="text">"""
>>> soup1 = BeautifulSoup(source)
>>> soup1.find(id='tb1').hidden
False
>>> soup1.find(id='tbi1').hidden
False
>>> soup1.find(id='tbi2').hidden
False
>>> 

My only question is: is there a way to find out which elements are hidden? (We also have to consider complex HTML, where the containing (parent) elements might be the ones that are hidden.)

Answer

BeautifulSoup is an HTML parser, not a browser. It knows nothing about how the page is supposed to be rendered or about computed DOM properties; it only works out where the tags begin and end.
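
What a parser can do is read the style attribute as plain text. The following is a minimal sketch of that idea, not a substitute for rendering: the helper names inline_style_hides and is_hidden_inline and the extra nested input are made up for illustration, and bs4 with the built-in "html.parser" is assumed. It only sees inline styles, so an element hidden through a stylesheet class (like hidden_elements in the question) or through JavaScript is still reported as visible. As far as I know, the .hidden attribute queried in the question is Beautiful Soup's own flag for whether a tag is written to output, which is why it is False everywhere regardless of CSS.

```python
from bs4 import BeautifulSoup


def inline_style_hides(tag):
    """Return True if the tag's own style attribute sets display:none or visibility:hidden."""
    style = tag.get("style") or ""
    for declaration in style.split(";"):
        prop, _, value = declaration.partition(":")
        prop = prop.strip().lower()
        value = value.lower().replace("!important", "").strip()
        if (prop, value) in {("display", "none"), ("visibility", "hidden")}:
            return True
    return False


def is_hidden_inline(tag):
    """Check the tag and all of its ancestors for a hiding inline style.

    Simplification: a child can override an ancestor's visibility:hidden with
    visibility:visible; that case is ignored here.
    """
    if inline_style_hides(tag):
        return True
    return any(inline_style_hides(parent) for parent in tag.parents)


# The question's markup plus one hypothetical input nested in a hidden parent.
source = """
<div style="display: none"><input id="nested" type="text"></div>
<input id="tbi1" type="text" style="visibility:hidden">
<input id="tbi2" type="text" class="hidden_elements">
<input id="tb1" type="text">
"""

soup = BeautifulSoup(source, "html.parser")
hidden_ids = [tag["id"] for tag in soup.find_all("input") if is_hidden_inline(tag)]
print(hidden_ids)  # ['nested', 'tbi1']
```

Note that tbi2 is not in the result: nothing in the HTML itself says what the hidden_elements class does.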

If you need to work with the DOM at runtime, you'd be better off with a browser automation package, that is, something that starts a browser, lets the browser load the page, and then exposes browser controls and the computed DOM. The options differ by platform. For ideas, have a look at the Python Wiki page on this, in particular the section Python Wrappers around Web "Libraries" and Browser Technology.
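
As one possible illustration of the browser route, here is a sketch using Selenium. The answer does not name a specific package, so Selenium, Chrome, and the helper names visibility_by_id and check_page are all assumptions. It asks the browser, via WebElement.is_displayed(), whether each matched element is shown, which also takes stylesheet rules and hidden parents into account.

```python
def visibility_by_id(driver, css_selector="input"):
    """Map each matched element's id to what the browser reports via is_displayed()."""
    # "css selector" is the string value of selenium's By.CSS_SELECTOR.
    elements = driver.find_elements("css selector", css_selector)
    return {el.get_attribute("id"): el.is_displayed() for el in elements}


def check_page(url):
    """Open url in Chrome (assumed to be installed) and report input visibility."""
    # Imported here so that visibility_by_id can be used with any WebDriver
    # instance that is already running.
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        return visibility_by_id(driver)
    finally:
        driver.quit()
```

Calling check_page with the address of the page in question would return a dictionary from element id to True or False; this needs a real browser, so it is considerably slower than parsing with BeautifulSoup.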
