So I'm using Beautiful Soup to try to get an element off of a page using the tag and class. Here is my code:
import requests
from bs4 import BeautifulSoup
# Send a GET request to the webpage
url = "https://www.hindawi.com/journals/am/2021/1623076/"
response = requests.get(url)
# Parse the HTML content of the webpage
soup = BeautifulSoup(response.text, 'html.parser')
results = soup.find_all('span', class_='simpleShowMore')
print(results)
I took this almost directly from their example. If you look at the website, the values are there on the page, but Beautiful Soup does not find them.
The output of the code is:
[]
I'm sure I am doing something very simple wrong. I believe a lot of the examples I found are out of date. Please help?
Thanks
I assume that you would like to obtain the addresses of all institutions.
If you look at the source code of the page, you can see that, contrary to the other answers and comments, the information is already present in the HTML that requests receives. It sits as JSON in the element with the id __NEXT_DATA__, and a script only displays it afterwards. Since this is the case, you do not need Selenium or any other tool to perform the click that reveals the data.
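A quick way to check this for a page is to search the raw response text for the class you were looking for and for the id of the embedded data. The sketch below does that on a small made-up HTML string standing in for response.text; the sample markup and the helper name diagnose are assumptions for illustration, not the real page.

```python
# Hypothetical stand-in for response.text; not the real page markup.
SAMPLE_HTML = """
<html><body>
<div id="__next"></div>
<script id="__NEXT_DATA__" type="application/json">
{"props": {"pageProps": {"article": {"title": "Example"}}}}
</script>
</body></html>
"""

def diagnose(html, class_name="simpleShowMore", script_id="__NEXT_DATA__"):
    """Check the raw HTML for the target class and for the embedded data id."""
    return {
        "class_in_static_html": class_name in html,
        "embedded_data_present": script_id in html,
    }

print(diagnose(SAMPLE_HTML))
# {'class_in_static_html': False, 'embedded_data_present': True}
```

If the first value is False and the second is True, find_all returns an empty list because the span is created in the browser, while the data itself can still be read from the embedded JSON.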
Here is an example of how you can access the information:
from bs4 import BeautifulSoup
import requests
import json
# Send a GET request to the webpage
url = "https://www.hindawi.com/journals/am/2021/1623076/"
response = requests.get(url)
# Parse the HTML content of the webpage
soup = BeautifulSoup(response.text, 'html.parser')
# Extract text of the element containing the data that could get loaded
new_data = soup.find(attrs={'id': '__NEXT_DATA__'}).text
data = json.loads(new_data)
# Extract Data and transform it into an array
join_address_lines = lambda entry: ', '.join([line['addrLine1'] + ', ' + line['addrLine2'] + ', ' + line['addrLine3'] for line in entry['addrLines']])
address_lines_array = list(map(join_address_lines, data['props']['pageProps']['article']['affiliations']))
# Printing the results
print(address_lines_array)
# --> ['School of Architecture and Materials, Chongqing College of Electronic Engineering, Chongqing 401331', 'College of Geography and Tourism, Chongqing Normal University, Chongqing 401331']
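The lambda above raises a KeyError or TypeError as soon as one address part is missing. A more tolerant variant could look like the sketch below. It works on the already parsed dictionary, uses the same key names as the code above, and is run here on a small invented sample rather than the real page data; the function name extract_addresses is an assumption for illustration.

```python
def extract_addresses(data):
    """Return one comma-joined address per affiliation, skipping missing or empty parts."""
    article = data.get("props", {}).get("pageProps", {}).get("article", {})
    addresses = []
    for affiliation in article.get("affiliations", []):
        parts = []
        for line in affiliation.get("addrLines", []):
            for key in ("addrLine1", "addrLine2", "addrLine3"):
                value = line.get(key)
                if value:  # skip keys that are absent, None, or empty
                    parts.append(value)
        addresses.append(", ".join(parts))
    return addresses

# Invented sample shaped like the structure used above (not real page data)
sample = {
    "props": {"pageProps": {"article": {"affiliations": [
        {"addrLines": [{"addrLine1": "Dept A", "addrLine2": "University X", "addrLine3": "City 1"}]},
        {"addrLines": [{"addrLine1": "Dept B", "addrLine2": "University Y"}]},
    ]}}}
}

print(extract_addresses(sample))
# ['Dept A, University X, City 1', 'Dept B, University Y']
```

With the real page you would pass the result of json.loads(new_data) instead of the sample. If the site changes its layout, this returns an empty list rather than failing.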