I am trying to scrape the following dynamically generated webpage:
https://www.governmentjobs.com/careers/capecoral?page=1
I've tried requests, scrapy, and scrapy-splash, but I only get the page source and none of the job listings.
import requests
from bs4 import BeautifulSoup
r = requests.get("https://www.governmentjobs.com/careers/capecoral?page=1")
soup = BeautifulSoup(r.content, "html.parser")
n_jobs = soup.select("#number-found-items")[0].text.strip()
print(n_jobs)
It always reports 0 jobs found.
The job listings are loaded dynamically by JavaScript, so the HTML that requests receives does not contain them yet. You can use Selenium to render the page in a real browser and then parse the result with bs4. Here is an example:
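import time

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

url = "https://www.governmentjobs.com/careers/capecoral?page=1"

# Selenium 4 takes the driver path through a Service object
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
driver.maximize_window()
time.sleep(8)
driver.get(url)
time.sleep(10)  # give the JavaScript time to load the job listings

# Parse the rendered page and print each job title
soup = BeautifulSoup(driver.page_source, 'lxml')
for title in soup.select('.list-item h3 > a'):
    print(title.text)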
Output:
Assistant City Attorney / City Attorney's Office
Business Applications Analyst II / Information Technology Services #6425
Contract Athletic Official / Athletics / Parks & Recreation #6237
Contract Background Investigation Specialist / Investigations / Police Dept. #6514
Contract Beverage Cart/Waiter/Waitress / Parks and Recreation / Coral Oaks #6479
Contract Counselor / Youth Center / Parks & Recreation #6317
Contract Counselor/Instructor / Parks & Recreation / Special Populations #6339
Contract Custodial Worker / Lake Kennedy / Parks & Recreation #6525
Contract Custodial Worker /Parks & Recreation / Yacht Club #6312
Contract Golf Course Outside Operations / Parks & Recreation / Coral Oaks #6535
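As a possible variation on the code above, the fixed time.sleep calls could be replaced with an explicit wait, so the script continues as soon as the first job title appears instead of always pausing for a fixed time. This is only a sketch under some assumptions: it assumes Selenium 4.6 or newer (which can locate a Chrome driver by itself, so webdriver_manager is not used here), it reuses the '.list-item h3 > a' selector from above, and the helper names fetch_rendered_html and extract_titles are made up for illustration.

```python
from bs4 import BeautifulSoup

URL = "https://www.governmentjobs.com/careers/capecoral?page=1"
TITLE_SELECTOR = ".list-item h3 > a"  # selector reused from the answer above


def fetch_rendered_html(url, timeout=20):
    """Load the page in Chrome and return its HTML once a job title is present.

    Hypothetical helper; assumes Selenium 4.6+ and a local Chrome install.
    """
    # Imported here so extract_titles() can be used without a browser.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Wait until at least one title link is in the DOM;
        # raises TimeoutException if none appears within `timeout` seconds.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, TITLE_SELECTOR))
        )
        return driver.page_source
    finally:
        driver.quit()  # always close the browser, even on errors


def extract_titles(html):
    """Return the stripped text of every job-title link in the given HTML."""
    soup = BeautifulSoup(html, "html.parser")
    return [a.get_text(strip=True) for a in soup.select(TITLE_SELECTOR)]


# Usage (opens a browser window):
# for title in extract_titles(fetch_rendered_html(URL)):
#     print(title)
```

Keeping the parsing in its own function means it can be checked against saved HTML without starting a browser each time.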