Selenium Scraping Javascript Table

2024/11/16 17:49:20

I am struggling to scrape the table with the code below. I would appreciate it if someone could have a look at what I am missing. Regards, PyProg70

from selenium import webdriver
from selenium.webdriver import FirefoxOptions
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
from bs4 import BeautifulSoup
import pandas as pd
import re, time

binary = FirefoxBinary('/usr/bin/firefox')
opts = FirefoxOptions()
opts.add_argument("--headless")

browser = webdriver.Firefox(options=opts, firefox_binary=binary)
browser.implicitly_wait(10)

url = 'http://tenderbulletin.eskom.co.za/'
browser.get(url)

html = browser.page_source
soup = BeautifulSoup(html, 'lxml')

print(soup.prettify())

Answer

It is JavaScript, not Java. This is a dynamic page, so you need to wait until the Ajax request has finished and the content has been rendered. Use WebDriverWait for that.

....
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait 
from selenium.webdriver.support import expected_conditions as EC
....
browser.get(url)

# wait up to 30 seconds until the table has loaded
WebDriverWait(browser, 30).until(EC.presence_of_element_located((By.CSS_SELECTOR, 'table.CSSTableGenerator .ng-binding')))

# find_element(By.CSS_SELECTOR, ...) replaces find_element_by_css_selector, which was removed in Selenium 4
html = browser.find_element(By.CSS_SELECTOR, 'table.CSSTableGenerator')
soup = BeautifulSoup(html.get_attribute("outerHTML"), 'lxml')
print(soup.prettify().encode('utf-8'))
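
Why the original attempt comes back without the table: implicitly_wait only affects element lookups, so reading page_source right after get() returns whatever has been rendered at that moment. An explicit wait instead polls a condition until it holds or a timeout expires. The sketch below illustrates that polling idea with the standard library only. It is not Selenium's implementation, and wait_until, WaitTimeout and FakePage are hypothetical names made up for this illustration. The real WebDriverWait polls every 0.5 seconds by default and also ignores NoSuchElementException while waiting.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never became truthy within the timeout."""


def wait_until(condition, timeout=30.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises WaitTimeout once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} s")
        time.sleep(poll_interval)


class FakePage:
    """Hypothetical stand-in for a page whose rows appear only after a few polls."""

    def __init__(self, ready_after):
        self.calls = 0
        self.ready_after = ready_after

    def find_rows(self):
        # Returns an empty (falsy) list until the simulated Ajax call has "finished".
        self.calls += 1
        return ["row 1", "row 2"] if self.calls >= self.ready_after else []


page = FakePage(ready_after=3)
rows = wait_until(page.find_rows, timeout=5.0, poll_interval=0.01)
print(rows)
```

In the answer's code, the condition is EC.presence_of_element_located(...) and the timeout is 30 seconds. The script only moves on to reading the table once a cell with the ng-binding class exists in the DOM.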
https://en.xdnf.cn/q/120303.html
