I want to write a parser for scraping prices, but I can't find a working way to read an element's innerHTML.
I don't know why, but Selenium (getAttribute(innerHTML)), PhantomJS (page.evaluate(function(){return document.ElementToParse.innerHTML})) and scrapy-splash (loading the page with WebPageEngine and parsing the HTML) all fail. The result is always an empty "[]", null, or a WebElement object.
I tested my code on Banggood product pages and also on the landing page, but the result is always the same.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Firefox()
driver.get("https://www.banggood.com/BlitzWolf-Ampcore-Turbo-TC10-3A-Durable-USB-Type-C-Charging-Data-Cable-p-1188424.html?rmmds=category&cur_warehouse=CN")  # random url
try:
    element = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CLASS_NAME, "item_now_price")))
finally:
    driver.quit()
print(element)
The output is:
<selenium.webdriver.firefox.webelement.FirefoxWebElement (session="b0593791-138b-4177-a8f3-e7983143824a", element="d08f4717-d3f1-4594-8f2b-1bf943deb9f9")>
when I need something like:
6.59 (or US$6.59)
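To make the target concrete, here is a minimal sketch of the conversion being asked for, assuming the element's content arrives as a string such as "US$6.59". The helper name `extract_price` is hypothetical and not part of Selenium. The commented lines show how the string would normally be read with Selenium's Python bindings, which has to happen before `driver.quit()` is called.

```python
import re

# Hypothetical helper (not part of Selenium): return the first decimal
# number found in a price string such as "US$6.59", or None if there is none.
# Assumption: the price has no thousands separators.
def extract_price(text):
    match = re.search(r"\d+(?:\.\d+)?", text)
    return match.group(0) if match else None

# With Selenium's Python bindings the string would come from the element,
# read while the browser session is still open, for example:
#   text = element.text
#   text = element.get_attribute("innerHTML")
print(extract_price("US$6.59"))  # 6.59
```

This only covers turning the string into a number. It does not explain why the page returns an empty value in the first place.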
I also tried
price = driver.find_element_by_class_name('item_now_price').getAttribute("innerHTML")
and
var page = require('webpage').create();
page.open('https://www.banggood.com/BlitzWolf-Ampcore-Turbo-TC10-3A-Durable-USB-Type-C-Charging-Data-Cable-p-1188424.html?rmmds=category&cur_warehouse=CN', function(status) {
    var price = page.evaluate(function() {
        return document.getElementByClassName('item_now_price').innerHTML;
    });
    console.log('price is ' + price);
    phantom.exit();
});
but the result is null, and when I add
page.includeJs(/url/to/js)
the terminal stops working.