Question 1

I am trying to scrape data from a website. I am using Selenium so that I can log in and then start parsing the data.

I have 3 main errors:

1. The last page number is not read correctly. I get "1" when it should be "197", and I believe this happens because the site is still loading when I read it.

2. The XPath for the element 'test' is not found. I have commented it out in the last for loop.

selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"//div[1]/div[@class='col-lg-3 col-sm-3 result-info' and 2]/span[@class='brand-name' and 1]"}

3. Finally, I am trying to click through to the last page to test whether that works, but I get an error saying the element is not visible.

selenium.common.exceptions.ElementNotVisibleException: Message: element not visible

This is my code

import selenium
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import TimeoutException

url = "https://marketplace.refersion.com/"
username = "[email protected]"
password = "testpass123"
driver = webdriver.Chrome("/Users/xxx/Downloads/chromedriver")

if __name__ == "__main__":
    driver.get(url)
    driver.find_element_by_xpath("/html/body/div[@class='wrapper']/div[@class='top-block']/header[@class='header clearfix']/div[@class='login-button']/a[@class='login-link']").click()
    driver.find_element_by_id("email").send_keys(username)  # enters the username in textbox
    driver.find_element_by_xpath("/html/body/div[@id='app']/div[@class='top-block']/div[@class='row']/div[@class='col-xs-12 col-sm-10 col-sm-offset-1 col-md-8 col-md-offset-2 col-lg-6 col-lg-offset-3 main-section']/div[@class='main-section-content']/div/form[@class='form-horizontal']/div[@class='form-group ']/div[@class='col-xs-12 col-sm-10 col-sm-offset-1 input-group input-group-lg']/input[@id='password']").send_keys(password)  # enters the password in textbox
    # Find the submit button using class name and click on it.
    driver.find_element_by_class_name("btn-primary").click()
    driver.find_element_by_link_text("Find Offers").click()
    driver.find_element_by_id("sorting-dropdown").click()  # enters the username in textbox
    driver.find_element_by_link_text("Newest First").click()
    last_page = driver.find_element_by_class_name("right-center").text
    print(last_page)
    # try:
    #     last_page = WebDriverWait(driver, 3).until(EC.presence_of_element_located((By.CLASS_NAME, 'right-center')))
    #     print("Page is ready!")
    # except TimeoutException:
    #     print("Loading took too much time!")
    for i in range(1, 10):
        #   test = driver.find_element_by_xpath("//div[1]/div[@class='col-lg-3 col-sm-3 result-info' and 2]/span[@class='brand-name' and 1]")
        #  print(test)
        WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CLASS_NAME, 'hover-link'))).click()

I think this has to do with the way the page is being loaded. Is there a workaround for something like this?

Answer

You should have explicit waits in your code to handle the dynamic loading of the pages. Sorting the page by "Newest First" causes it to refresh the results and introduces a spinner to indicate the sorting.

<i class="fa fa-spinner fa-spin" aria-hidden="true" style="font-size: 48px;"></i>

Waiting for the spinner to disappear should give you the correct page count. Something along the following lines:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
# ... your login code ...

driver.find_element_by_link_text("Newest First").click()
# Block until the sorting spinner is no longer visible (or 10 seconds pass).
WebDriverWait(driver, 10).until(
    EC.invisibility_of_element_located((By.XPATH, "//i[@class='fa fa-spinner fa-spin']"))
)
last_page = driver.find_element_by_class_name("right-center").text
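
To make the idea behind an explicit wait concrete, here is a minimal sketch of the polling it performs, written in plain Python so it runs without a browser. The names wait_until and FakePage are hypothetical and not part of Selenium; WebDriverWait does something along these lines, polling every 0.5 seconds by default.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call `condition` repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(poll_interval)


# Hypothetical stand-in for the page: the spinner counts as visible for
# the first checks and as gone afterwards.
class FakePage:
    def __init__(self, checks_until_loaded=3):
        self.remaining = checks_until_loaded

    def spinner_gone(self):
        self.remaining -= 1
        return self.remaining <= 0


page = FakePage()
wait_until(page.spinner_gone, timeout=2.0, poll_interval=0.01)
print("spinner gone, safe to read the page count")
```

Reading the page count only after the wait returns is what avoids picking up the stale "1" from the half-loaded page.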

To find all the brand names listed on the page, you need to find all the span tags with class='brand-name' by calling find_elements_by_xpath (note the plural, elements), which returns a list of every match instead of a single element:

brand_names_list = driver.find_elements_by_xpath("//span[@class='brand-name']")
for brand_name in brand_names_list:
    print(brand_name.text)
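
On Selenium 4.3 and later the find_element_by_* and find_elements_by_* helpers no longer exist; the same lookup is spelled driver.find_elements(By.XPATH, ...). The sketch below wraps that call in a hypothetical helper, collect_brand_names, and exercises it with stand-in FakeDriver and FakeElement classes (also hypothetical, with placeholder brand names) so it runs without a browser. It passes the string "xpath", which is the value of By.XPATH, to avoid importing Selenium here.

```python
def collect_brand_names(driver):
    """Return the non-empty text of every span with class 'brand-name'."""
    # "xpath" is the string value of selenium's By.XPATH constant.
    spans = driver.find_elements("xpath", "//span[@class='brand-name']")
    return [span.text.strip() for span in spans if span.text.strip()]


# Hypothetical stand-ins so the sketch runs without a browser.
class FakeElement:
    def __init__(self, text):
        self.text = text


class FakeDriver:
    def __init__(self, texts):
        self._texts = texts

    def find_elements(self, by, value):
        # Like Selenium, return a (possibly empty) list rather than raising.
        return [FakeElement(t) for t in self._texts]


print(collect_brand_names(FakeDriver(["Acme", " Globex ", ""])))  # ['Acme', 'Globex']
print(collect_brand_names(FakeDriver([])))  # []
```

With a real driver you would call collect_brand_names(driver) after the spinner wait has returned. Because find_elements returns an empty list when nothing matches, it does not raise NoSuchElementException the way find_element does.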

Selenium load time errors - looking for possible workaround
