I have searched for a specific brand, Samsung, and a number of products are returned. I just want to scrape all the hrefs of the search results, along with the product names.
import urllib.request
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.support.select import Select
from selenium.webdriver.common.keys import Keys
chrome_path =r'C:/Users/91940/AppData/Local/Programs/Python/Python39/Scripts/chromedriver.exe'
driver = webdriver.Chrome(executable_path=chrome_path)
driver.implicitly_wait(10)
url = "https://www.lazada.sg"
driver.get(url)
driver.maximize_window()
soup=BeautifulSoup(driver.page_source, 'lxml')
application = driver.find_element_by_id("q")
application.send_keys("Samsung")
driver.find_element_by_css_selector(".search-box__button--1oH7").click()
div = driver.find_elements_by_tag_name('div', {'class': 'GridItem__title___8JShU'})
print(len(div))
for ele in div :
    print(a.get_attribute("href"))
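A couple of things. You are mixing bs4 syntax with Selenium, which is causing your current error: find_elements_by_tag_name accepts only a tag name, not a tag name plus an attribute dictionary. Additionally, you are targeting potentially dynamic values; class names such as GridItem__title___8JShU look auto-generated and may change. Finally, there are anti-scraping measures which may later affect your work.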
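To illustrate the first point: the ('div', {'class': ...}) call form is BeautifulSoup's find_all signature, not a Selenium one. The following is only a minimal sketch on a made-up HTML fragment (the markup, paths, and product names are hypothetical and do not reflect the real page), showing where that syntax does work:

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; the real page structure differs.
html = """
<div class="GridItem__title___8JShU"><a href="/products/a.html">Phone A</a></div>
<div class="GridItem__title___8JShU"><a href="/products/b.html">Phone B</a></div>
<div class="other"><a href="/help">Help</a></div>
"""

soup = BeautifulSoup(html, "html.parser")

# bs4 syntax: a tag name plus an attribute dictionary.
title_divs = soup.find_all("div", {"class": "GridItem__title___8JShU"})

# Each match is a bs4 Tag, so attributes are read with item access,
# not with Selenium's get_attribute().
hrefs = [d.find("a")["href"] for d in title_divs]
print(hrefs)
```

With Selenium, the equivalent filter is expressed as a CSS selector string passed to the driver, and attributes are read with get_attribute().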
Ignoring the last point, a more robust, syntactically appropriate version might be:
div = driver.find_elements_by_css_selector('[data-tracking="product-card"]')
links = [i.find_element_by_css_selector('[age="0"]').get_attribute('href') for i in div]
print(links)
You could reduce this to a single list comprehension with a different CSS selector combination, e.g.:
links = [i.get_attribute('href') for i in driver.find_elements_by_css_selector('[data-tracking="product-card"] div:nth-child(1) > [href*=search]')]
For that last one, you can return a dict keyed by product name as follows:
{i.find_element_by_tag_name('img').get_attribute('alt'):i.get_attribute('href') for i in driver.find_elements_by_css_selector('[data-tracking="product-card"] div:nth-child(1) > [href*=search]')}
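One caveat with a dict keyed by product name: if two listings share the same title, the later link overwrites the earlier one. If that matters, a possible workaround is to collect the links per title in a list. The sketch below uses hypothetical (title, link) pairs in place of scraped values, and group_links_by_title is just an illustrative helper name:

```python
from collections import defaultdict


def group_links_by_title(pairs):
    """Map each title to the list of all links seen for it, in input order."""
    grouped = defaultdict(list)
    for title, link in pairs:
        grouped[title].append(link)
    return dict(grouped)


# Hypothetical (title, link) pairs standing in for scraped values.
pairs = [
    ("Samsung Galaxy", "https://example.com/1"),
    ("Samsung Galaxy", "https://example.com/2"),
    ("Samsung TV", "https://example.com/3"),
]
print(group_links_by_title(pairs))
```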
As a dataframe:
import pandas as pd
pd.DataFrame([(i.find_element_by_tag_name('img').get_attribute('alt'), i.get_attribute('href')) for i in driver.find_elements_by_css_selector('[data-tracking="product-card"] div:nth-child(1) > [href*=search]')], columns = ['Title', 'Link'])