Scraping all the divs with lxml in Python returns an empty list

2024/11/16 13:56:43

I want to scrape the product title, product link, and product price, but my XPath returns an empty list. How should I write the XPath and the for loop to get these details? I have tried this:

import requests
import lxml.html

html = requests.get("https://www.lazada.sg/catalog/?q=Samsung+Mobile&_keyori=ss&from=input&spm=a2o42.home.search.go.654346b52P3y8Y")
doc = lxml.html.fromstring(html.content)
# print(doc)
new = doc.xpath('//div[@class="index__box___1Ffv-"]')
print(len(new))
for node in new:
    title = node.xpath('//*[@id="root"]/div/div[2]/div[1]/div/div[1]/div[2]/div[1]/div/div/div[2]/div[2]/text()')
    print(len(title))
    print(title)

I need to get all the product details mentioned above, using only the lxml library.

Answer

The main problem you are running into is that the information you want is not in doc. To see what you're working with, try something like:

# Dump the raw response so you can inspect what the server actually returned
with open('title_dump.html', 'wb') as f:
    f.write(html.content)

Open it in a browser or text editor, and you'll notice that the titles, prices, and so on are missing. Most likely they are filled in by JavaScript after the page loads; requests only downloads the initial HTML and does not run scripts. As such, this may not be possible using only lxml.
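To illustrate the difference, here is a minimal offline sketch. The two HTML strings, the class names (product-card, title, price) and the helper extract_products are all hypothetical and are not taken from the real site; they only mimic a page whose static HTML is an empty shell versus the same page after a browser has run its scripts. It also shows the loop written with relative XPath (a leading "."), since a path that starts with "//" inside the loop searches the whole document again rather than the current node.

```python
import lxml.html

# Hypothetical markup: what a JavaScript-rendered page might return to requests
# (an empty container plus a script), versus the DOM after a browser ran it.
RAW_SHELL = (
    '<html><body><div id="root"></div>'
    '<script>window.pageData = {};</script></body></html>'
)

RENDERED = """
<html><body><div id="root">
  <div class="product-card">
    <a class="title" href="/products/phone-a.html">Phone A</a>
    <span class="price">$199.00</span>
  </div>
  <div class="product-card">
    <a class="title" href="/products/phone-b.html">Phone B</a>
    <span class="price">$249.00</span>
  </div>
</div></body></html>
"""


def extract_products(html_text):
    """Return a list of dicts with title, link and price for each product card."""
    doc = lxml.html.fromstring(html_text)
    products = []
    for card in doc.xpath('//div[@class="product-card"]'):
        # The leading "." keeps each query relative to the current card;
        # a path starting with "//" would search the whole document again.
        titles = card.xpath('.//a[@class="title"]/text()')
        links = card.xpath('.//a[@class="title"]/@href')
        prices = card.xpath('.//span[@class="price"]/text()')
        products.append({
            "title": titles[0].strip() if titles else None,
            "link": links[0] if links else None,
            "price": prices[0].strip() if prices else None,
        })
    return products


print(extract_products(RAW_SHELL))      # the static shell contains no cards
print(len(extract_products(RENDERED)))  # the rendered markup contains two
```

If the dump of the real page looks like the first string, no XPath will find the products, and the data has to come from wherever the page's scripts load it.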
