How to scrape data using next button with ellipsis using Scrapy

2024/10/8 17:08:08

I need to continuously get the data on next button <1 2 3 ... 5> but there's no provided href link in the source also there's also elipsis. any idea please? here's my code

def start_requests(self):urls = ((self.parse_2, 'https://www.forever21.com/us/shop/catalog/category/f21/sale'),)for cb, url in urls:yield scrapy.Request(url, callback=cb)def parse_2(self, response):for product_item_forever in response.css('div.pi_container'):forever_item = {'forever-title': product_item_forever.css('p.p_name::text').extract_first(),'forever-regular-price': product_item_forever.css('span.p_old_price::text').extract_first(),'forever-sale-price': product_item_forever.css('span.p_sale.t_pink::text').extract_first(),'forever-photo-url': product_item_forever.css('img::attr(data-original)').extract_first(),'forever-description-url': product_item_forever.css('a.item_slider.product_link::attr(href)').extract_first(),}yield forever_item

Please help me thank you

Answer

It seems this pagination uses additional request to API. So, there are two ways:

  1. Use Splash/Selenium to render pages by pattern of QHarr;
  2. Make same calls to API. Check developer tools, you will find POST-request https://www.forever21.com/us/shop/Catalog/GetProducts will all proper params (they are too long, so I will not post full list here).
https://en.xdnf.cn/q/118577.html

Related Q&A

Execution Code Tracking - How to know which code has been executed in project?

Let say that I have open source project from which I would like to borrow some functionality. Can I get some sort of report generated during execution and/or interaction of this project? Report should…

Python code to ignore errors

I have a code that stops running each time there is an error. Is there a way to add a code to the script which will ignore all errors and keep running the script until completion?Below is the code:imp…

How to match background color of an image with background color of Pygame? [duplicate]

This question already has an answer here:How to convert the background color of image to match the color of Pygame window?(1 answer)Closed 3 years ago.I need to Make a class that draws the character a…

Sharepoint/SOAP - GetListItems ignoring query

Trying to talk from Python to Sharepoint through SOAP.One of the lists I am trying to query contains ID as primary key field.(Field){_RowOrdinal = "0"_FromBaseType = "TRUE"_DisplayN…

python mean between file

I create a list of more than a thousand file (Basically now I have a list with the name of the file) now in order to make the man I thought to do something like this (suppose asch file have 20 lines): …

Sort a dictionary with custom sorting function

I have some JSON data I read from a file using json.load(data_file){"unused_account":{"logins": 0,"date_added": 150},"unused_account2":{"logins": 0,&qu…

Turtle make triangle different color

Hi guys Im trying to replicate this image:Its almost done I just have one issue, where the triangle is supposed to be yellow it isnt seeming to work.Mine:Code:fill(True) fillcolor(green) width(3) forwa…

How to DataBricks read Delta tables based on incremental data

we have to read the data from delta table and then we are joining the all the tables based on our requirements, then we would have to call the our internal APIS to pass the each row data. this is our g…

Converting an excel file to a specific Json in python using openpyxl library with datetime

I have the Excel data with the format shown in the image preview. How can I convert it into a JSON using Python? Expected Output: file_name = [ { A: Measurement( calculated_date=datetime(2022, 10, 1, …

How to find a word in a string in a list? (Python)

So im trying to find a way so I can read a txt file and find a specific word. I have been calling the file with myfile=open(daily.txt,r)r=myfile.readlines()that would return a list with a string for ea…