Question 1

I am using Scrapy to scrape the books listed on a website.

The crawler itself works fine, but extracting data from the HTML with XPath selectors is not working out. The site lists about 131 books on each page, and their XPaths look like this.

For example, the XPaths for the book titles are:

1st book ---> /html/body/div/div[3]/div/div/div[2]/div/ul/li/a/span
2nd book ---> /html/body/div/div[3]/div/div/div[2]/div/ul/li[2]/a/span
3rd book ---> /html/body/div/div[3]/div/div/div[2]/div/ul/li[3]/a/span

The li[] index increases with each book. I am not sure how to put this into a loop so that it catches all the titles. I have to do the same for the images and author names, but I think that will be similar. I just need to get this first one done.

Thanks for your help in advance.

Answer

There are several ways to do this.

The best way to select multiple nodes is by id or class, e.g.:
```
sel.xpath("//div[@id='id']")
```

You can also select by index in a loop. XPath positions start at 1, not 0:

for i in range(1, upto_num_of_divs + 1):
    nodes = sel.xpath("//div[%d]" % i)

Or you can select a range of positions in a single query, with no loop:

nodes = sel.xpath("//div[position() >= 1 and position() <= %d]" % upto_num_of_divs)

XPATH for Scrapy
