I am using Scrapy to scrape book listings from a website.
The crawler itself works and crawls the pages fine, but extracting data from the HTML with the XPath select is not working out right. Since it is a book website, there are about 131 books on each page, and their XPaths look like this.
For example, the XPaths for the book titles are:
1st book ---> /html/body/div/div[3]/div/div/div[2]/div/ul/li/a/span
2nd book ---> /html/body/div/div[3]/div/div/div[2]/div/ul/li[2]/a/span
3rd book ---> /html/body/div/div[3]/div/div/div[2]/div/ul/li[3]/a/span
The li[] index increases with each book. I am not sure how to put this into a loop so that it catches all the titles. I have to do the same for the images and author names too, but I think that will be similar. I just need to get this first one done.
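For reference, here is a minimal sketch of the kind of loop I have in mind: drop the index from li so the path matches every list item, then query a/span relative to each item. It uses the standard library's ElementTree on a made-up snippet so that it runs on its own; the sample markup, the SAMPLE name and the extract_titles helper are all assumptions, not the real page. In Scrapy I would expect the same pattern to look roughly like looping over response.xpath("//ul/li") and calling li.xpath("./a/span/text()").get() on each item, but I have not confirmed that against this site.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed stand-in for the real page (structure is an assumption).
SAMPLE = """
<html><body><div><ul>
  <li><a href="/b/1"><span>First Book</span></a></li>
  <li><a href="/b/2"><span>Second Book</span></a></li>
  <li><a href="/b/3"><span>Third Book</span></a></li>
</ul></div></body></html>
"""


def extract_titles(markup):
    """Return the text of a/span inside every li under a ul (hypothetical helper)."""
    root = ET.fromstring(markup.strip())
    titles = []
    # No index on li, so this matches every list item instead of a single one.
    for li in root.findall(".//ul/li"):
        # The leading "./" keeps the query relative to the current li.
        span = li.find("./a/span")
        if span is not None and span.text:
            titles.append(span.text.strip())
    return titles


titles = extract_titles(SAMPLE)
print(titles)
```

The same per-item loop should extend to the images and author names by adding further relative queries inside the for body, assuming they also live under each li.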
Thanks for your help in advance.