Scrapy returns only first result

2024/10/15 3:18:38

I'm trying to scrape data from gelbeseiten.de (the yellow pages in Germany) with this spider:

# -*- coding: utf-8 -*-
import scrapy
from scrapy.spiders import CrawlSpider
from scrapy.http import Request
from scrapy.selector import Selector
from scrapy.http import HtmlResponse


class GelbeseitenSpider(scrapy.Spider):
    name = "gelbeseiten"
    allowed_domains = ["http://www.gelbeseiten.de"]
    start_urls = ['http://www.gelbeseiten.de/zoohandlungen/s1/alphabetisch']

    def parse(self, response):
        for adress in response.css('article'):
            # Strasse
            strasse = adress.xpath('//span[@itemprop="streetAddress"]//text()').extract_first()
            # Name
            name = adress.xpath('//span[@itemprop="name"]//text()').extract_first()
            # PLZ
            plz = adress.xpath('//span[@itemprop="postalCode"]//text()').extract_first()
            # Stadt
            stadt = adress.xpath('//span[@itemprop="addressLocality"]//text()').extract_first()
            yield {
                'name': name,
                'strasse': strasse,
                'plz': plz,
                'stadt': stadt,
            }

As a result I get 15 items that all contain the same address, but I think it should be 15 different addresses.

I appreciate any help.

Answer

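You are using absolute XPath expressions. An expression that starts with // is evaluated against the whole document, not just the current article, so extract_first() returns the first match on the page in every iteration: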

adress.xpath('//span[@itemprop="streetAddress"]//text()')

Instead, you should use expressions relative to adress (note the leading dot in the expression):

adress.xpath('.//span[@itemprop="streetAddress"]//text()')
