Question 1

I am trying to use Scrapy to extract data from a page, but every extracted field comes back empty (None). What is the problem?

spider:

import scrapy


class Ratemds(scrapy.Spider):
    name = 'ratemds'
    allowed_domains = ['ratemds.com']
    custom_settings = {
        'USER_AGENT': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36 OPR/60.0.3255.50747 OPRGX/60.0.3255.50747',
    }

    def start_requests(self):
        yield scrapy.Request(
            'https://www.ratemds.com/doctor-ratings/dr-aaron-morrow-md-greensboro-nc-us',
            callback=self.profile,
        )

    def profile(self, response):
        # Both selectors return None when the element is missing from the page.
        item = {
            'url': response.request.url,
            'Image': response.css('.doctor-profile-image::attr(src)').get(),
            'First_and_Last_Name': response.css('h1::text').get(),
        }
        yield item

output:

{'url': 'https://www.ratemds.com/doctor-ratings/dr-aaron-morrow-md-greensboro-nc-us', 'Image': None, 'First_and_Last_Name': None}

Answer

The problem is that this website is protected by a captcha. When you try to collect information from it, you are redirected to a page like this one: error_page

As you can see, that page does not contain the information you are looking for, so the selectors find nothing and return None. To collect information from a website like this, you can try the following:

Use scrapy-selenium or scrapy-splash to load the page through a browser or rendering service.
Use a captcha-solving service such as Death By Captcha, Anti-Captcha, or similar.
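
Before reaching for either option, it may be worth confirming that the spider really received the captcha page rather than the profile. Below is a minimal sketch of such a check. The helper name `looks_blocked` and the marker strings are assumptions for illustration only, not taken from the site; inspect the actual blocked page and adjust them.

```python
# Hypothetical marker strings: look at the real blocked page and adjust them.
BLOCK_MARKERS = ("captcha", "access denied")


def looks_blocked(html, markers=BLOCK_MARKERS):
    """Return True if the page text contains any marker (case-insensitive)."""
    text = html.lower()
    return any(marker.lower() in text for marker in markers)


# Possible use inside the spider's callback (response.text, response.url and
# response.status are standard Scrapy response attributes):
#
#     def profile(self, response):
#         if looks_blocked(response.text):
#             self.logger.warning('Blocked at %s (status %s)',
#                                 response.url, response.status)
#             return
#         ...
```

If the warning fires, the empty fields come from the captcha page, and the selectors themselves are probably fine.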

Scrapy empty output
