Question 1

I have a Website URL Like www.example.com

I want to collect social information from this website like : facebook url (facebook.com/example ), twitter url ( twitter.com/example ) etc., if available anywhere, at any page of website.

How to complete this task, suggest any tutorials, blogs, technologies ..

Question 2

Since you don't know exactly where (on which page of the website) those link are located, you probably want to base you spider on CrawlSpider class. Such spider lets you define rules for link extraction and navigation through the website. See this minimal example:

from scrapy.spiders import CrawlSpider, Rule
from scrapy.linkextractors import LinkExtractorclass MySpider(CrawlSpider):name = 'example.com'start_urls = ['http://www.example.com']rules = (Rule(LinkExtractor(allow_domains=('example.com', )), callback='parse_page', follow=True),)def parse_page(self, response):item = dict()item['page'] = response.urlitem['facebook_urls'] = response.xpath('//a[contains(@href, "facebook.com")]/@href').extract()item['twitter_urls'] = response.xpath('//a[contains(@href, "twitter.com")]/@href').extract()yield item

This spider will crawl all pages of example.com website and extract URLs containing facebook.com and twitter.com.

How to extract social information from a given website?

Related Q&A

Check if string is of nine digits then exit function in python

How to extract quotations from text using NLTK [duplicate]

takes exactly 2 arguments (1 given) when including self [closed]

scipy.optimize.curve_fit a definite integral function with scipy.integrate.quad

MAC OS - os.system(command) display nothing

Flask App will not load app.py (The file/path provided (app) does not appear to exist)

how to create from month Gtk.Calendar a week calendar and display the notes in each day in python

How to access inner attribute class from outer class?

Summing up the total based on the random number of inputs of a column

What is wrong with this Binomial Tree Backwards Induction European Call Option Pricing Function?