InsecureRequestWarning + MarkupResemblesLocatorWarning:

2024/9/22 18:24:10

I'd like to scrape a site for my office work. I'm still learning every day and would appreciate your help, guys.

Here is the code:

url =
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:66.0) Gecko/20100101 Firefox/66.0", "Accept-Encoding": "*", "Connection": "keep-alive"}
requests.get(url, verify=False, headers=headers)
soup = BeautifulSoup(url,'html.parser').text
print(soup)


InsecureRequestWarning: Unverified HTTPS request is being made to host ''. Adding certificate verification is strongly advised. See:
1099: InsecureRequestWarning: Unverified HTTPS request is being made to host ''. Adding certificate verification is strongly advised. See:
57: MarkupResemblesLocatorWarning: The input looks more like a URL than markup. You may want to use an HTTP client like requests to get the document behind the URL, and feed that document to Beautiful Soup.
soup = BeautifulSoup(url,'html.parser').text

Please help me with scraping the site. I've just started coding.


The InsecureRequestWarning is explained by the warning message itself: you disabled certificate verification (verify=False), so the HTTPS request is made without checking the server's certificate, which makes it insecure.

Be careful with such requests. If you want to disable this warning, see this article. Otherwise, follow the link from the warning message and read more about SSL verification.
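To make the two options concrete, here is a minimal sketch. It is only one common way to handle this and may not be what the linked article describes. The function name `fetch` and the `ca_bundle` parameter are made up for illustration; `requests.get(..., verify=...)` and `urllib3.disable_warnings` are the real calls.

```python
import requests
import urllib3
from urllib3.exceptions import InsecureRequestWarning


def fetch(url, ca_bundle=None, headers=None):
    """Hypothetical helper: fetch a page with or without certificate verification."""
    if ca_bundle is not None:
        # Preferred: keep verification on and point requests at a CA bundle file
        # that contains the certificate authority your site is signed by.
        return requests.get(url, verify=ca_bundle, headers=headers, timeout=10)
    # Fallback: verification stays off, and only the InsecureRequestWarning
    # category is silenced. The request is still insecure, just quieter.
    urllib3.disable_warnings(InsecureRequestWarning)
    return requests.get(url, verify=False, headers=headers, timeout=10)
```

Calling `fetch(url)` behaves like the original `verify=False` request without the warning, while `fetch(url, ca_bundle="path/to/bundle.pem")` (path is a placeholder) keeps verification enabled.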

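As for the Beautiful Soup part, you are passing the URL string itself to the constructor; you should pass the content of the response instead. Note that your code also never stores the result of requests.get, so there is no response to parse.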
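A small offline sketch of the difference, with no network involved. The `html` string is a made-up stand-in for what `response.content` would hold, and `https://example.com` is just a placeholder URL.

```python
import warnings

from bs4 import BeautifulSoup

# Stand-in for response.content; a real page would come from requests.get(...).
html = "<html><body><h1>Hello</h1><p>World</p></body></html>"

# Correct: Beautiful Soup receives markup and can navigate it.
soup = BeautifulSoup(html, "html.parser")
heading = soup.h1.get_text()
page_text = soup.get_text(separator=" ", strip=True)

# Incorrect: Beautiful Soup receives the URL string and treats it as plain text.
with warnings.catch_warnings():
    # Hide MarkupResemblesLocatorWarning for this demonstration only.
    warnings.simplefilter("ignore")
    wrong = BeautifulSoup("https://example.com", "html.parser").get_text()

print(heading)
print(page_text)
print(wrong)  # just the URL back, nothing was downloaded
```

Beautiful Soup never downloads anything on its own, which is why the URL comes back unchanged in the second case.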

The following code works for me (with InsecureRequestWarning):

import requests
from bs4 import BeautifulSoup

url = ""
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:66.0) Gecko/20100101 Firefox/66.0",
    "Accept-Encoding": "*",
    "Connection": "keep-alive",
}

response = requests.get(url, verify=False, headers=headers)
if response.status_code == 200:
    soup = BeautifulSoup(response.content, 'html.parser')
    # Continue with your parsing here
    print(soup.prettify())
else:
    print(f"Failed to retrieve the webpage. Status code: {response.status_code}")

