Question 1

I'm trying to scrape JavaScript-generated data from a website. To do this, I'm using the Python library requests_html.

Here is my code:

from requests_html import HTMLSession

session = HTMLSession()
url = 'https://myurl'
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
payload = {'mylog': 'root', 'mypass': 'root'}
r = session.post(url, headers=headers, verify=False, data=payload)
r.html.render()
load = r.html.find('#load_span', first=True)
print(load.text)

If I don't call render(), I can connect to the website, but the scraped data is empty (which is expected, since the JavaScript has not run). When I do call it, I get this error:

pyppeteer.errors.PageError: net::ERR_CERT_COMMON_NAME_INVALID at https://myurl

or

net::ERR_CERT_WEAK_SIGNATURE_ALGORITHM

I assume the verify=False parameter of session.post is ignored by render(). How can I make render() ignore the certificate error as well?

Edit: If you want to reproduce the error:

from requests_html import HTMLSession
import requests

session = HTMLSession()
url = 'https://wrong.host.badssl.com'
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
r = session.post(url, headers=headers, verify=False)
r.html.render()
load = r.html.find('#content', first=True)
print(load)

Answer

The only way is to set the ignoreHTTPSErrors parameter in pyppeteer. The problem is that requests_html doesn't provide any way to set this parameter; in fact, there is an issue about it. My advice is to ping the developers again by adding another message to that issue.

Or maybe you could contribute this feature yourself with a pull request.
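
For reference, here is a minimal sketch of what setting ignoreHTTPSErrors looks like when driving pyppeteer directly, bypassing requests_html altogether. The function names fetch_rendered_html and fetch_rendered_html_sync are made up for this illustration, and the import sits inside the function only so the snippet loads without pyppeteer installed.

```python
import asyncio


async def fetch_rendered_html(url):
    """Open url in headless Chromium, ignoring certificate errors, and return the rendered HTML."""
    # Imported lazily so the sketch can be loaded without pyppeteer installed.
    from pyppeteer import launch

    # ignoreHTTPSErrors=True is the pyppeteer launch option mentioned above.
    browser = await launch(ignoreHTTPSErrors=True, headless=True)
    try:
        page = await browser.newPage()
        await page.goto(url)
        return await page.content()
    finally:
        await browser.close()


def fetch_rendered_html_sync(url):
    """Hypothetical convenience wrapper for callers that are not async."""
    return asyncio.run(fetch_rendered_html(url))
```

Calling fetch_rendered_html_sync('https://wrong.host.badssl.com') should then return the page source instead of raising the PageError from the question, assuming Chromium can be launched on your machine.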

Another way is to use Selenium.
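
A rough sketch of the Selenium route, assuming Chrome and a matching driver are available on the machine. The function name fetch_with_selenium is invented for this example; --ignore-certificate-errors is a Chrome command-line switch, not something specific to Selenium.

```python
def fetch_with_selenium(url):
    """Load url in headless Chrome, ignoring certificate errors, and return the page source."""
    # Imported lazily so the sketch can be loaded without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    # Tell Chrome to accept invalid or self-signed certificates.
    options.add_argument("--ignore-certificate-errors")

    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()
```

If the element you need is filled in by JavaScript after the page loads, you will probably also need an explicit wait before reading page_source.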

EDIT:
I added verify=False as a feature with a pull request (accepted). It is now possible to ignore the SSL error :)

It is not a parameter of get(); set it when you instantiate the session object:

session = HTMLSession(verify=False)
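
Putting it together with the code from the question, a sketch could look like the following. It assumes a requests_html release that already includes the verify argument on HTMLSession; the helper name scrape_rendered_text is made up for this example.

```python
def scrape_rendered_text(url, selector, payload=None):
    """POST to url, render the JavaScript, and return the text of the first match for selector."""
    # Imported lazily so the sketch can be loaded without requests_html installed.
    from requests_html import HTMLSession

    # verify=False is set on the session, so both the HTTP request
    # and the headless browser used by render() skip certificate checks.
    session = HTMLSession(verify=False)
    try:
        r = session.post(url, data=payload)
        r.html.render()
        element = r.html.find(selector, first=True)
        # find(..., first=True) returns None when nothing matches.
        return element.text if element is not None else None
    finally:
        session.close()
```

For the reproduction case in the question, this would be called as scrape_rendered_text('https://wrong.host.badssl.com', '#content').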

How to ignore an invalid SSL certificate with requests_html?
