Download an Image Using Selenium Webdriver in Python

2024/9/20 15:16:43

I am trying to download an image from a URL using Selenium Webdriver in Python. The site is protected by a login page, so I can't just save the URL contents using requests. I am able to get text from the site after logging in, but I can't figure out how to save an image.

After I log in to the site, I can do browser.save_screenshot(filename + '.png'), but that image is not the same size as the original.

The code that I have now is this:

browser = webdriver.Chrome('../chromedriver')
browser.get('www.example.com/login')
# send username and password, click submit
browser.get('www.example.com/123')
html = browser.page_source
printData(html)
# this url is an image file
browser.get('www.example.com/get_photo.php?id=123')
browser.save_screenshot(filename + '.png')

Ideally I would like to replace the save_screenshot() with something like

with open(filename + '.jpeg', 'w') as img:
    img.write(browser.download_current_image())

or even something like this, interacting with the popup menu

browser.right_click()
browser.down_arrow_key()
browser.return_key()

or simulating a keypress

browser.command_key()
browser.s_key()

This question gives the answers that I want, but not for Python. If there is a way to do any of the things suggested in that question (besides taking a screenshot) in Python, that would be a great solution.

Answer

Here is what I used to download an image from a URL behind a login page by logging in using Selenium Webdriver and then passing the cookies to requests to save the image:

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36"
}
s = requests.session()
s.headers.update(headers)
# copy the cookies from the logged-in Selenium browser into the requests session
for cookie in browser.get_cookies():
    c = {cookie['name']: cookie['value']}
    s.cookies.update(c)
r = s.get(imgurl, allow_redirects=True)
with open(filename + '.jpeg', 'wb') as f:
    f.write(r.content)

Thanks to AldoSuwandi for showing me how to do this in this post. I also used this site to help me figure out how to download an image using requests.
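
For reuse, the cookie hand-off can be wrapped in two small helpers. This is only a sketch under a few assumptions: the names cookies_to_dict and save_image are made up for illustration, and the session argument is assumed to behave like a requests.Session (a get() method returning an object with raise_for_status() and content). The status check is an addition that the snippet above does not have; it is there so that a failed login raises an error instead of quietly saving an error page under a .jpeg name.

```python
from pathlib import Path


def cookies_to_dict(selenium_cookies):
    """Flatten the list of cookie dicts from browser.get_cookies() into {name: value}."""
    return {cookie["name"]: cookie["value"] for cookie in selenium_cookies}


def save_image(session, url, path):
    """Fetch url with an already-authenticated session and write the bytes to path.

    `session` is assumed to look like requests.Session (hypothetical duck type):
    .get(url, allow_redirects=...) returns an object with
    .raise_for_status() and .content.
    Returns the number of bytes written.
    """
    response = session.get(url, allow_redirects=True)
    response.raise_for_status()  # raise on 4xx/5xx rather than saving the error body
    Path(path).write_bytes(response.content)
    return len(response.content)
```

With requests installed and the browser already logged in, usage would look roughly like this: create s = requests.Session(), call s.cookies.update(cookies_to_dict(browser.get_cookies())), then save_image(s, imgurl, filename + '.jpeg').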
