Python login page with pop up windows

2024/10/15 12:36:59

I want to access web pages and print their source code with Python. Most of them require a login first. I had a similar problem before and solved it with the following code, because the login form had fixed fields on the page that I could locate. Recently I needed to access another page, but this time the login appears in a pop-up window and I can't use the same method.

I have tried the Selenium module, but it has to open a browser window to do the job. Is there a method similar to cookielib that lets Python run the code in the background, without a visible browser window? Many thanks!

import cookielib
import urllib
import urllib2

# Store the cookies and create an opener that will hold them
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))

# Add our headers
opener.addheaders = [('User-agent', 'RedditTesting')]

# Install our opener (note that this changes the global opener to the one
# we just made, but you can also just call opener.open() if you want)
urllib2.install_opener(opener)

# The action/target from the form
authentication_url = 'https://ssl.reddit.com/post/login'

# Input parameters we are going to send
payload = {'op': 'login-main', 'user': '<username>', 'passwd': '<password>'}

# Use urllib to encode the payload
data = urllib.urlencode(payload)

# Build our Request object (supplying 'data' makes it a POST)
req = urllib2.Request(authentication_url, data)

# Make the request and read the response
resp = urllib2.urlopen(req)
contents = resp.read()


Answer

You can use Selenium with PhantomJS to get a headless browser. There is also Ghost.py, which uses WebKit to interpret the JavaScript. Both projects help you interact with the JavaScript content of web apps.

However, the pop-up appears to come from an HTTP authentication protocol, which here seems to be NTLM: https://en.wikipedia.org/wiki/NT_LAN_Manager

So you may want to look at this protocol and build a request based on it, instead of trying to enter your credentials in the pop-up.
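One way to check which protocol is behind the pop-up is to request the page without credentials and read the `WWW-Authenticate` headers of the 401 response. The following is only a minimal sketch under some assumptions: it is written for Python 3's standard library (`urllib.request`), not the Python 2 modules used in the question; the function names `parse_auth_schemes` and `offered_auth_schemes` are made up for illustration; and it assumes the server sends one challenge per header line.

```python
import urllib.error
import urllib.request


def parse_auth_schemes(header_values):
    """Return the lower-cased scheme name of each WWW-Authenticate value.

    Assumption: each header value carries a single challenge, such as
    'NTLM' or 'Basic realm="intranet"'. Several challenges packed into
    one comma-separated value are not split apart here.
    """
    schemes = []
    for value in header_values:
        # The scheme is the first whitespace-delimited token of the challenge.
        parts = value.strip().split(None, 1)
        if parts:
            schemes.append(parts[0].lower())
    return schemes


def offered_auth_schemes(url, timeout=10):
    """Request url without credentials and list the schemes the server offers."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            # No 401 came back, so the page did not ask for HTTP authentication.
            return []
    except urllib.error.HTTPError as err:
        if err.code != 401:
            raise
        # err.headers is an email.message.Message; get_all returns None
        # when the header is absent.
        return parse_auth_schemes(err.headers.get_all("WWW-Authenticate") or [])
```

If the returned list contains `ntlm`, the pop-up is the browser's own prompt for NTLM credentials rather than part of the page. The standard library has no NTLM handler, so the handshake would need a third-party package (requests-ntlm is one that exists for this purpose) or a hand-built implementation of the protocol.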

