Bypass rate limit for requests.get

2024/10/11 1:12:21

I want to constantly scrape a website - once every 3-5 seconds with

requests.get('http://www.example.com', headers=headers2, timeout=35).json()

But the example website has a rate limit and I want to bypass that. How can I do so?? I thought about doing it with proxies but was hoping there were some other ways?

Answer

You would have to do some very low level stuff. Utilizing likely socket and urllib2.
First do your research. How are they limiting your query rate? Is it by IP, or session based (server side cookie) or local cookies? I suggest going to the site manually as your first step of research, and using a web-developer tool to view all headers communicated.

One you figure this out, create a plan to manipulate it. Lets say it is session based, you could utilize multiple threads to control several individual instances of a scraper, each with unique sessions.

Now, if it is IP based, then you must spoof your IP which is much more complex.

https://en.xdnf.cn/q/118384.html

Related Q&A

ValueError when using if commands in function

Im creating some functions that I can call use keywords to call out specific functions,import scipy.integrate as integrate import numpy as npdef HubbleParam(a, model = "None"):if model == &qu…

Python consecutive subprocess calls with adb

I am trying to make a python script to check the contents of a database via adb. The thing is that in my code,only the first subprocess.call() is executed and the rest are ignored. Since i am fairly ne…

Django Page not found(404) error (Library not found)

This is my music\urls.py code:-#/music/ url(r^/$, views.index, name=index),#/music/712/ url(r^(?P<album_id>[0-9]+)/$, views.detail, name=detail),And this is my views.py code:-def index(request):…

Django. Create object ManyToManyField error

I am trying to write tests for my models.I try to create object like this:GiftEn.objects.create(gift_id=1,name="GiftEn",description="GiftEn description",short_description="Gift…

No module named discord

Im creating a discord bot, but when I try to import discord, I am getting this error: Traceback (most recent call last):File "C:\Users\Someone\Desktop\Discord bot\bot.py", line 2, in <modu…

Calendar with tkinter (print the selected date)

I got this code online in order to create a calendar with tkinter:""" Simple calendar using ttk Treeview together with calendar and datetime classes. """ import calendar i…

Python Tkinter scrollbar in multiple tabs

I learned how to make a scrollable frame by embedding the frame in a canvas and then adding a scrollbar to it like this:def __add_widget_features(self, feat_tab):table_frame = ttk.Frame(feat_tab)table_…

ValueError: setting an array element with a sequence error is showing

I am trying to convert this column in float type from object type but it is giving this error. import pandas as pddf = pd.DataFrame({col1: [[-0.8783137, 0.05478287, -0.08827557, 0.69203985, 0.06209986]…

Are there any datetime.tzinfo implementations in C?

Ive been working on a Python library that uses a C extension module to do ISO 8601 parsing.Part of that work requires the creation of tzinfo objects, which is by far the slowest part of the parse. Call…

How to open telnet as a textfile rather than a binary file

So I was trying to use the read_until method in telnet but then ran into the error: Traceback (most recent call last): File "c:\Users\Desktop\7DTD Bot\test.py", line 44, in <module> tn.…