I want to scrape a website continuously, once every 3-5 seconds, with
requests.get('http://www.example.com', headers=headers2, timeout=35).json()
But the site has a rate limit that I want to bypass. How can I do that? I thought about using proxies, but I was hoping there were other ways.
You would have to do some fairly low-level work, likely using socket and urllib2 (or, on Python 3, urllib.request) directly.
First, do your research: how are they limiting your query rate? Is it by IP, by session (a server-side cookie), or by local cookies? As a first step, visit the site manually and use your browser's developer tools to inspect all of the headers exchanged.
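You can do the same check from code. Here is a minimal sketch (reusing the URL and the headers2 variable from your question, which I am assuming is a dict of request headers) that dumps the response headers, where rate-limit hints such as Retry-After or X-RateLimit-* often show up:

import requests

# Assumed values, taken from the question; substitute your own.
url = 'http://www.example.com'
headers2 = {'User-Agent': 'Mozilla/5.0'}

resp = requests.get(url, headers=headers2, timeout=35)

# Print every response header; look for hints such as Retry-After,
# X-RateLimit-Remaining, or Set-Cookie, which reveal whether the
# limit is enforced per IP, per session, or via cookies.
for name, value in resp.headers.items():
    print(f'{name}: {value}')

print('Status:', resp.status_code)  # 429 usually means "Too Many Requests"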
Once you figure this out, create a plan to work around it.
Let's say it is session-based: you could use multiple threads to run several independent scraper instances, each with its own session, as sketched below.
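A minimal sketch of that idea, assuming the target URL from the question and a hypothetical worker count (each thread gets its own requests.Session, and therefore its own cookie jar and its own quota if the limit is session-based):

import threading
import requests

URL = 'http://www.example.com'  # assumed target from the question
N_WORKERS = 3                   # hypothetical: tune to stay under the limit

def scrape(worker_id: int) -> None:
    # A separate Session per thread means separate cookies,
    # so session-based rate limits apply to each one independently.
    session = requests.Session()
    session.headers.update({'User-Agent': f'scraper-{worker_id}'})
    resp = session.get(URL, timeout=35)
    print(worker_id, resp.status_code)

threads = [threading.Thread(target=scrape, args=(i,)) for i in range(N_WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()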
Now, if it is IP-based, then you must make your requests appear to come from different IP addresses, which is more complex; you cannot simply forge the source IP of a TCP connection, so in practice this means routing traffic through proxies.
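If you do go that route, here is a minimal rotating-proxy sketch using requests. The proxy addresses are placeholders; in practice they would come from a proxy provider or a pool you run yourself:

import itertools
import requests

URL = 'http://www.example.com'  # assumed target from the question

# Hypothetical proxy addresses; replace with real ones.
PROXIES = [
    'http://10.0.0.1:8080',
    'http://10.0.0.2:8080',
    'http://10.0.0.3:8080',
]
proxy_cycle = itertools.cycle(PROXIES)

def fetch() -> dict:
    # Round-robin through the pool so consecutive requests
    # reach the server from different IPs.
    proxy = next(proxy_cycle)
    resp = requests.get(
        URL,
        proxies={'http': proxy, 'https': proxy},
        timeout=35,
    )
    return resp.json()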