How do I capture all paginated data from an API? (Python)

2024/10/3 2:26:29

I'm using the requests package to hit an API (greenhouse.io). The API is paginated so I need to loop through the pages to get all the data I want. Using something like:

results = []
for i in range(1, 326 + 1):
    response = requests.get(url, auth=(username, password), params={'page': i, 'per_page': 100})
    if response.status_code == 200:
        results += response.json()

I know there are 326 pages by hitting the headers attribute:

In [8]:
response.headers['link']
Out[8]:
'<https://harvest.greenhouse.io/v1/applications?page=3&per_page=100>; rel="next",<https://harvest.greenhouse.io/v1/applications?page=1&per_page=100>; rel="prev",<https://harvest.greenhouse.io/v1/applications?page=326&per_page=100>; rel="last"'

Is there any way to extract this number automatically? Using the requests package? Or do I need to use regex or something?

Alternatively, should I somehow use a while loop to get all this data? What is the best way? Any thoughts?

Answer

The Python requests library (http://docs.python-requests.org/en/latest/) can help here. It parses the Link header for you and exposes it as response.links, so no regex is needed. The basic steps are (1) make the first request and grab the links from the header, and then (2) keep requesting the "next" link until you reach the last page.

import requests

results = []
auth = ('APIKEY', '')
response = requests.get('https://harvest.greenhouse.io/v1/applications', auth=auth)
results.extend(response.json())

# response.links maps each rel value to a dict that holds the link's URL.
# Keep following rel="next" until the API stops sending one (the last page).
while 'next' in response.links:
    response = requests.get(response.links['next']['url'], auth=auth)
    results.extend(response.json())
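If you still want the total page count up front (for a progress bar, say), one option is to read the `page` query parameter from the rel="last" URL. The sketch below is only an illustration: `page_number` is a hypothetical helper name, and it assumes the API always puts the page index in a `page` query parameter, as the header in the question suggests.

```python
from urllib.parse import parse_qs, urlparse


def page_number(url):
    """Return the integer value of the `page` query parameter in url."""
    query = parse_qs(urlparse(url).query)
    return int(query["page"][0])


# With a live response this URL would come from response.links["last"]["url"];
# here it is copied from the header shown in the question.
last_url = "https://harvest.greenhouse.io/v1/applications?page=326&per_page=100"
last_page = page_number(last_url)
print(last_page)
```

Following the "next" link is usually the safer choice, since it keeps working even if the number of pages changes while you are downloading.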