Using my answer to my own question on how to download files from a public Google Drive, I previously managed to download images by their IDs from a public drive with a Python script and the Google Drive API v3, using the following block of code:
from google_auth_oauthlib.flow import Flow, InstalledAppFlow
from googleapiclient.discovery import build
from googleapiclient.http import MediaFileUpload, MediaIoBaseDownload
from google.auth.transport.requests import Request
import io
import re
SCOPES = ['https://www.googleapis.com/auth/drive']
CLIENT_SECRET_FILE = "myjson.json"
authorized_port = 6006 # authorize URI redirect on the console
flow = InstalledAppFlow.from_client_secrets_file(CLIENT_SECRET_FILE, SCOPES)
cred = flow.run_local_server(port=authorized_port)
drive_service = build("drive", "v3", credentials=cred)
# Drive file IDs can also contain "_" and "-", so the character class includes them
regex = r"(?<=https://drive\.google\.com/file/d/)[a-zA-Z0-9_-]+"
for i, l in enumerate(links_to_download):
    url = l
    file_id = re.search(regex, url)[0]
    request = drive_service.files().get_media(fileId=file_id)
    fh = io.FileIO(f"file_{i}", mode='wb')
    downloader = MediaIoBaseDownload(fh, request)
    done = False
    while done is False:
        status, done = downloader.next_chunk()
        print("Download %d%%." % int(status.progress() * 100))
In the meantime I discovered pydrive and pydrive2, two wrappers around the Google Drive API v2 that make useful things such as listing the files in a folder easy, and that let me do basically the same thing with a lighter syntax:
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
import io
import re
CLIENT_SECRET_FILE = "client_secrets.json"
gauth = GoogleAuth()
gauth.LocalWebserverAuth()
drive = GoogleDrive(gauth)
# Drive file IDs can also contain "_" and "-", so the character class includes them
regex = r"(?<=https://drive\.google\.com/file/d/)[a-zA-Z0-9_-]+"
for i, l in enumerate(links_to_download):
    url = l
    file_id = re.search(regex, url)[0]
    file_handle = drive.CreateFile({'id': file_id})
    file_handle.GetContentFile(f"file_{i}")
However, whether I use pydrive or the raw API, I can no longer download the same files. Instead I am met with:
googleapiclient.errors.HttpError: <HttpError 404 when requesting https://www.googleapis.com/drive/v3/files/fileID?alt=media returned "File not found: fileID.". Details: "[{'domain': 'global', 'reason': 'notFound', 'message': 'File not found: fileID.', 'locationType': 'parameter', 'location': 'fileId'}]">
I tried everything I could think of and registered 3 different apps in the Google console. It might (or might not) be a question of scoping (see for instance this answer, about apps that only have access to files in my own Google Drive or to files created by the app itself). However, I did not have this issue before (last year).
In the Google console, explicitly granting https://www.googleapis.com/auth/drive as a scope to the API requires filling in a ton of fields: the application's website, terms of use, privacy policy, authorized domains, and YouTube videos explaining the app. However, I will be the sole user of this script.
So I could only explicitly grant the following scopes:
/auth/drive.appdata
/auth/drive.file
/auth/drive.install
Is it because of scoping? Is there a solution that doesn't require creating a homepage and a YouTube video?
EDIT 1:
Here is an example of links_to_download:
links_to_download = ["https://drive.google.com/file/d/fileID/view?usp=drivesdk&resourcekey=0-resourceKeyValue"]
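A link of this shape carries two pieces of information: the file ID in the path and a resourcekey query parameter. The following is a minimal sketch of pulling both out with the standard library, assuming every link follows the /file/d/<id>/ pattern shown above. parse_drive_link is a hypothetical helper name, not part of pydrive or the Google client, and the code only parses the link; it does not send the key anywhere.

```python
import re
from urllib.parse import urlparse, parse_qs

# Hypothetical helper, not part of pydrive or google-api-python-client.
# Assumes the ID is the path segment right after /file/d/.
_FILE_ID_RE = re.compile(r"/file/d/([A-Za-z0-9_-]+)")


def parse_drive_link(url):
    """Return (file_id, resource_key) for a /file/d/<id>/ link.

    resource_key is None when the link has no resourcekey parameter.
    """
    parsed = urlparse(url)
    match = _FILE_ID_RE.search(parsed.path)
    if match is None:
        raise ValueError(f"no file ID found in {url!r}")
    # parse_qs maps each query parameter to a list of values
    resource_key = parse_qs(parsed.query).get("resourcekey", [None])[0]
    return match.group(1), resource_key


link = "https://drive.google.com/file/d/fileID/view?usp=drivesdk&resourcekey=0-resourceKeyValue"
file_id, resource_key = parse_drive_link(link)
print(file_id, resource_key)
```

Compared with the lookbehind regex, parsing the URL first keeps the query string out of the ID match and makes the resource key available separately.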
EDIT 2: It is very unstable: sometimes it works without a hitch, sometimes it doesn't. When I relaunch the script multiple times I get different results. Retry policies help to a certain extent, but sometimes it fails repeatedly for hours.
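A retry policy of the kind mentioned above can be sketched as exponential backoff around the download call. This is only an illustration under assumptions: with_retries, max_attempts and base_delay are hypothetical names and values, and flaky_download stands in for the real Drive request so that the example runs without credentials.

```python
import time


def with_retries(func, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call func() until it succeeds or max_attempts is reached.

    Waits base_delay * 2**attempt seconds after each failed attempt.
    """
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            # Assumption: a real script would likely narrow this to
            # googleapiclient.errors.HttpError instead of Exception.
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)


# Stand-in for a download that fails twice before succeeding.
calls = {"count": 0}


def flaky_download():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("File not found")
    return "ok"


delays = []  # record the waits instead of actually sleeping
result = with_retries(flaky_download, sleep=delays.append)
print(result, calls["count"], delays)
```

Passing sleep as a parameter keeps the example fast to run; in a real script the default time.sleep would apply the waits.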