How to get list_blobs to behave like gsutil

2024/9/30 23:35:56

I would like to only get the first level of a fake folder structure on GCS.

If I run e.g.:

gsutil ls 'gs://gcp-public-data-sentinel-2/tiles/' I get a list like this: gs://gcp-public-data-sentinel-2/tiles/01/ gs://gcp-public-data-sentinel-2/tiles/02/ gs://gcp-public-data-sentinel-2/tiles/03/ gs://gcp-public-data-sentinel-2/tiles/04/ gs://gcp-public-data-sentinel-2/tiles/05/ gs://gcp-public-data-sentinel-2/tiles/06/ gs://gcp-public-data-sentinel-2/tiles/07/ gs://gcp-public-data-sentinel-2/tiles/08/ gs://gcp-public-data-sentinel-2/tiles/09/ gs://gcp-public-data-sentinel-2/tiles/10/ gs://gcp-public-data-sentinel-2/tiles/11/ gs://gcp-public-data-sentinel-2/tiles/12/ gs://gcp-public-data-sentinel-2/tiles/13/ gs://gcp-public-data-sentinel-2/tiles/14/ gs://gcp-public-data-sentinel-2/tiles/15/ . . .

Running code like the following in the Python API give me an empty result:

from google.cloud import storage
bucket_name = 'gcp-public-data-sentinel-2'
prefix = 'tiles/'
storage_client = storage.Client()
bucket = storage_client.get_bucket(bucket_name)
for blob in bucket.list_blobs(max_results=10, prefix=prefix,delimiter='/'):print blob.name

If I don't use the delimiter option I get all the results in the bucket which is not very useful.

Answer

Maybe not the best way, but, inspired by this comment on the official repository:

iterator = bucket.list_blobs(delimiter='/', prefix=prefix)
response = iterator._get_next_page_response()
for prefix in response['prefixes']:print('gs://'+bucket_name+'/'+prefix)

Gives:

gs://gcp-public-data-sentinel-2/tiles/01/
gs://gcp-public-data-sentinel-2/tiles/02/
gs://gcp-public-data-sentinel-2/tiles/03/
gs://gcp-public-data-sentinel-2/tiles/04/
gs://gcp-public-data-sentinel-2/tiles/05/
gs://gcp-public-data-sentinel-2/tiles/06/
gs://gcp-public-data-sentinel-2/tiles/07/
gs://gcp-public-data-sentinel-2/tiles/08/
gs://gcp-public-data-sentinel-2/tiles/09/
gs://gcp-public-data-sentinel-2/tiles/10/
...
https://en.xdnf.cn/q/71030.html

Related Q&A

How to avoid gcc warning in Python C extension when using Py_BEGIN_ALLOW_THREADS

The simplest way to manipulate the GIL in Python C extensions is to use the macros provided:my_awesome_C_function() {blah;Py_BEGIN_ALLOW_THREADS// do stuff that doesnt need the GILif (should_i_call_ba…

Pairwise operations (distance) on two lists in numpy

I have two lists of coordinates:l1 = [[x,y,z],[x,y,z],[x,y,z],[x,y,z],[x,y,z]] l2 = [[x,y,z],[x,y,z],[x,y,z]]I want to find the shortest pairwise distance between l1 and l2. Distance between two coordi…

Bad results when undistorting points using OpenCV in Python

Im having trouble undistorting points on an image taken with a calibrated camera using the Python bindings for OpenCV. The undistorted points have entirely different coordinates than the original point…

Pandas - Groupby and create new DataFrame?

This is my situation - In[1]: data Out[1]: Item Type 0 Orange Edible, Fruit 1 Banana Edible, Fruit 2 Tomato Edible, Vegetable 3 Laptop Non Edible, Elec…

Can pytest fixtures and confest.py modules be shared across packages?

Suppose I have packageA which provides a class usefulClass, pytest fixtures in a test_stuff.py module, and test configurations in a conftest.py module. Moreover, suppose that I have packageBand package…

Sort a list by presence of items in another list

Suppose I have two lists:a = [30, 10, 90, 1111, 17] b = [60, 1201, 30, 17, 900]How would you sort this most efficiently, such that:list b is sorted with respect to a. Unique elements in b should be pla…

Sample from a multivariate t distribution python

I am wondering if there is a function for sampling from a multivariate student t-distribution in Python. I have the mean vector with 14 elements, the 14x14 covariance matrix and the degrees of freedom …

Why is this Jinja nl2br filter escaping brs but not ps?

I am attempting to implement this Jinja nl2br filter. It is working correctly except that the <br>s it adds are being escaped. This is weird to me because the <p>s are not being escaped and…

select certain monitor for going fullscreen with gtk

I intend to change the monitor where I show a fullscreen window. This is especially interesting when having a projector hooked up.Ive tried to use fullscreen_on_monitor but that doesnt produce any visi…

Load Excel add-in using win32com from Python

Ive seen from various questions on here that if an instance of Excel is opened from Python using:xl = win32com.client.gencache.EnsureDispatch(Excel.Application) xl.Visible = True wb = xl.Workbooks.Open…