I have a bucket with many subfolders, and I use this function to get the blobs, but I need to get the file names. How can I do that?
from google.cloud import storage

def list_blobs_with_prefix(bucket_name, prefix, delimiter=None):
    storage_client = storage.Client()
    blobs = storage_client.list_blobs(bucket_name, prefix=prefix, delimiter=delimiter)
    return blobs
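I need the file names without the folder path, so that I can use them in another function that tries to download each file and place it in a temporary folder.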
This script gives you a list containing only the file names in the bucket, without any folder/subfolder path:
from google.cloud import storage

client = storage.Client()
BUCKET_NAME = 'thehotbucket'
bucket = client.get_bucket(BUCKET_NAME)
blobs = bucket.list_blobs()

for blob in blobs:
    # Keep only the last path segment (the part after the final '/').
    filename = blob.name.split('/')[-1]
    # Names ending in '/' are folder placeholders and yield an empty string.
    if filename != "":
        print(filename)
Given any list of string paths (this is not specific to your application), the file names can be extracted like this:
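from pathlib import Path

# Option 1: plain string splitting on the separator.
def get_files(*paths: str, sep='/') -> list[str]:
    return [path.rpartition(sep)[-1] for path in paths]

# Option 2: let pathlib extract the final path component.
def get_files(*paths: str) -> list[str]:
    return [Path(path).name for path in paths]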
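The two approaches agree on ordinary object names but differ on names that end in a slash. Here is a minimal sketch on made-up paths to show this. The helper names and sample paths are hypothetical, and `PurePosixPath` is swapped in for `Path` as an assumption: object names always use '/', whereas `Path` on Windows would also split on backslashes.

```python
from pathlib import PurePosixPath


def basename_rpartition(path: str, sep: str = "/") -> str:
    # Everything after the last separator; empty if the path ends with it.
    return path.rpartition(sep)[-1]


def basename_pathlib(path: str) -> str:
    # pathlib drops a trailing slash before taking the final component.
    return PurePosixPath(path).name


# Hypothetical object names, including a "folder" placeholder.
samples = ["reports/2021/summary.csv", "top.txt", "folder/sub/"]
for sample in samples:
    print(repr(sample), repr(basename_rpartition(sample)), repr(basename_pathlib(sample)))
```

For "folder/sub/" the string version returns an empty string while the pathlib version returns "sub", so if the bucket contains folder placeholder objects, filter them out explicitly whichever variant you choose.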
For your specific case, this becomes:
from pathlib import Path
from typing import Iterable

# Option 1: plain string splitting on '/'.
def get_files(blobs: Iterable) -> list[str]:
    return [blob.name.rpartition('/')[-1] for blob in blobs]

# Option 2: let pathlib extract the final path component.
def get_files(blobs: Iterable) -> list[str]:
    return [Path(blob.name).name for blob in blobs]
In addition, you can easily strip the file suffix from the names:
from pathlib import Path
from typing import Iterable

def get_files(blobs: Iterable) -> list[str]:
    # stem is the final component without its last suffix; unlike
    # name.rpartition(suffix), it also works when there is no suffix.
    return [Path(blob.name).stem for blob in blobs]
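To tie this back to the original goal (downloading each file into a temporary folder under its bare file name), here is one possible sketch. The function name `download_to_temp` is hypothetical. It assumes each item has a `name` attribute and a `download_to_filename` method, as `google.cloud.storage` blobs do, and it uses `PurePosixPath` because object names always use '/'.

```python
import tempfile
from pathlib import Path, PurePosixPath
from typing import Iterable


def download_to_temp(blobs: Iterable) -> list[Path]:
    """Download each blob into a fresh temporary directory under its base name."""
    tmp_dir = Path(tempfile.mkdtemp())
    saved = []
    for blob in blobs:
        # Names ending in '/' are "folder" placeholders, not files.
        if blob.name.endswith("/"):
            continue
        filename = PurePosixPath(blob.name).name
        if not filename:
            continue
        target = tmp_dir / filename
        # Assumed to behave like Blob.download_to_filename in google-cloud-storage.
        blob.download_to_filename(str(target))
        saved.append(target)
    return saved
```

You could call it as `download_to_temp(list_blobs_with_prefix(bucket_name, prefix))`. Note that two blobs in different subfolders with the same file name would overwrite each other in this sketch, and the temporary directory is not cleaned up automatically.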