Finding the size of first-level prefixes in an S3 bucket, including versions, using Boto3 and Python



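I'm new to AWS and Python. I'm trying to compute the total bucket size that the Metrics tab in the console shows, rather than computing the size of one folder at a time within a given bucket. I tried to get it by setting up an inventory configuration, but that didn't show what I was looking for.

I have an S3 bucket named my_bucket with versioning enabled.
It contains 100 objects and 26 subfolders (each subfolder holds more than 100,000 objects, and each object has at least two versions).

What I'm trying to do: compute and display the total size, including versions, of each of the 180 subfolders.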

A  Size 1GB  
B  Size 10TB    
.  
.  
.  
Z Size 13TB

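What I tried: I'm looking for a solution that combines
the profile-based authentication and bucket.object_versions from Link 1
with the first-level folder size calculation from Link 2,
while also taking versions into account (Link 2 does not handle versions).

Link 1: https://stackoverflow.com/a/58125684/4590025
Link 2: https://stackoverflow.com/a/49763268/4590025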

import boto3
PROFILE = "my_profile"
BUCKET = "my_bucket"
session = boto3.Session(profile_name = PROFILE)
s3 = session.resource('s3')
bucket = s3.Bucket(BUCKET)
#bucket.object_versions.do_something_with_it

conn = boto3.client('s3')
top_level_folders = dict()
for key in conn.list_objects(Bucket='my_bucket')['Contents']:
    folder = key['Key'].split('/')[0]
    print("Key %s in folder %s. %d bytes" % (key['Key'], folder, key['Size']))
    if folder in top_level_folders:
        top_level_folders[folder] += key['Size']
    else:
        top_level_folders[folder] = key['Size']

for folder, size in top_level_folders.items():
    print("Folder: %s, size: %d" % (folder, size))

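I also looked at https://stackoverflow.com/a/48867829, but I don't know how to combine the two. Currently, when I run the script, I get the following error even though I set up the session: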

Traceback (most recent call last):
  File ".\folder_size.py", line 17, in <module>
    for key in conn.list_objects(Bucket='my_bucket')['Contents']:
  File "C:\Users\ginger\AppData\Local\Programs\Python\Python37\lib\site-packages\botocore\client.py", line 316, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "C:\Users\ginger\AppData\Local\Programs\Python\Python37\lib\site-packages\botocore\client.py", line 622, in _make_api_call
    operation_model, request_dict, request_context)
  File "C:\Users\ginger\AppData\Local\Programs\Python\Python37\lib\site-packages\botocore\client.py", line 641, in _make_request
    return self._endpoint.make_request(operation_model, request_dict)
  File "C:\Users\ginger\AppData\Local\Programs\Python\Python37\lib\site-packages\botocore\endpoint.py", line 102, in make_request
    return self._send_request(request_dict, operation_model)
  File "C:\Users\ginger\AppData\Local\Programs\Python\Python37\lib\site-packages\botocore\endpoint.py", line 132, in _send_request
    request = self.create_request(request_dict, operation_model)
  File "C:\Users\ginger\AppData\Local\Programs\Python\Python37\lib\site-packages\botocore\endpoint.py", line 116, in create_request
    operation_name=operation_model.name)
  File "C:\Users\ginger\AppData\Local\Programs\Python\Python37\lib\site-packages\botocore\hooks.py", line 356, in emit
    return self._emitter.emit(aliased_event_name, **kwargs)
  File "C:\Users\ginger\AppData\Local\Programs\Python\Python37\lib\site-packages\botocore\hooks.py", line 228, in emit
    return self._emit(event_name, kwargs)
  File "C:\Users\ginger\AppData\Local\Programs\Python\Python37\lib\site-packages\botocore\hooks.py", line 211, in _emit
    response = handler(**kwargs)
  File "C:\Users\ginger\AppData\Local\Programs\Python\Python37\lib\site-packages\botocore\signers.py", line 90, in handler
    return self.sign(operation_name, request)
  File "C:\Users\ginger\AppData\Local\Programs\Python\Python37\lib\site-packages\botocore\signers.py", line 160, in sign
    auth.add_auth(request)
  File "C:\Users\ginger\AppData\Local\Programs\Python\Python37\lib\site-packages\botocore\auth.py", line 357, in add_auth
    raise NoCredentialsError
botocore.exceptions.NoCredentialsError: Unable to locate credentials
PS C:\Users\ginger\test>

The problem is that the program uses:

conn = boto3.client('s3')

This creates the client from boto3's default session, so it ignores the profile that was set up earlier with:

session = boto3.Session(profile_name = PROFILE)

So, if you want to create an S3 client that uses the profile, create it from the session:

conn = session.client('s3')

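To avoid pagination problems (a single list_objects call returns at most 1,000 keys), you can use the resource interface to retrieve all objects: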

# "obj" rather than "object", to avoid shadowing the Python built-in
for obj in bucket.objects.all():
    folder = obj.key.split('/')[0]
    print("Key %s in folder %s. %d bytes" % (obj.key, folder, obj.size))
    ...
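
Since the question also asks for versions to be counted, here is a minimal sketch of one way to do it. It is not a tested solution for a bucket of this size. It assumes the profile and bucket names from the question, and uses the client's `list_object_versions` paginator in place of `bucket.objects.all()`. The helper names (`top_level_prefix`, `sum_sizes_by_prefix`, `iter_object_versions`, `format_size`) are made up for illustration. Unlike the loop above, objects at the bucket root are grouped under an empty prefix instead of each becoming its own "folder".

```python
from collections import defaultdict


def top_level_prefix(key):
    """Return the first path segment of a key, or '' for keys at the bucket root."""
    return key.split("/", 1)[0] if "/" in key else ""


def sum_sizes_by_prefix(versions):
    """Sum 'Size' per top-level prefix over an iterable of dicts with 'Key' and 'Size'."""
    totals = defaultdict(int)
    for version in versions:
        totals[top_level_prefix(version["Key"])] += version["Size"]
    return dict(totals)


def format_size(num_bytes):
    """Format a byte count using binary units (hypothetical helper for display only)."""
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if num_bytes < 1024 or unit == "TiB":
            return "%.1f %s" % (num_bytes, unit)
        num_bytes /= 1024


def iter_object_versions(profile_name, bucket_name):
    """Yield every object version in the bucket, following pagination."""
    import boto3  # imported here so the helpers above can be used without boto3

    session = boto3.Session(profile_name=profile_name)
    client = session.client("s3")  # client built from the session, so the profile is used
    paginator = client.get_paginator("list_object_versions")
    for page in paginator.paginate(Bucket=bucket_name):
        # Delete markers are returned separately under 'DeleteMarkers' and carry no size.
        yield from page.get("Versions", [])


if __name__ == "__main__":
    # "my_profile" and "my_bucket" are the placeholder names from the question.
    totals = sum_sizes_by_prefix(iter_object_versions("my_profile", "my_bucket"))
    for folder, size in sorted(totals.items()):
        print("Folder: %s, size: %s" % (folder or "(root)", format_size(size)))
```

With millions of object versions this walks the entire listing, so it may take a long time to run. The summing is kept separate from the S3 call so that it can be checked against a small hand-made list first.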
