List all S3 keys between a start date (inclusive) and an end date (exclusive)



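Is there a way to list all S3 files between specified dates? The start date can be passed as the prefix, but I can't figure out how to handle the end date. Please help.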

import boto3

def get_matching_s3_objects(bucket, prefix='', suffix=''):
    """
    Generate objects in an S3 bucket.
    :param bucket: Name of the S3 bucket.
    :param prefix: Only fetch objects whose key starts with
        this prefix (optional).
    :param suffix: Only fetch objects whose key ends with
        this suffix (optional).
    """
    s3 = boto3.client('s3')
    kwargs = {'Bucket': bucket}
    if isinstance(prefix, str):
        kwargs['Prefix'] = prefix
    while True:
        # The S3 API response is a large blob of metadata.
        # 'Contents' contains information about the listed objects.
        resp = s3.list_objects_v2(**kwargs)
        try:
            contents = resp['Contents']
        except KeyError:
            return
        for obj in contents:
            key = obj['Key']
            if key.startswith(prefix) and key.endswith(suffix):
                yield obj
        # The S3 API is paginated, returning up to 1000 keys at a time.
        # Pass the continuation token into the next request, until we
        # reach the final page (when this field is missing).
        try:
            kwargs['ContinuationToken'] = resp['NextContinuationToken']
        except KeyError:
            break

def get_matching_s3_keys(bucket, prefix='', suffix=''):
    """
    Generate the keys in an S3 bucket.
    :param bucket: Name of the S3 bucket.
    :param prefix: Only fetch keys that start with this prefix (optional).
    :param suffix: Only fetch keys that end with this suffix (optional).
    """
    for obj in get_matching_s3_objects(bucket, prefix, suffix):
        yield obj['Key']

AFAIK there is no way to filter by date directly with boto3. The only filters available are Bucket, Delimiter, EncodingType, Marker, MaxKeys, Prefix and RequestPayer.
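That said, if the date is part of the key itself, as the question suggests (say keys shaped like logs/2021-01-01/..., a made-up layout), a key range can stand in for a date range. Here is a minimal sketch, assuming a general purpose bucket where list_objects_v2 returns keys in ascending UTF-8 binary order. It uses the StartAfter parameter of list_objects_v2 for the lower bound and stops client-side at the first key that sorts at or after the upper bound. The name iter_keys_between is my own, and client is assumed to be a boto3.client('s3').

```python
def iter_keys_between(client, bucket, start_key, end_key):
    """Yield keys k with start_key < k < end_key, in listing order.

    `client` is assumed to behave like boto3.client('s3').
    """
    kwargs = {'Bucket': bucket, 'StartAfter': start_key}
    while True:
        resp = client.list_objects_v2(**kwargs)
        for obj in resp.get('Contents', []):
            if obj['Key'] >= end_key:
                # Keys arrive in ascending order, so nothing later can match.
                return
            yield obj['Key']
        # Keep paging until S3 stops handing back a continuation token.
        token = resp.get('NextContinuationToken')
        if token is None:
            return
        kwargs['ContinuationToken'] = token

# Hypothetical usage, with the end date exclusive:
#   s3 = boto3.client('s3')
#   keys = list(iter_keys_between(s3, 'some_bucket',
#                                 'logs/2021-01-01', 'logs/2021-01-08'))
```

StartAfter is exclusive, so a key exactly equal to start_key is skipped. With date prefixes that rarely matters, because real keys are longer than the prefix. This only helps when dates are encoded in the key names; filtering on modification time still has to happen client-side.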

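As a result, you need to loop over the keys/objects and compare your start/end dates against each object's last_modified datetime value. To get all objects in a specific bucket between one week ago (inclusive) and today (exclusive), I would do something like this: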

from datetime import datetime, timedelta, timezone
import boto3

# NOTE: We need timezone-aware objects, because the S3 object's
# last_modified value is timezone-aware too.
today = datetime.now(timezone.utc)
since = today - timedelta(weeks=1)
# WARNINGS:
# - You may need to provide proper credentials when calling boto3.resource...
# - Error handling will need to be added, in case the bucket doesn't exist.
# Each item is an ObjectSummary; use o.key if you only want the key names.
keys = [
    o for o in boto3.resource('s3').Bucket(name='some_bucket').objects.all()
    if since <= o.last_modified < today
]
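The same comparison can be wrapped up for arbitrary start/end dates and combined with the client-style listing from the question. This is only a sketch: filter_by_last_modified, iter_objects and utc_day are names I made up, client is assumed to be a boto3.client('s3'), and the objects are the dicts found under 'Contents' in a list_objects_v2 response, which carry a timezone-aware 'LastModified'.

```python
from datetime import datetime, timezone

def utc_day(year, month, day):
    """Return midnight UTC of the given day as a timezone-aware datetime."""
    return datetime(year, month, day, tzinfo=timezone.utc)

def filter_by_last_modified(objects, start, end):
    """Yield objects with start <= LastModified < end.

    `start` and `end` must be timezone-aware, otherwise comparing them
    with S3's timezone-aware timestamps raises TypeError.
    """
    for obj in objects:
        if start <= obj['LastModified'] < end:
            yield obj

def iter_objects(client, bucket, prefix=''):
    """Yield every object dict under `prefix`, following pagination."""
    paginator = client.get_paginator('list_objects_v2')
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # An empty listing has no 'Contents' entry at all.
        yield from page.get('Contents', [])

# Hypothetical usage:
#   s3 = boto3.client('s3')
#   objs = iter_objects(s3, 'some_bucket')
#   keys = [o['Key'] for o in
#           filter_by_last_modified(objs, utc_day(2021, 1, 1), utc_day(2021, 1, 8))]
```

Every object under the prefix is still listed and the date check runs locally, so a narrower prefix is the main way to cut down the number of requests.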
