How can I save the converted files to the same source directory in Amazon S3, using this Python code for media conversion?



I have the convert.py code from an Amazon sample, which saves converted videos to an S3 bucket. I'm not familiar with Python. How do I change the code so that it saves to the same directory as the source file, instead of to 'assets/' + assetID + '/MP4/' + sourceS3Basename?

#!/usr/bin/env python
import glob
import json
import os
import uuid
import boto3
import datetime
import random
from botocore.client import ClientError


def handler(event, context):
    assetID = str(uuid.uuid4())
    sourceS3Bucket = event['Records'][0]['s3']['bucket']['name']
    sourceS3Key = event['Records'][0]['s3']['object']['key']
    sourceS3 = 's3://' + sourceS3Bucket + '/' + sourceS3Key
    sourceS3Basename = os.path.splitext(os.path.basename(sourceS3))[0]
    destinationS3 = 's3://' + os.environ['DestinationBucket']
    destinationS3basename = os.path.splitext(os.path.basename(destinationS3))[0]
    mediaConvertRole = os.environ['MediaConvertRole']
    region = os.environ['AWS_DEFAULT_REGION']
    statusCode = 200
    body = {}
    # Use MediaConvert SDK UserMetadata to tag jobs with the assetID
    # Events from MediaConvert will have the assetID in UserMetadata
    jobMetadata = {'assetID': assetID}
    print(json.dumps(event))
    try:
        # Job settings are in the lambda zip file in the current working directory
        with open('job.json') as json_data:
            jobSettings = json.load(json_data)
        print(jobSettings)
        # Get the account-specific mediaconvert endpoint for this region
        mc_client = boto3.client('mediaconvert', region_name=region)
        endpoints = mc_client.describe_endpoints()
        # Add the account-specific endpoint to the client session
        client = boto3.client('mediaconvert', region_name=region, endpoint_url=endpoints['Endpoints'][0]['Url'], verify=False)
        # Update the job settings with the source video from the S3 event and destination
        # paths for converted videos
        jobSettings['Inputs'][0]['FileInput'] = sourceS3
        S3KeyHLS = 'assets/' + assetID + '/HLS/' + sourceS3Basename
        jobSettings['OutputGroups'][0]['OutputGroupSettings']['HlsGroupSettings']['Destination'] = destinationS3 + '/' + S3KeyHLS
        S3KeyWatermark = 'assets/' + assetID + '/MP4/' + sourceS3Basename
        jobSettings['OutputGroups'][1]['OutputGroupSettings']['FileGroupSettings']['Destination'] = destinationS3 + '/' + S3KeyWatermark
        S3KeyThumbnails = 'assets/' + assetID + '/Thumbnails/' + sourceS3Basename
        jobSettings['OutputGroups'][2]['OutputGroupSettings']['FileGroupSettings']['Destination'] = destinationS3 + '/' + S3KeyThumbnails
        print('jobSettings:')
        print(json.dumps(jobSettings))
        # Convert the video using AWS Elemental MediaConvert
        job = client.create_job(Role=mediaConvertRole, UserMetadata=jobMetadata, Settings=jobSettings)
        print(json.dumps(job, default=str))
    except Exception as e:
        print('Exception: %s' % e)
        statusCode = 500
        raise
    finally:
        return {
            'statusCode': statusCode,
            'body': json.dumps(body),
            'headers': {'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*'}
        }

You can use the same S3 source bucket in place of the S3 destination bucket, and use a regular expression (regex) to extract the S3 object's base path from its key.
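
For illustration, here's how that extraction behaves on a sample key (the key name is made up; for keys like this, os.path.dirname gives the same result):

import os
import re

sourceS3Key = 'videos/2023/clip.mp4'        # hypothetical object key
print(re.findall('^(.*)/', sourceS3Key))    # ['videos/2023'] - greedy match up to the last '/'
print(os.path.dirname(sourceS3Key))         # 'videos/2023'
print(re.findall('^(.*)/', 'clip.mp4'))     # [] - an object at the bucket root has no prefix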

Below is your code, with enhancements that accomplish what you asked for:

#!/usr/bin/env python
import glob
import json
import os
import uuid
import boto3
import datetime
import random
import re
from botocore.client import ClientError


def handler(event, context):
    assetID = str(uuid.uuid4())
    sourceS3Bucket = event['Records'][0]['s3']['bucket']['name']
    sourceS3Key = event['Records'][0]['s3']['object']['key']
    sourceS3 = 's3://' + sourceS3Bucket + '/' + sourceS3Key
    sourceS3Basename = os.path.splitext(os.path.basename(sourceS3))[0]
    # Write back to the source bucket instead of a separate destination bucket
    destinationS3 = 's3://' + sourceS3Bucket
    # Extract the key prefix (the "directory" the source object sits in), if any
    sourceS3KeyPath = re.findall('^(.*)/', str(sourceS3Key))
    if len(sourceS3KeyPath) == 1:
        destinationS3 = destinationS3 + '/' + str(sourceS3KeyPath[0])
    destinationS3basename = sourceS3Basename
    mediaConvertRole = os.environ['MediaConvertRole']
    region = os.environ['AWS_DEFAULT_REGION']
    statusCode = 200
    body = {}
    # Use MediaConvert SDK UserMetadata to tag jobs with the assetID
    # Events from MediaConvert will have the assetID in UserMetadata
    jobMetadata = {'assetID': assetID}
    print(json.dumps(event))
    try:
        # Job settings are in the lambda zip file in the current working directory
        with open('job.json') as json_data:
            jobSettings = json.load(json_data)
        print(jobSettings)
        # Get the account-specific mediaconvert endpoint for this region
        mc_client = boto3.client('mediaconvert', region_name=region)
        endpoints = mc_client.describe_endpoints()
        # Add the account-specific endpoint to the client session
        client = boto3.client('mediaconvert', region_name=region, endpoint_url=endpoints['Endpoints'][0]['Url'], verify=False)
        # Update the job settings with the source video from the S3 event and destination
        # paths for converted videos
        jobSettings['Inputs'][0]['FileInput'] = sourceS3
        S3KeyHLS = 'hls'
        jobSettings['OutputGroups'][0]['OutputGroupSettings']['HlsGroupSettings']['Destination'] = destinationS3 + '/' + S3KeyHLS
        S3KeyWatermark = 'watermark'
        jobSettings['OutputGroups'][1]['OutputGroupSettings']['FileGroupSettings']['Destination'] = destinationS3 + '/' + S3KeyWatermark
        S3KeyThumbnails = 'thumbnails'
        jobSettings['OutputGroups'][2]['OutputGroupSettings']['FileGroupSettings']['Destination'] = destinationS3 + '/' + S3KeyThumbnails
        print('jobSettings:')
        print(json.dumps(jobSettings))
        # Convert the video using AWS Elemental MediaConvert
        job = client.create_job(Role=mediaConvertRole, UserMetadata=jobMetadata, Settings=jobSettings)
        print(json.dumps(job, default=str))
    except Exception as e:
        print('Exception: %s' % e)
        statusCode = 500
        raise
    finally:
        return {
            'statusCode': statusCode,
            'body': json.dumps(body),
            'headers': {'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*'}
        }
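
With those changes, the converted files land next to the source object. For example (the bucket and key names here are made up), if the triggering upload is s3://my-bucket/videos/clip.mp4, the handler computes:

destinationS3 = 's3://my-bucket/videos'
# and the three output destinations become:
#   s3://my-bucket/videos/hls
#   s3://my-bucket/videos/watermark
#   s3://my-bucket/videos/thumbnails
# MediaConvert appends each output's name modifier and file extension to these
# prefixes, so the final object names also depend on the settings in job.json.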

Welcome to Stack Overflow!

To give some context on the code here: the source and destination buckets for the data are defined at the top of the script, in the block below. (A leading # is Python's comment marker; I've added comments to help make the code easier to follow.)

# The S3 bucket for the source is pulled in from an event
# (I assume it's when a new object is added to an S3 bucket, which would trigger this script)
sourceS3Bucket = event['Records'][0]['s3']['bucket']['name']
sourceS3Key = event['Records'][0]['s3']['object']['key']
sourceS3 = 's3://'+ sourceS3Bucket + '/' + sourceS3Key
sourceS3Basename = os.path.splitext(os.path.basename(sourceS3))[0]
# The S3 bucket for the destination is pulled in through an environment variable via the os.environ mapping
destinationS3 = 's3://' + os.environ['DestinationBucket']
destinationS3basename = os.path.splitext(os.path.basename(destinationS3))[0]
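
For reference, here's a trimmed-down sketch of the S3 event this handler expects (the bucket and key values are invented; a real event notification carries many more fields):

event = {
    'Records': [{
        's3': {
            'bucket': {'name': 'my-source-bucket'},
            'object': {'key': 'videos/clip.mp4'}
        }
    }]
}
# sourceS3 then becomes 's3://my-source-bucket/videos/clip.mp4'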

There are a few ways to make the change you're looking for. The sample code is written so that it can accept input from any S3 bucket and parse the event to find the relevant file, while the destination bucket appears to be read in from an environment variable defined elsewhere (probably a configuration file).

It looks like each job creates 3 objects (HLS, watermark, and thumbnails), so you'll want to make sure there are no file-name collisions that could cause overwrites (between the source file and the processed files, and between the three different outputs). The way the original code works, each processing job creates its own folder, and each subfolder holds one of the conversion job's three outputs, roughly as sketched below.
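
To illustrate that original layout (the file names here are only indicative; the exact names depend on the name modifiers and extensions MediaConvert appends based on job.json), one source video ends up as something like:

assets/<assetID>/HLS/video.m3u8
assets/<assetID>/MP4/video.mp4
assets/<assetID>/Thumbnails/video.0000000.jpg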

I'd advise against dumping all the files into the S3 bucket with no structure, but if that's how you have things set up, this code block would change from:

S3KeyHLS = 'assets/' + assetID + '/HLS/' + sourceS3Basename
jobSettings['OutputGroups'][0]['OutputGroupSettings']['HlsGroupSettings']['Destination'] = destinationS3 + '/' + S3KeyHLS
S3KeyWatermark = 'assets/' + assetID + '/MP4/' + sourceS3Basename
jobSettings['OutputGroups'][1]['OutputGroupSettings']['FileGroupSettings']['Destination'] = destinationS3 + '/' + S3KeyWatermark
S3KeyThumbnails = 'assets/' + assetID + '/Thumbnails/' + sourceS3Basename
jobSettings['OutputGroups'][2]['OutputGroupSettings']['FileGroupSettings']['Destination'] = destinationS3 + '/' + S3KeyThumbnails

to this (I've appended an extra string to the end of each of the three outputs):

S3KeyHLS = sourceS3 + assetID + 'key'
jobSettings['OutputGroups'][0]['OutputGroupSettings']['HlsGroupSettings']['Destination'] = S3KeyHLS
S3KeyWatermark = sourceS3 + assetID + 'watermark'
jobSettings['OutputGroups'][1]['OutputGroupSettings']['FileGroupSettings']['Destination'] = S3KeyWatermark
S3KeyThumbnails = sourceS3 + assetID + 'thumbnail'
jobSettings['OutputGroups'][2]['OutputGroupSettings']['FileGroupSettings']['Destination'] = S3KeyThumbnails
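
To see what that produces (with made-up names), if sourceS3 is 's3://my-bucket/videos/clip.mp4', the first destination string comes out as:

# 's3://my-bucket/videos/clip.mp4' + assetID + 'key'
# e.g. s3://my-bucket/videos/clip.mp4a1b2c3d4-...key
# note that the source file's extension ends up in the middle of the output prefix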

Again, I don't necessarily recommend this (potential namespace collisions, since your outputs depend on the file name, and unstructured output will be hard to parse). If you can tell me a bit more about your use case, I'd be happy to offer more suggestions.
