How to convert CSV files in S3 to JSON

I need to find the CSV files in a folder,
list all the files in that folder,
and convert the files to JSON and save them in the same bucket.

There are many CSV files, each like the one below:

emp_id,Name,Company
10,Aka,TCS
11,VeI,TCS

My code is below:

import boto3
import pandas as pd

def lambda_handler(event, context):
    s3 = boto3.resource('s3')
    my_bucket = s3.Bucket('testfolder')
    for file in my_bucket.objects.all():
        print(file.key)
        for csv_f in file.key:
            with open(f'{csv_f.replace(".csv", ".json")}', "w") as f:
                pd.read_csv(csv_f).to_json(f, orient='index')

I am not able to save the result. If I remove the bucket name, it saves in the folder instead. How do I save back to the bucket?

You can try the following code:

from io import StringIO

import boto3
import pandas as pd

# Created once per execution environment and reused across invocations.
s3 = boto3.resource('s3')

def lambda_handler(event, context):
    input_bucket = 'bucket-with-csv-file-44244'
    my_bucket = s3.Bucket(input_bucket)

    for file in my_bucket.objects.all():
        if file.key.endswith(".csv"):
            # pandas reads s3:// URLs through fsspec and s3fs.
            csv_f = f"s3://{input_bucket}/{file.key}"
            print(csv_f)

            json_file = file.key.replace(".csv", ".json")
            print(json_file)

            # Serialize to an in-memory buffer, then upload it to the same bucket.
            json_buffer = StringIO()
            df = pd.read_csv(csv_f)
            df.to_json(json_buffer, orient='index')
            s3.Object(input_bucket, json_file).put(Body=json_buffer.getvalue())

Your Lambda layer needs to include:

fsspec
pandas
s3fs
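
If packaging those three libraries into a layer is inconvenient, the same conversion can be sketched with only the standard library plus a boto3 S3 client. This is a minimal sketch and not the answer's method: the function names csv_text_to_index_json and convert_bucket are hypothetical, the files are assumed to be UTF-8 with a header row, and every value stays a string because the csv module does not infer types the way pandas does.

```python
import csv
import io
import json


def csv_text_to_index_json(csv_text):
    """Convert CSV text to a JSON string keyed by row position.

    The shape mirrors pandas' orient='index' output, but all values
    remain strings (no numeric type inference).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = {str(i): row for i, row in enumerate(reader)}
    return json.dumps(rows)


def convert_bucket(s3_client, bucket):
    """Write a .json object next to every .csv object in the bucket.

    s3_client is expected to behave like boto3.client("s3").
    Returns the list of keys that were written.
    """
    written = []
    # The paginator handles buckets with more than 1000 objects.
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        # "Contents" is absent from the page when the bucket is empty.
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if not key.endswith(".csv"):
                continue
            body = s3_client.get_object(Bucket=bucket, Key=key)["Body"].read()
            # Swap only the trailing extension, not every ".csv" in the key.
            json_key = key[: -len(".csv")] + ".json"
            payload = csv_text_to_index_json(body.decode("utf-8"))
            s3_client.put_object(
                Bucket=bucket, Key=json_key, Body=payload.encode("utf-8")
            )
            written.append(json_key)
    return written
```

Inside a Lambda handler this could be called as convert_bucket(boto3.client("s3"), "bucket-with-csv-file-44244"). Passing the client in as an argument keeps the conversion logic testable without AWS access.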
