How do I get two rows as the result of a for loop in AWS Lambda?



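My code is meant to run on AWS Lambda, but something seems to be wrong with the for loop. I have been trying for a while and cannot figure out why I get only 1 row in the output DataFrame instead of 2.

Here is my code: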

import json
import pandas as pd
import numpy as np
import boto3
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from io import StringIO
result = []
df_test = pd.DataFrame()
url_list = ['https://www.google.com/', 'https://www.google.in/']
for i in url_list:
    def lambda_handler(event, context):
        options = Options()
        options.binary_location = '/opt/headless-chromium'
        options.add_argument('--headless')
        options.add_argument('--no-sandbox')
        options.add_argument('--single-process')
        options.add_argument('--disable-dev-shm-usage')

        driver = webdriver.Chrome('/opt/chromedriver',chrome_options=options)
        driver.get(i)
        title = driver.title
        result.append(title)

#        df_test = pd.DataFrame(np.array(result).reshape(-1, 1))
        df_test = pd.DataFrame(result)
        bucket = 'bucketname'  # already created on S3
        csv_buffer = StringIO()
        df_test.to_csv(csv_buffer)

        s3_resource = boto3.resource('s3')
        s3_resource.Object(bucket, 'df_test.csv').put(Body=csv_buffer.getvalue())
        driver.close();
        driver.quit();

        response = {
            "statusCode": 200,
            "body": "Selenium Headless Chrome Initialized" + title
        }

        return response

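Inside the loop, you re-initialize the DataFrame on every iteration, so you end up with a DataFrame that holds only the last iteration's data: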

df_test = pd.DataFrame(result)
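
A separate factor may also be at play. Assuming the Lambda runtime imports the module once and then calls lambda_handler a single time per event, a def placed inside a for loop leaves only the last definition, and that function looks up i when it runs, not when it was defined. The sketch below reproduces this without Selenium or AWS; handler is an illustrative stand-in for lambda_handler, not part of the original code.

```python
# Minimal reproduction; `handler` is a hypothetical stand-in for lambda_handler.
url_list = ["https://www.google.com/", "https://www.google.in/"]
result = []

for i in url_list:
    # Each pass rebinds the name `handler`; only the last definition survives.
    def handler(event, context):
        # `i` is a module-level name, read when the function is called.
        result.append(i)
        return {"statusCode": 200, "body": i}

# The loop has already finished, so a single call sees only the final `i`.
response = handler({}, None)
print(result)  # one entry: the last URL in url_list
```

If that is what happens in your deployment, the loop over url_list would need to live inside the handler rather than around it.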

You can create an empty DataFrame before the loop and then append df_test to it.

df_temp = pd.DataFrame()  # empty DataFrame, created before the for loop
df_temp = pd.concat([df_temp, df_test], ignore_index=True)  # goes where you currently assign to df_test; stacks rows (axis=1 would add columns instead)
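
As a rough sketch of the same idea, the per-iteration frames can also be collected in a list and concatenated once after the loop. The titles and the column name "title" below are placeholders I made up for illustration, not values from the question.

```python
import pandas as pd

# Placeholder titles standing in for what driver.title would return.
titles = ["placeholder title 1", "placeholder title 2"]

frames = []
for title in titles:
    # One single-row frame per iteration.
    frames.append(pd.DataFrame({"title": [title]}))

# Default axis=0 stacks the frames as rows; ignore_index renumbers them 0..n-1.
df_all = pd.concat(frames, ignore_index=True)
print(df_all.shape)
```

Concatenating once at the end avoids starting from an empty DataFrame and keeps the number of copies down, though for two URLs either form should be fine.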

I corrected a typo; df_temp will then contain both rows. Try it and let me know.
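
If it helps, here is one possible shape for the whole flow with the loop moved inside a single call. It is only a sketch under assumptions: build_csv and fetch_title are hypothetical names, and fetch_title stands in for the Selenium part (driver.get(url) followed by driver.title) so the example runs without Chrome or S3.

```python
import io

import pandas as pd


def build_csv(urls, fetch_title):
    """Visit every URL within one call and return the titles as CSV text.

    `fetch_title` is a hypothetical callable (url -> title); in the Lambda
    it would wrap driver.get(url) and driver.title.
    """
    rows = []
    for url in urls:
        rows.append({"url": url, "title": fetch_title(url)})

    # Build the DataFrame once, after all rows have been collected.
    frame = pd.DataFrame(rows, columns=["url", "title"])
    buffer = io.StringIO()
    frame.to_csv(buffer, index=False)
    return buffer.getvalue()


csv_text = build_csv(
    ["https://www.google.com/", "https://www.google.in/"],
    fetch_title=lambda url: "stub title for " + url,  # stand-in for Selenium
)
print(csv_text)
```

Inside lambda_handler, the returned text could then be passed as Body to the same s3_resource.Object(bucket, 'df_test.csv').put(...) call you already have.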
