Cannot save multiple DataFrames to a single CSV; only the last df is saved



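I am trying to scrape multiple tables from a single web page, but I cannot save them all to one .csv file. Only the last table gets saved. My code is below; please advise.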

import time
from selenium import webdriver
import pandas as pd
base_url = 'https://uk.insight.com/en_GB/shop/product/2W1F2EA%23ABU/HEWLETT-PACKARD-(HP-INC)/2W1F2EA%23ABU/HP-ProBook-440-G8--14"--Core-i7-1165G7--16-GB-RAM--1-TB-SSD--UK/'
print('Opening Chrome Browser Automatically in 5 secs')
time.sleep(5)
options = webdriver.ChromeOptions()
options.add_experimental_option("detach", True)
driver = webdriver.Chrome(options=options)
driver.get(base_url)
df = pd.read_html(driver.page_source)
df2 = df[4:]
for table in df2:
    df = pd.DataFrame(table)
    df.to_csv('table.csv',index=False)

I don't know how to save all of the DataFrames to a single .csv file. As described above, only the last df is saved.

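According to the pandas DataFrame.to_csv() documentation, you can use the mode parameter to append data instead of overwriting it. The default is "w", so every call in your loop reopens table.csv in write mode and replaces whatever the previous iteration wrote. That is why only the last table survives.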

If you want to append data, switch the mode to "a":

df.to_csv('table.csv', mode='a', index=False)

One thing to note: the column names will be appended as well, unless you set header=False.
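To make the header caveat concrete, here is a minimal sketch that appends the same frame twice without header=False and then inspects the raw lines of the file. The column names and the demo.csv file name are made up for illustration, and the file is written to a temporary directory.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data; any DataFrame behaves the same way.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Write into a temporary directory so no existing file is touched.
path = os.path.join(tempfile.mkdtemp(), "demo.csv")

df.to_csv(path, mode="w", index=False)  # writes the header line "a,b"
df.to_csv(path, mode="a", index=False)  # header defaults to True, so "a,b" is written again

with open(path) as f:
    lines = f.read().splitlines()

print(lines)
# ['a,b', '1,3', '2,4', 'a,b', '1,3', '2,4']
```

Reading such a file back with pd.read_csv would treat the second "a,b" line as a data row, which is why the append call usually passes header=False.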

Here is a quick reproducible example.

import uuid
import pandas as pd
dataframe = pd.DataFrame({
    "person_id": [str(uuid.uuid4())[:7] for _ in range(6)],
    "hours_worked": [38.5, 41.25, "35.0", 27.75, 22.25, -20.5],
    "wage_per_hour": [15.1, 15, 21.30, 17.5, 19.50, 25.50],
})

dataframe2 = pd.DataFrame({
    "person_id2": [str(uuid.uuid4())[:7] for _ in range(6)],
    "hours_worked2": [38.5, 41.25, "35.0", 27.75, 22.25, -20.5],
    "wage_per_hour2": [15.1, 15, 21.30, 17.5, 19.50, 25.50],
})
dataframe.to_csv('TEST.csv', mode='w', index=False)
dataframe2.to_csv('TEST.csv', mode='a', index = False, header=False)
print(pd.read_csv('TEST.csv'))

Output

   person_id  hours_worked  wage_per_hour
0    1aa66bc         38.50           15.1
1    b7abe05         41.25           15.0
2    15e1779         35.00           21.3
3    3c117d7         27.75           17.5
4    2e6494e         22.25           19.5
5    2a25e45        -20.50           25.5
6    b17d084         38.50           15.1
7    6ca361e         41.25           15.0
8    2cd18e4         35.00           21.3
9    9d120ff         27.75           17.5
10   a0b20d9         22.25           19.5
11   bf9a98d        -20.50           25.5
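
Applied to the loop in the question, the same idea could be sketched roughly as follows. The helper name save_tables_to_csv, its repeat_header flag, and the stand-in frames are assumptions for illustration; in the question, the list of tables would be df[4:] from pd.read_html. The first table is written with mode="w" so the file starts fresh, and every later table is appended with mode="a".

```python
import os
import tempfile

import pandas as pd


def save_tables_to_csv(tables, path, repeat_header=True):
    """Write every DataFrame in `tables` to one CSV file, one after another."""
    for i, table in enumerate(tables):
        first = i == 0
        table.to_csv(
            path,
            mode="w" if first else "a",     # overwrite once, then append
            header=first or repeat_header,  # the first table always gets a header
            index=False,
        )


# Stand-in data for the scraped tables (hypothetical values).
tables = [
    pd.DataFrame({"spec": ["CPU", "RAM"], "value": ["Core i7", "16 GB"]}),
    pd.DataFrame({"spec": ["Storage"], "value": ["1 TB SSD"]}),
]

out_path = os.path.join(tempfile.mkdtemp(), "table.csv")
save_tables_to_csv(tables, out_path, repeat_header=False)

result = pd.read_csv(out_path)
print(result)
```

If the scraped tables do not share the same columns, leaving repeat_header=True keeps each table's own header line in the file, at the cost of the file no longer reading back as one clean table.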
