I'm trying to merge only certain columns from two DataFrames, but I get the error shown in the title.
import pandas as pd
planilha = pd.read_excel('campanha.xlsx', None)
pastas = list(planilha.keys())
campanha0 = pastas[0]
tabela = planilha['pluma_espuma_1'].merge(planilha['campanha_anexo3_mar_0'][['subprojet', 'Campanha', 'Pernada']], left_on='ParentGlobalID', right_on='GlobalID', how='left')
print(tabela.head())
But I get the following error:
Traceback (most recent call last):
File "D:DADOSPythontrabalhosLogistica_rrdmteste_excel_pd.py", line 19, in <module>
tabela = planilha['pluma_espuma_1'].merge(planilha['campanha_anexo3_mar_0'][['subprojet', 'Campanha', 'Pernada']], left_on='ParentGlobalID', right_on='GlobalID', how='left')
File "C:\Users\Aroldo\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\frame.py", line 8195, in merge
return merge(
File "C:\Users\Aroldo\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\reshape\merge.py", line 74, in merge
op = _MergeOperation(
File "C:\Users\Aroldo\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\reshape\merge.py", line 668, in __init__
) = self._get_merge_keys()
File "C:\Users\Aroldo\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\reshape\merge.py", line 1033, in _get_merge_keys
right_keys.append(right._get_label_or_level_values(rk))
File "C:\Users\Aroldo\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\generic.py", line 1684, in _get_label_or_level_values
raise KeyError(key)
KeyError: 'GlobalID'
About my data: I read it from Excel as shown above, but built directly as DataFrames my input looks exactly like this:
import pandas as pd
pluma_espuma_1 = pd.DataFrame({'ObjectID': [3, 4],
'GlobalID': ['a431fd6a-24f6-436e-a3b4-e7d8b44c80a3', 'b5ad25e8-9c99-40fd-b838-4127c6457f59'],
'Data': ['21/04/2021', '01/05/2021'],
'Estacao': [1500, 5500],
'ParentGlobalID': ['29aaebfa-67bb-4395-9d72-5aa19fcda267', 'e610b5e0-bf10-4239-90bb-3d2099e009e0']})
campanha_anexo3_mar_0 = pd.DataFrame({'ObjectID': [2, 3],
'GlobalID': ['29aaebfa-67bb-4395-9d72-5aa19fcda267', 'e610b5e0-bf10-4239-90bb-3d2099e009e0'],
'subproject': ['Marinho - Integrado', 'Dulcicola'],
'Campanha': [22, 23],
'Pernada': [1, 1],
'Creator': ['Ed_tty', 'Haruald']})
Do you know the correct way to do this? Thanks.
[Image: data sample]
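The KeyError is raised because the right-hand frame you pass to merge contains only the three selected columns, so the 'GlobalID' key named in right_on is no longer there. Include 'GlobalID' in the selection and the merge works. You can play around with the column names to get the result the way you want, but this merges the two: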
pluma_espuma_1.merge(campanha_anexo3_mar_0[['GlobalID', 'subproject', 'Campanha', 'Pernada']], left_on='ParentGlobalID', right_on='GlobalID', how='inner')
   ObjectID                            GlobalID_x        Data  Estacao                        ParentGlobalID                            GlobalID_y           subproject  Campanha  Pernada
0         3  a431fd6a-24f6-436e-a3b4-e7d8b44c80a3  21/04/2021     1500  29aaebfa-67bb-4395-9d72-5aa19fcda267  29aaebfa-67bb-4395-9d72-5aa19fcda267  Marinho - Integrado        22        1
1         4  b5ad25e8-9c99-40fd-b838-4127c6457f59  01/05/2021     5500  e610b5e0-bf10-4239-90bb-3d2099e009e0  e610b5e0-bf10-4239-90bb-3d2099e009e0            Dulcicola        23        1
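If you would rather not end up with both GlobalID_x and GlobalID_y, one option is to rename the right-hand key to match the left-hand one before merging. The following is only a sketch built on the sample frames from the question: the names wanted, right and reproduced_error are made up for illustration, and it uses how='left' as in the original attempt. It first reproduces the KeyError, then does the merge with the key included.

```python
import pandas as pd

# Sample frames copied from the question.
pluma_espuma_1 = pd.DataFrame({
    'ObjectID': [3, 4],
    'GlobalID': ['a431fd6a-24f6-436e-a3b4-e7d8b44c80a3', 'b5ad25e8-9c99-40fd-b838-4127c6457f59'],
    'Data': ['21/04/2021', '01/05/2021'],
    'Estacao': [1500, 5500],
    'ParentGlobalID': ['29aaebfa-67bb-4395-9d72-5aa19fcda267', 'e610b5e0-bf10-4239-90bb-3d2099e009e0']})
campanha_anexo3_mar_0 = pd.DataFrame({
    'ObjectID': [2, 3],
    'GlobalID': ['29aaebfa-67bb-4395-9d72-5aa19fcda267', 'e610b5e0-bf10-4239-90bb-3d2099e009e0'],
    'subproject': ['Marinho - Integrado', 'Dulcicola'],
    'Campanha': [22, 23],
    'Pernada': [1, 1],
    'Creator': ['Ed_tty', 'Haruald']})

# Columns wanted from the right-hand frame (illustrative name).
wanted = ['subproject', 'Campanha', 'Pernada']

# Reproduce the problem: the subset has no 'GlobalID', so right_on fails.
reproduced_error = False
try:
    pluma_espuma_1.merge(campanha_anexo3_mar_0[wanted],
                         left_on='ParentGlobalID', right_on='GlobalID', how='left')
except KeyError as err:
    reproduced_error = True
    print('KeyError:', err)

# Fix: keep the key in the subset and rename it to match the left key,
# so the result carries a single key column and no _x/_y suffixes.
right = campanha_anexo3_mar_0[['GlobalID'] + wanted].rename(
    columns={'GlobalID': 'ParentGlobalID'})
tabela = pluma_espuma_1.merge(right, on='ParentGlobalID', how='left')
print(tabela.columns.tolist())
```

With how='left' every row of pluma_espuma_1 is kept even when no parent matches; switch to how='inner' if unmatched rows should be dropped.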