Merging on two columns with two different tolerances in pandas



My input:

import pandas as pd

df1=pd.DataFrame(
{
'A':['my ','fire','water','earth','monkey'],
'B':[1,5,7,8,9],
'C':[100,105,110,182,140]
})
print(df1)
        A  B    C
0     my   1  100
1    fire  5  105
2   water  7  110
3   earth  8  182
4  monkey  9  140
df2=pd.DataFrame(
{
'A':['drop','hold','push','pull','keep'],
'B':[1,4,4,10,10],
'C':[103,102,133,124,142]
})
print(df2)
      A   B    C
0  drop   1  103
1  hold   4  102
2  push   4  133
3  pull  10  124
4  keep  10  142

I want to merge these two DataFrames (df1 and df2) using pd.merge_asof() or any other approach.

I can merge them on one column with a single tolerance: df = pd.merge_asof(df1, df2, on='B', direction='nearest', tolerance=2)

But I need to apply two separate tolerances, one on column B and one on column C: B_tol=2, C_tol=4

Expected output:

      A_x  B_x  C_x   A_y  B_y  C_y
0      my    1  100  drop    1  103
1    fire    5  105  hold    4  102
2  monkey    9  140  keep   10  142

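When I actually merge on these columns, the result would contain only one B column and one C column; the output above is just an example of how the matching should work.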

Can we discuss this solution? I did not want to paste it into a comment.

df1=pd.DataFrame(
{
'A':['my ','fire','water','earth','monkey'],
'B':[1,5,7,8,9],
'C':[100,105,110,182,140]
})
df2=pd.DataFrame(
{
'A':['drop','hold','push','pull','keep'],
'B':[1,4,4,10,10],
'C':[103,102,133,124,142]
})
df_1 = pd.merge_asof(df1,df2,on='B',direction='nearest',tolerance=2)
df_2 = pd.merge_asof(df1.sort_values(by='C'),df2.sort_values(by='C'),on='C',direction='nearest',tolerance=4)
df_3 = pd.merge_asof(df2,df1,on='B',direction='nearest',tolerance=2)
df_4 = pd.merge_asof(df2.sort_values(by='C'),df1.sort_values(by='C'),on='C',direction='nearest',tolerance=4)
# df= pd.merge_asof(df3.sort_values(by='C_x'),df4.sort_values(by='C'),left_on='C_x',right_on='C',direction='nearest',tolerance=4).dropna()
df_12 = pd.merge(df_1,df_2,on='A_x').dropna()
df_34 = pd.merge(df_3,df_4,on='A_x').dropna()
print(df_12)
      A_x  B  C_x A_y_x    C_y  B_x    C A_y_y   B_y
0     my   1  100  drop  103.0    1  100  hold   4.0
1    fire  5  105  push  133.0    5  105  drop   1.0
4  monkey  9  140  pull  124.0    9  140  keep  10.0
print(df_34)
    A_x   B  C_x   A_y_x  C_y  B_x    C   A_y_y  B_y
0  drop   1  103     my   100    1  103    fire  5.0
1  hold   4  102    fire  105    4  102     my   1.0
4  keep  10  142  monkey  140   10  142  monkey  9.0
df = pd.merge(df_12,df_34,left_index=True,right_index=True)
print(df[['A_x_x','A_x_y']])
    A_x_x A_x_y
0     my   drop
1    fire  hold
4  monkey  keep
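
For comparison, here is a minimal sketch of a different technique: a cross join followed by a filter on both tolerances, instead of chaining merge_asof calls. It assumes the frames are small enough to hold every df1/df2 pairing in memory, and that pandas is 1.2 or later (needed for how="cross"). The names B_TOL, C_TOL, pairs and result are made up for this example and are not part of the original post.

```python
import pandas as pd

df1 = pd.DataFrame({
    'A': ['my ', 'fire', 'water', 'earth', 'monkey'],
    'B': [1, 5, 7, 8, 9],
    'C': [100, 105, 110, 182, 140],
})
df2 = pd.DataFrame({
    'A': ['drop', 'hold', 'push', 'pull', 'keep'],
    'B': [1, 4, 4, 10, 10],
    'C': [103, 102, 133, 124, 142],
})

# Illustrative tolerance names (assumed, taken from B_tol / C_tol in the question).
B_TOL = 2
C_TOL = 4

# Pair every df1 row with every df2 row; shared column names get _x / _y suffixes.
pairs = df1.merge(df2, how='cross', suffixes=('_x', '_y'))

# Keep only the pairs that satisfy both tolerances at the same time.
within_b = (pairs['B_x'] - pairs['B_y']).abs() <= B_TOL
within_c = (pairs['C_x'] - pairs['C_y']).abs() <= C_TOL
result = pairs[within_b & within_c].reset_index(drop=True)

print(result)
```

On the sample data this keeps the three pairs from the expected output (my/drop, fire/hold, monkey/keep). Unlike merge_asof, the filter keeps every pair that is within both tolerances, so a df1 row can appear more than once on other data; if only the closest match is wanted, an extra sort and drop_duplicates step on A_x would be needed.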
