Why does merging 2 dataframes give me triple the rows?



I have df1:

       x            y no.
0  -17.7    -0.785430  y1
1  -15.0 -3820.085000  y4
2  -12.5     2.138833  y3
..  ....     ........  ..
40  15.6     5.486901  y2
41  19.2     1.980686  y3
42  19.6     9.364718  y2

df2:

       delta y     x
0     0.053884 -17.7
1     0.085000 -15.0
2     0.143237 -12.5
..    ........  ....
40    0.113099  15.6
41    0.102245  19.2
42    0.235282  19.6

Both have 43 rows, and the x column is exactly the same in both.

When I merge them on x, I get a df with 123 rows:

        x            y no.   delta y
0   -17.7    -0.785430  y1  0.053884
1   -15.0 -3820.085000  y4  0.085000
2   -12.5     2.138833  y3  0.143237
3   -12.4     1.721205  y3  0.251180
4   -12.1     2.227343  y2  0.127343
..    ...          ...  ..       ...
118  12.1     1.642526  y3  0.143886
119  14.4  2576.435000  y4  0.171000
120  15.6     5.486901  y2  0.113099
121  19.2     1.980686  y3  0.102245
122  19.6     9.364718  y2  0.235282

Input: final = df1.merge(df2, on="x")

df1.dtypes:

x      float64
y      float64
no.     object
dtype: object

df2.dtypes:

delta y    float64
x          float64
dtype: object

df1 = pd.DataFrame({"x": {0: -17.7, 1: -15.0, 2: -12.5, 3: -12.4, 4: -12.1, 5: -11.2, 6: -8.9, 7: -7.5, 8: -7.5, 9: -6.0, 10: -6.0, 11: -4.7, 12: -4.1, 13: -3.8, 14: -3.4, 15: -3.4, 16: -1.9, 17: -1.5, 18: -1.1, 19: -0.4, 20: -0.1, 21: 3.5, 22: 3.8, 23: 5.3, 24: 5.3, 25: 5.3, 26: 5.3, 27: 5.3, 28: 5.3, 29: 5.3, 30: 5.3, 31: 5.3, 32: 6.4, 33: 6.8, 34: 6.8, 35: 10.2, 36: 10.3, 37: 11.9, 38: 12.1, 39: 14.4, 40: 15.6, 41: 19.2, 42: 19.6}, "y": {0: -0.7854295, 1: -3820.085, 2: 2.1388333, 3: 1.7212046, 4: 2.227343, 5: 0.04315967, 6: -0.9616607, 7: -1.9878536, 8: -0.52237016, 9: -283.27216, 10: -282.5332, 11: -0.4335017, 12: -1.1585577, 13: -0.008831219, 14: 848.92303, 15: -57.407845, 16: -9.010686, 17: -3.2473037, 18: 0.5536767, 19: 1.8351307, 20: 4.8347697, 21: -6.45842, 22: -0.9338831, 25: 97.65833, 26: 1.6500127, 27: 1.6500127, 28: 97.65833, 29: 97.65833, 30: 1.65500127, 31: 1.9058462, 34: 227.5592, 35: 857.7455, 36: -0.68584794, 37: 1.6785516, 38: 1.6425261, 39: 2576.435, 40: 5.4869013, 41: 1.9806856, 42: 9.364718}, "no.": {0: "y1", 1: "y4", 2: "y3", 3: "y3", 4: "y2", 5: "y3", 6: "y2", 7: "y2", 8: "y2", 9: "y4", 10: "y4", 11: "y1", 12: "y3", 13: "y1", 14: "y4", 15: "y4", 16: "y4", 17: "y4", 18: "y1", 19: "y3", 20: "y4", 21: "y2", 22: "y3", 23: "y3", 24: "y3", 25: "y4", 26: "y3", 27: "y3", 28: "y4", 29: "y3", 30: "y4", 31: "y4", 32: "y2", 33: "y3", 34: "y3", 35: "y4", 36: "y3", 37: "y3", 38: "y3", 39: "y4", 40: "y2", 41: "y3", 42: "y2"}})

df2 = pd.DataFrame({'delta y': {0: 0.05388353000000001, 1: 0.08500000000003638, 2: 0.1432367999999999994, 3: 0.251179999999999999996, 4: 0.12734299999999976, 5: 0.36285006000000003, 6: 0.13833930000000005, 7: 0.512141464, 8: 1.9776299999884, 9: 0.272159999999999853, 10: 0.4667999999999779, 11: 0.2692114, 12: 0.00890970000000002, 13: 0.314458351, 14: 906.34703, 15: 0.016154999999999999777, 16: 0.3723036999999998, 18: 0.2988478, 19: 0.006991300000000145, 20: 0.14423030000000026, 21: 0.0415799999999999973, 22: 0.013554200000000183, 23: 0.174865600000000183, 23: 0.1748652000000000007, 24: 0.17486560000000007, 25: 0.038669999999999999621, 26: 0.541264, 27: 0.541264, 28: 0.0386699999999999621, 29: 96.5495813, 30: 96.0469873, 31: 0.0386699999999999621, 32: 0.05542200000000008, 33: 0.1670513, 34: 225.82040510000002, 35: 0.38250000000005, 36: 0.9580486, 37: 0.10641100000000002, 38: 0.14388610000000002, 39: 0.17099999999999992174, 40: 0.113098699999999999999922, 41: 0.1022448999999999999977}, 'x': {0: -17.7, 1: -15.0, 2: -12.5, 3: -12.4, 4: -12.1, 5: -11.2, 6: -8.9, 7: -7.5, 8: -7.5, 9: -6.0, 10: -6.0, 11: -4.7, 12: -4.1, 13: -3.8, 14: -3.4, 15: -3.4, 16: -1.9, 17: -1.5, 18: -1.1, 19: -0.4, 20: -0.1, 21: 3.5, 22: 3.8, 23: 5.3, 24: 5.3, 25: 5.3, 26: 5.3, 27: 5.3, 28: 5.3, 29: 5.3, 30: 5.3, 31: 5.3, 32: 6.4, 33: 6.8, 34: 6.8, 35: 10.2, 36: 10.3, 37: 11.9, 38: 12.1, 39: 14.4, 40: 15.6, 41: 19.2, 42: 19.6}})

final = df1.merge(df2, on="x")

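The problem is that the x values are not unique, so merge repeats rows to produce every combination of matching keys. In a simple example: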

>>> import pandas as pd
>>> df1=pd.DataFrame({"a":[1,2,3,2], "b":['a', 'b', 'c', 'd']})
>>> df2=pd.DataFrame({"a":[1,2,3,2], "c":['aa', 'bb', 'cc', 'dd']})
>>> df1.merge(df2, on='a')
   a  b   c
0  1  a  aa
1  2  b  bb
2  2  b  dd
3  2  d  bb
4  2  d  dd
5  3  c  cc

The value 2 is not unique in column a, so you get all the combinations (note that b and d are each paired with both bb and dd).
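The same arithmetic accounts for the 123 rows in the question: 26 x values occur once, four values (-7.5, -6.0, -3.4, 6.8) occur twice and contribute 4 rows each, and 5.3 occurs nine times and contributes 81 rows, so 26 + 16 + 81 = 123. A minimal sketch of how one might predict and detect this, reusing the toy frames above; the variable names are illustrative only:

```python
import pandas as pd

df1 = pd.DataFrame({"a": [1, 2, 3, 2], "b": ["a", "b", "c", "d"]})
df2 = pd.DataFrame({"a": [1, 2, 3, 2], "c": ["aa", "bb", "cc", "dd"]})

# How often each key occurs on each side.
left_counts = df1["a"].value_counts()
right_counts = df2["a"].value_counts()

# An inner merge emits left_count * right_count rows per shared key.
expected_rows = int((left_counts * right_counts).dropna().sum())
merged = df1.merge(df2, on="a")
print(expected_rows, len(merged))

# validate= makes pandas raise instead of silently multiplying rows.
try:
    df1.merge(df2, on="a", validate="one_to_one")
    duplicates_detected = False
except pd.errors.MergeError:
    duplicates_detected = True
print(duplicates_detected)
```

Passing validate="one_to_one" is a cheap guard whenever a merge is expected to keep the row count unchanged.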

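In your example, the x column is identical in both dataframes, row for row. That also means the two frames share the same index, so you can simply assign the column you want to df1; pandas aligns the assignment on the index: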

df1["delta y"] = df2["delta y"]
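The assignment is only safe if both frames really do share the same row order and index. A minimal sketch, using small made-up frames rather than the question's data, that checks this first and also shows an alternative that still goes through merge: number the repeats of each x so that the pair (x, occurrence) becomes unique. The helper column name occurrence is hypothetical.

```python
import pandas as pd

df1 = pd.DataFrame({"x": [-7.5, -7.5, 5.3], "y": [1.0, 2.0, 3.0]})
df2 = pd.DataFrame({"delta y": [0.1, 0.2, 0.3], "x": [-7.5, -7.5, 5.3]})

# Option 1: index-aligned assignment, guarded by a check on the keys.
assert df1["x"].equals(df2["x"])
assigned = df1.copy()
assigned["delta y"] = df2["delta y"]

# Option 2: number the repeats of each x so (x, occurrence) is unique.
left = df1.assign(occurrence=df1.groupby("x").cumcount())
right = df2.assign(occurrence=df2.groupby("x").cumcount())
merged = left.merge(right, on=["x", "occurrence"]).drop(columns="occurrence")

print(len(assigned), len(merged))
```

Option 2 pairs the first -7.5 on the left with the first -7.5 on the right, and so on, which is only meaningful when the duplicates appear in the same order in both frames.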

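Try the following: df1.join(df2["delta y"]) (joining the whole of df2 would raise an error, because both frames contain an x column and no suffix is given)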

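join is, by default, a column-wise left join on the index

pd.merge is, by default, a column-wise inner join on the key columns

pd.concat is, by default, a row-wise outer join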

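pd.concat: takes an iterable argument, so it cannot be given DataFrames directly (use [df, df2]). The dimensions of the DataFrames should match along the axis.

join and pd.merge: can take DataFrame arguments directly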
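To make these defaults concrete, here is a small sketch with made-up frames whose keys only partly overlap. The data and variable names are illustrative and are not taken from the question.

```python
import pandas as pd

left = pd.DataFrame({"key": ["K0", "K1", "K2"], "A": ["A0", "A1", "A2"]})
right = pd.DataFrame({"key": ["K1", "K2", "K3"], "B": ["B1", "B2", "B3"]})

# DataFrame.join: index-based, how="left" by default.
joined = left.set_index("key").join(right.set_index("key"))

# pd.merge: column-based, how="inner" by default.
merged = pd.merge(left, right, on="key")

# pd.concat: takes a list of frames, stacks rows (axis=0), join="outer" by default.
stacked = pd.concat([left, right])

print(joined.shape, merged.shape, stacked.shape)
```

The left join keeps all three left keys (K0 gets NaN for B), the inner merge keeps only K1 and K2, and concat stacks all six rows while filling the columns missing on each side with NaN.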

ref: Merging two dataframes

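Try the following syntax. I also suggest reading the official documentation carefully; the link is at the bottom. I suspect the x values in df1 and df2 are not 100% identical, possibly because of differences in the decimals.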

import pandas as pd
left = pd.DataFrame(
    {
        "key": ["K0", "K1", "K2", "K3"],
        "A": ["A0", "A1", "A2", "A3"],
        "B": ["B0", "B1", "B2", "B3"],
    }
)

right = pd.DataFrame(
    {
        "key": ["K0", "K1", "K2", "K3"],
        "C": ["C0", "C1", "C2", "C3"],
        "D": ["D0", "D1", "D2", "D3"],
    }
)

result = pd.merge(left, right, on="key")
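The suspicion about decimals can be tested directly. The sketch below uses made-up values, and the choice of 6 decimal places is an arbitrary assumption. Note that float keys that differ slightly would make an inner merge drop rows rather than add them, so this check is a way to rule the idea in or out, not an explanation of the extra rows.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"x": [0.1 + 0.2, 1.5], "y": [1.0, 2.0]})
df2 = pd.DataFrame({"x": [0.3, 1.5], "delta y": [0.01, 0.02]})

# Exact comparison sees 0.1 + 0.2 and 0.3 as different keys.
exact_match = (df1["x"] == df2["x"]).all()
# Tolerant comparison treats them as the same value.
close_match = np.isclose(df1["x"], df2["x"]).all()
print(exact_match, close_match)

# Rounding both key columns (6 decimals is an arbitrary choice) restores the match.
rounded = df1.assign(x=df1["x"].round(6)).merge(
    df2.assign(x=df2["x"].round(6)), on="x"
)
print(len(df1.merge(df2, on="x")), len(rounded))
```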

Python merge, join, concatenate official guide
