Pandas error: Can only compare identically-labeled Series objects



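I don't know why I'm getting an error on code that I have run many times before.

To give you some context, I'm working with a DataFrame that looks like this: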

User_ID  Name    Boss
1        Paul    John
2        Laura   Maria
3        Claire  John

A simple example that reproduces the error:

In [72]: df=pd.DataFrame(data=[1,2,3],index=[4,6,8],columns=['ID'])
In [73]: df
Out[73]: 
   ID
4   1
6   2
8   3
In [74]: df1=pd.DataFrame(data=[1,2,3],index=[4,6,8],columns=['ID1'])

A difference in column names doesn't matter:

In [75]: df1['ID1']==df['ID']
Out[75]: 
4    True
6    True
8    True
dtype: bool

A difference in the index does:

In [76]: df2=pd.DataFrame(data=[1,2,3],index=[4,6,9],columns=['ID1'])
In [77]: df2['ID1']==df['ID']
Traceback (most recent call last):
  File "<ipython-input-77-d3d602ee9ac7>", line 1, in <module>
    df2['ID1']==df['ID']
  File "/usr/local/lib/python3.8/dist-packages/pandas/core/ops/common.py", line 65, in new_method
    return method(self, other)
  File "/usr/local/lib/python3.8/dist-packages/pandas/core/arraylike.py", line 29, in __eq__
    return self._cmp_method(other, operator.eq)
  File "/usr/local/lib/python3.8/dist-packages/pandas/core/series.py", line 4973, in _cmp_method
    raise ValueError("Can only compare identically-labeled Series objects")
ValueError: Can only compare identically-labeled Series objects

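If you look at the traceback, you'll see that this is a pandas error, not a numpy error! The `==` operator on two Series requires both to carry the same index labels; here one index ends in 8 and the other in 9.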
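To see which labels are responsible before comparing, the two indexes can be inspected directly. This is only a minimal sketch that rebuilds `df` and `df2` from the session above; the variable names `same_labels`, `mismatched` and `raised` are mine, not part of the original code.

```python
import pandas as pd

df = pd.DataFrame(data=[1, 2, 3], index=[4, 6, 8], columns=['ID'])
df2 = pd.DataFrame(data=[1, 2, 3], index=[4, 6, 9], columns=['ID1'])

# Index.equals is True only when both indexes hold the same labels in the same order.
same_labels = df2['ID1'].index.equals(df['ID'].index)

# Labels that appear in only one of the two indexes.
mismatched = df2.index.symmetric_difference(df.index)

# The == operator refuses to compare Series whose labels differ.
try:
    df2['ID1'] == df['ID']
    raised = False
except ValueError:
    raised = True

print(same_labels)
print(list(mismatched))
print(raised)
```

If `same_labels` is False, `mismatched` shows which rows exist on only one side, which is usually the quickest way to find out what changed in data that used to compare cleanly.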

Extracting the arrays from the Series discards the index, so the comparison is done purely by position:

In [78]: df2['ID1'].to_numpy()==df['ID'].to_numpy()
Out[78]: array([ True,  True,  True])
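If you would rather stay in pandas, two alternatives may be worth considering. The sketch below again rebuilds `df` and `df2` from the session above; `by_position` and `by_label` are names I made up for illustration. Which variant is appropriate depends on whether the rows are meant to be matched by position or by label.

```python
import pandas as pd

df = pd.DataFrame(data=[1, 2, 3], index=[4, 6, 8], columns=['ID'])
df2 = pd.DataFrame(data=[1, 2, 3], index=[4, 6, 9], columns=['ID1'])

# Positional comparison: drop both indexes so the labels match (0, 1, 2).
by_position = df2['ID1'].reset_index(drop=True) == df['ID'].reset_index(drop=True)

# Label-based comparison: Series.eq aligns the two Series on the union of
# their labels; a label that exists on only one side compares as False.
by_label = df2['ID1'].eq(df['ID'])

print(by_position)
print(by_label)
```

Note that the positional variants, including `to_numpy()`, only make sense when both Series have the same length and the same row order, whereas `eq` reports unmatched labels as False instead of raising.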