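I don't know why I'm getting an error on code that I have run many times before.
To give you some context, I'm working with a DataFrame that looks like this: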
User_ID | Name | Boss |
---|---|---|
1 | Paul | John |
2 | Laura | Maria |
3 | Claire | John |
A simple example that produces the error:
In [72]: df=pd.DataFrame(data=[1,2,3],index=[4,6,8],columns=['ID'])
In [73]: df
Out[73]:
ID
4 1
6 2
8 3
In [74]: df1=pd.DataFrame(data=[1,2,3],index=[4,6,8],columns=['ID1'])
A difference in the column names does not matter:
In [75]: df1['ID1']==df['ID']
Out[75]:
4 True
6 True
8 True
dtype: bool
A difference in the index does matter:
In [76]: df2=pd.DataFrame(data=[1,2,3],index=[4,6,9],columns=['ID1'])
In [77]: df2['ID1']==df['ID']
Traceback (most recent call last):
File "<ipython-input-77-d3d602ee9ac7>", line 1, in <module>
df2['ID1']==df['ID']
File "/usr/local/lib/python3.8/dist-packages/pandas/core/ops/common.py", line 65, in new_method
return method(self, other)
File "/usr/local/lib/python3.8/dist-packages/pandas/core/arraylike.py", line 29, in __eq__
return self._cmp_method(other, operator.eq)
File "/usr/local/lib/python3.8/dist-packages/pandas/core/series.py", line 4973, in _cmp_method
raise ValueError("Can only compare identically-labeled Series objects")
ValueError: Can only compare identically-labeled Series objects
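If you read the traceback, you'll see that this is a pandas error, not a numpy error: pandas refuses to compare two Series with `==` unless their index labels are identical.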
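Before comparing, it can help to check whether the two indexes really carry the same labels. The following is a minimal sketch that rebuilds the two columns from the example as standalone Series (the variable names `s`, `s2`, `same_labels`, and `mismatched` are illustrative, not from the original code):

```python
import pandas as pd

# Same values and labels as df['ID'] and df2['ID1'] in the example above
s = pd.Series([1, 2, 3], index=[4, 6, 8], name='ID')
s2 = pd.Series([1, 2, 3], index=[4, 6, 9], name='ID1')

# True only if both indexes hold the same labels in the same order
same_labels = s2.index.equals(s.index)

# Labels that appear in exactly one of the two indexes
mismatched = s2.index.symmetric_difference(s.index)

print(same_labels)
print(list(mismatched))
```

If `same_labels` is false, `==` between the two Series raises the ValueError shown in the traceback, and `mismatched` points at the labels responsible.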
Extracting the arrays from the Series discards the index, so the comparison is done purely by position:
In [78]: df2['ID1'].to_numpy()==df['ID'].to_numpy()
Out[78]: array([ True, True, True])
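If you would rather stay in pandas, two other options may be worth considering, depending on whether you want to compare by position or by label. This is a sketch under the assumption that the frames are the `df` and `df2` from the example; the names `by_position` and `by_label` are illustrative:

```python
import pandas as pd

df = pd.DataFrame(data=[1, 2, 3], index=[4, 6, 8], columns=['ID'])
df2 = pd.DataFrame(data=[1, 2, 3], index=[4, 6, 9], columns=['ID1'])

# Positional comparison: give both Series a fresh 0..n-1 index first,
# so the labels are identical and == no longer raises
by_position = df2['ID1'].reset_index(drop=True) == df['ID'].reset_index(drop=True)

# Label-based comparison: Series.eq aligns the two indexes (outer join);
# a label missing on either side compares as False
by_label = df2['ID1'].eq(df['ID'])

print(by_position)
print(by_label)
```

Note that the two results answer different questions: `by_position` has three rows, while `by_label` has one row per label found in either index (4, 6, 8, 9), with the unmatched labels 8 and 9 reported as False.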