What is the equivalent of 'fillna()' for dtype 'Int32'?

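Short question: how do I set values that are <1 or <NA> to 1?

Long question: suppose I have a plain int (int32!) pandas column. I used to be able to do this to enforce a minimum value: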

>>> shots = pd.DataFrame([2, 0, 1], index=['foo', 'bar', 'baz'], columns=['shots'], dtype='int32')
>>> shots
     shots
foo      2
bar      0
baz      1
>>> max(shots.loc['foo', 'shots'], 1)
2
>>> max(shots.loc['bar', 'shots'], 1)
1

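So far, so good. Now suppose the dtype of the shots column changes from 'int32' to 'Int32', which allows <NA>. That gets me into trouble when I access an <NA> record. I get this error: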

>>> shots = pd.DataFrame([2, np.nan, 1], index=['foo', 'bar', 'baz'], columns=['shots'], dtype='Int32')
>>> shots
     shots
foo      2
bar   <NA>
baz      1
>>> max(shots.loc['bar', 'shots'], 1)
TypeError: boolean value of NA is ambiguous

What should I do?

My first instinct was to say, "OK, let's fill the values and then apply max()." But that fails too:

>>> shots.loc[idx, 'shots'].fillna(1)
AttributeError: 'NAType' object has no attribute 'fillna'

--> What is the most generic/vectorized way to apply a condition to <NA> values, i.e. set all <NA> to 1, or apply some other basic operation such as max(<NA>, 1)?

Versions

Python 3.8.6
pandas 1.2.3
numpy 1.19.2

idx should be a collection (for example, a list). If it is a scalar, .loc returns a scalar value rather than a Series:

# idx = 'bar'
>>> shots.loc[idx, 'shots']
<NA>
>>> shots.loc[idx, 'shots'].fillna(1)
...
AttributeError: 'NAType' object has no attribute 'fillna'
>>> shots.loc[[idx], 'shots'].fillna(1)
bar    1
Name: shots, dtype: Int32

So the question is: how is idx defined?
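If idx really does have to be a scalar, one option is to check for NA explicitly before calling max(). The following is only a minimal sketch: the helper name at_least_one is made up for illustration and is not part of pandas, and the data mirrors the example from the question.

```python
import pandas as pd

# Same data as in the question, using the nullable Int32 dtype.
shots = pd.DataFrame(
    {"shots": [2, pd.NA, 1]},
    index=["foo", "bar", "baz"],
    dtype="Int32",
)


def at_least_one(value):
    """Hypothetical helper: return 1 for pd.NA or for anything below 1."""
    # pd.isna() accepts scalars, so it works where .fillna() is unavailable.
    if pd.isna(value):
        return 1
    return max(value, 1)


# Scalar label: .loc returns a single value (pd.NA here), not a Series.
scalar_result = at_least_one(shots.loc["bar", "shots"])

# List label: .loc returns a Series, so .fillna() is available.
list_result = shots.loc[["bar"], "shots"].fillna(1)
```

Either way, the key point is the same as above: fillna() is a Series/DataFrame method, so it only exists when the selection is not reduced to a scalar.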

Old answer

I can't reproduce your problem.

>>> shots = pd.DataFrame({'shots': [2, 1, pd.NA]}, dtype=pd.Int32Dtype())
>>> idx = [2]
>>> shots
   shots
0      2
1      1
2   <NA>
>>> shots.dtypes
shots    Int32
dtype: object
>>> shots.loc[idx, 'shots'].fillna(1)
2    1
Name: shots, dtype: Int32

Versions:

Python 3.9.7
pandas 1.4.1
numpy 1.21.5
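For the short version of the question (set everything that is <1 or <NA> to 1 across the whole column), a vectorized approach is probably simpler than going row by row. This is a rough sketch, assuming the column contains only integers and <NA>; the variable name floored is just for illustration.

```python
import pandas as pd

# Nullable integer column with one missing value and one value below 1.
shots = pd.DataFrame(
    {"shots": [2, pd.NA, 0]},
    index=["foo", "bar", "baz"],
    dtype="Int32",
)

# Step 1: replace <NA> with 1.
# Step 2: raise anything still below 1 up to 1.
floored = shots["shots"].fillna(1).clip(lower=1)

# Assign back if the DataFrame itself should be updated.
shots["shots"] = floored
```

Filling first means clip() never sees an <NA>, which sidesteps the ambiguous comparison from the question.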
