ValueError: invalid literal for int() with base 10 when using np.where

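I want to change the 'not available' values in a df column to 0 and convert the remaining values to integers.

The unique values in the column are: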

['30', 'not available', '45', '60', '40', '90', '21', '5','75','29', '8', '10']

I run the following code to convert the values to integers:

df[col] = np.where(df[col] == 'not available',0,df[col].astype(int))

I expected this to convert all values to integers, but I get a ValueError:

ValueError: invalid literal for int() with base 10: 'not available'

Any suggestions as to why the code doesn't work?

Try pd.to_numeric instead:

df[col] = pd.to_numeric(df[col], errors="coerce").fillna(0)
>>> df[col]
0     30.0
1      0.0
2     45.0
3     60.0
4     40.0
5     90.0
6     21.0
7      5.0
8     75.0
9     29.0
10     8.0
11    10.0
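
The result above is a float column, because errors="coerce" first turns the unparseable entry into NaN. Since the question asks for integers, a cast can be appended after fillna. This is a minimal sketch on a shortened copy of the question's data; the column name "col1" is an assumption for illustration.

```python
import pandas as pd

# Shortened sample of the question's data; "col1" is a hypothetical column name.
df = pd.DataFrame({"col1": ['30', 'not available', '45', '60']})
col = "col1"

# Unparseable strings become NaN, NaN is filled with 0,
# and the resulting floats are cast to an integer dtype.
df[col] = pd.to_numeric(df[col], errors="coerce").fillna(0).astype(int)
print(df[col].tolist())
```

The cast is only safe after fillna, because a column that still contains NaN cannot be converted with astype(int).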

Or convert only 'not available' to 0 and turn any other non-numeric strings into NaN:

df[col] = pd.to_numeric(df[col].replace("not available", 0), errors="coerce")
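
The two variants only differ when the column holds other non-numeric strings besides 'not available'. The sketch below illustrates this with an extra value 'unknown', which is hypothetical and does not appear in the question's data.

```python
import pandas as pd

# 'unknown' is a hypothetical extra value, not present in the question's data.
s = pd.Series(['30', 'not available', 'unknown', '45'], dtype=object)

# Variant 1: every unparseable string ends up as 0.
all_to_zero = pd.to_numeric(s, errors="coerce").fillna(0)

# Variant 2: only 'not available' becomes 0; other bad strings remain NaN.
only_na_to_zero = pd.to_numeric(s.replace("not available", 0), errors="coerce")

print(all_to_zero.tolist())
print(only_na_to_zero.tolist())
```

The second variant may be preferable when unexpected strings should stay visible as missing values instead of silently becoming 0.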

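Before Python can call

df[col] = np.where(df[col] == 'not available',0,df[col].astype(int))

it has to evaluate all three arguments:

df[col] == 'not available'
0
df[col].astype(int)

The last of these applies int to every value in the column, including 'not available', which makes no sense as an integer, so it fails before np.where is ever called. You can avoid this problem by using pandas.Series.apply with a lambda holding a conditional (ternary) expression, as follows: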

import pandas as pd
df = pd.DataFrame({"col1":['30', 'not available', '45', '60', '40', '90', '21', '5','75','29', '8', '10']})
col = "col1"
df[col] = df[col].apply(lambda x:0 if x=='not available' else int(x))
print(df)

This way int is applied only to the records that are not equal to 'not available'.
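
If a vectorized form is preferred over apply, one option is to replace the placeholder with the string '0' first, so that every value is a valid integer literal before the cast. The sketch below also reproduces the original failure to show that the error comes from astype(int) and not from np.where itself. It uses a shortened copy of the question's data, and "col1" is an assumed column name.

```python
import numpy as np
import pandas as pd

# Shortened sample of the question's data; "col1" is a hypothetical column name.
df = pd.DataFrame({"col1": ['30', 'not available', '45', '60']})
col = "col1"

# The third argument is evaluated before np.where is called, so the
# conversion of 'not available' raises regardless of the condition.
failed = False
try:
    np.where(df[col] == 'not available', 0, df[col].astype(int))
except ValueError as exc:
    failed = True
    print("raised while building the arguments:", exc)

# Replace the placeholder first, then convert the whole column at once.
df[col] = df[col].replace('not available', '0').astype(int)
print(df[col].tolist())
```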
