Still getting SettingWithCopyWarning after using iloc. Where does it come from?



I'm getting a SettingWithCopyWarning message.

/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py:670: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  iloc._setitem_with_indexer(indexer)

I've been trying to figure this out for hours but can't find where the problem is. Can anyone shed some light?

Here is the code:

import numpy as np
import pandas as pd

df = pd.read_excel(path).assign(rate=np.nan, alert=np.nan)

def incidence(data, treshold):
    # iterates through each row of the dataset
    for i, row in enumerate(df.itertuples()):
        try:
            # divides total_rdtpost by total_rdtused
            data.rate.iloc[i] = row[5] / row[7]
        except ZeroDivisionError:
            # fixes the ZeroDivisionError, please read https://docs.python.org/3/library/exceptions.html#ZeroDivisionError
            data.rate.iloc[i] = 0.0
        # creates a low or high record depending on the treshold variable
        if data.rate.iloc[i] > treshold:
            data.alert.iloc[i] = "high"
        else:
            data.alert.iloc[i] = "low"
    return data

incidence(df, 0.2)

P.S. I'm using Colab.

data.rate and data.alert are shorthand for data['rate'] and data['alert'], respectively. data['rate'] may return a copy, so data['rate'].iloc[i] = ... still sets the value on a copy.

Change it to:

data.iloc[i, data.columns.get_loc('rate')] = ...
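As a rough illustration of the single-step form, here is a minimal sketch on a toy DataFrame. The column names and values are made up for this example; the point is only that the row and column positions go into one iloc call on the DataFrame itself, with no intermediate data['rate'] lookup.

```python
import pandas as pd

# Toy data; the columns and values are assumptions made for illustration.
data = pd.DataFrame({"total": [10, 20, 30], "rate": [0.0, 0.0, 0.0]})

# Integer position of the 'rate' column (1 in this toy frame).
rate_col = data.columns.get_loc("rate")

# One indexing call on the DataFrame: row position 1, column position rate_col.
data.iloc[1, rate_col] = 0.5

rate_values = data["rate"].tolist()
```

Because the assignment goes through a single indexer on data, pandas does not have to guess whether an intermediate object was a view or a copy.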

You can look up the column positions once, up front:

def incidence(data, treshold):
    # at the start of the function
    rate_col = data.columns.get_loc('rate')
    alert_col = data.columns.get_loc('alert')
    ...
    for ...
        # later
        data.iloc[i, rate_col] = ...

By the way, the for loop refers to the global/outer df, not the data passed to the function. It should be:

for i, row in enumerate(data.itertuples()):
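Putting the two fixes together, the loop version might look roughly like the sketch below. It assumes the columns are named total_rdtpost and total_rdtused (taken from the comment in the question, where they are accessed as row[5] and row[7]), and it replaces the try/except with an explicit zero check. The sample data is hypothetical and stands in for the Excel file.

```python
import numpy as np
import pandas as pd


def incidence(data, treshold):
    # Look up the integer column positions once, before the loop.
    rate_col = data.columns.get_loc("rate")
    alert_col = data.columns.get_loc("alert")

    # Iterate over the DataFrame that was passed in, not a global one.
    for i, row in enumerate(data.itertuples()):
        used = row.total_rdtused  # assumed column name
        # Treat a zero denominator as a rate of 0.0.
        rate = row.total_rdtpost / used if used != 0 else 0.0
        data.iloc[i, rate_col] = rate
        data.iloc[i, alert_col] = "high" if rate > treshold else "low"
    return data


# Hypothetical sample data standing in for the Excel file.
df = pd.DataFrame({
    "total_rdtpost": [1, 3, 0],
    "total_rdtused": [10, 10, 0],
    "rate": np.nan,
    # object dtype so that strings can be written into the column later
    "alert": pd.Series([None] * 3, dtype=object),
})

result = incidence(df, 0.2)
```

The alert column is created with object dtype here because writing strings into an all-NaN float column triggers an upcasting warning in recent pandas versions.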

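One more thing: you could also do all the steps in the function as column operations instead of using itertuples. You are modifying the DataFrame inside the function anyway.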
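A column-wise version could look something like the following sketch. It again assumes the columns are named total_rdtpost and total_rdtused, and the function name incidence_vectorized is made up for this example. Unlike the loop, it returns a new DataFrame through assign rather than writing into the one passed in, which is a design choice of this sketch and not something the question requires.

```python
import numpy as np
import pandas as pd


def incidence_vectorized(data, treshold):
    # Element-wise division; a zero denominator yields inf or NaN instead of raising.
    rate = data["total_rdtpost"] / data["total_rdtused"]
    # Mirror the loop's ZeroDivisionError handling: zero denominator -> 0.0.
    rate = rate.where(data["total_rdtused"] != 0, 0.0)
    # Label each row by comparing its rate with the threshold.
    alert = np.where(rate > treshold, "high", "low")
    # assign returns a new DataFrame and leaves the input untouched.
    return data.assign(rate=rate, alert=alert)


# Hypothetical sample data standing in for the Excel file.
df = pd.DataFrame({
    "total_rdtpost": [1, 3, 0, 2],
    "total_rdtused": [10, 10, 0, 0],
})

out = incidence_vectorized(df, 0.2)
```

Since no row-by-row assignment happens at all, there is nothing left for SettingWithCopyWarning to flag, and the computation is typically much faster on large frames.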
