I know I shouldn't set values on a view of a pandas DataFrame, and I'm not doing that, yet I still get this warning. I have a function like this:

    def do_something(df):
        # id(df) is xxx240
        idx = get_skip_idx(df)  # another function that returns a boolean series
        if any(idx):
            df = df[~idx]
            # id(df) is xxx744, df is now a local variable which is a copy of the input argument
        assert not df._is_view  # This doesn't fail, I'm not having a view
        df['date_fixed'] = pd.to_datetime(df['old_date'].str[:10], format='%Y-%m-%d')
        # I'm getting the warning here which doesn't make any sense to me

I'm using pandas 1.4.1. This sounds like a bug to me, and I'd like to confirm I'm not missing anything before filing a ticket.
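My understanding is that _is_view can return false negatives, and that you are in fact working with a view of the original DataFrame.
One workaround is to replace df[~idx] with df[~idx].copy():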

    import pandas as pd

    df = pd.DataFrame(
        {
            "value": [1, 2, 3],
            "old_date": ["2022-04-20 abcd", "2022-04-21 efgh", "2022-04-22 ijkl"],
        }
    )

    def do_something(df, idx):
        if any(idx):
            df = df[~idx].copy()  # explicit copy, independent of the caller's frame
        df["date_fixed"] = pd.to_datetime(df["old_date"].str[:10], format="%Y-%m-%d")
        return df

    print(do_something(df, pd.Series({0: True, 1: False, 2: False})))

    # No warning
       value         old_date date_fixed
    1      2  2022-04-21 efgh 2022-04-21
    2      3  2022-04-22 ijkl 2022-04-22
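
If the in-place column assignment is what triggers the warning, another option may be to avoid it altogether. The following is only a sketch, assuming it is acceptable for the function to return a new frame rather than modify one: DataFrame.assign builds and returns a new DataFrame, so nothing is written into the filtered result. The name do_something_assign is hypothetical and used here for illustration only.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "value": [1, 2, 3],
        "old_date": ["2022-04-20 abcd", "2022-04-21 efgh", "2022-04-22 ijkl"],
    }
)


def do_something_assign(df, idx):
    """Hypothetical variant of do_something that never assigns in place."""
    if idx.any():
        df = df[~idx]  # filtered rows; no explicit .copy() here
    # assign() returns a new DataFrame with the extra column, so the
    # filtered frame and the caller's frame are both left untouched.
    return df.assign(
        date_fixed=pd.to_datetime(df["old_date"].str[:10], format="%Y-%m-%d")
    )


skip = pd.Series([True, False, False])
result = do_something_assign(df, skip)
print(result)
```

With this variant the caller's df keeps its original two columns, and only the returned frame carries date_fixed. Whether this or .copy() reads better is largely a matter of taste; .copy() is the smaller change to the original function.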