df2.loc[(df2['feature'] == 0), 'package_loss'] = 1
My code is above. Here, I am trying to change the value of the "package_loss" column to 1 wherever another column equals 0.
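Use dask.dataframe.DataFrame.where, which keeps the values where the condition is True and replaces the rest, so the condition has to be inverted:
df2['package_loss'] = df2['package_loss'].where(df2['feature'] != 0, 1)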
This is not as concise as @jezrael's answer, but it allows more flexible transformations using pandas syntax:
from dask.datasets import timeseries

def add_col(df):
    df = df.copy()
    mask = df["name"] == "Dan"
    df["new_column"] = 0
    df.loc[mask, "new_column"] = 1
    return df

df = timeseries()
df2 = df.map_partitions(add_col)
df2.head()