有条件地添加几个新列的更好方法,这些新列中填充的条目依赖于其他列的条目



我有以下数据框架:

CustomerID  Department  Price  SportswearDemand  HomeDemand  KidswearDemand  WomenswearDemand
-------------------------------------------------------------------------------------------
1050091     Sportswear  497.6  0                 0           0               0                          
1555018     Womenswear  336.0  0                 0           0               0                       
210239      Womenswear  698.0  0                 0           0               0                       
507556      Sportswear  209.0  0                 0           0               0                        
1708193     Sportswear  209.0  0                 0           0               0                        
1295733     Menswear    209.0  0                 0           0               0                        
1213373     Sportswear  298.0  0                 0           0               0                       
753471      Sportswear  209.0  0                 0           0               0                        
82739       Menswear    349.0  0                 0           0               0                        
1660995     Kidswear    424.6  0                 0           0               0
.
.
.                

所有的列,包括SportswearDemand的右侧,我将从现在开始称为"需求列"。我想根据DepartmentPrice的信息填充这些,如下所示:

如果,对于某个customerID, Department包含条目Sportswear,那么我希望将该行的价格输入SportswearDemand,而不是当前的零。其他需求列也是一样。最终的结果应该是这样的:

CustomerID  Department  Price  SportswearDemand  HomeDemand  KidswearDemand  WomenswearDemand
-------------------------------------------------------------------------------------------
1050091    Sportswear   497.6  497.6             0           0               0
1555018    Womenswear   336.0  0                 0           0               336.0
210239     Womenswear   698.0  0                 0           0               698.0
507556     Sportswear   209.0  209.0             0           0               0
1708193    Sportswear   209.0  209.0             0           0               0
1295733    Menswear     209.0  0                 0           0               0
1213373    Sportswear   298.0  298.0             0           0               0
753471     Sportswear   209.0  209.0             0           0               0
82739      Menswear     349.0  0                 0           0               0
1660995    Kidswear     424.6  0                 0           424.6           0
.
.
.

我已经设法这样解决了:

df$SportswearDemand <- with(df, ifelse(df$Department == "Sportswear", df$Price, 0))
df$HomeDemand <- with(df, ifelse(df$Department == "Home", df$Price, 0))
df$KidswearDemand <- with(df, ifelse(df$Department == "Kidswear", df$Price, 0))
df$WomenswearDemand <- with(df, ifelse(df$Department == "Womenswear", df$Price, 0))

然而,我有30多个这样的需求列,我想知道是否有比硬编码这样的30行更好的方法来做到这一点?

我的第一个想法是像这样将一行封装在for循环中:

DemandColumns # array of all the 30 different demand columns stored as strings
for (i in DemandColumns){
df$i <- with(df, ifelse(df$Department == substr(i,1,nchar(i)-6), df$Price, 0))
}

但它只是添加了一个充满零的列"i"substr用于获取字符串"Demand"以外的所有字符。如有任何帮助,不胜感激。

无需初始化"需求列",请先删除。

df[grep('Demand', names(df))] <- NULL

创建PriceDepartment列的副本,并以宽格式获取数据。

library(dplyr)
library(tidyr)
df %>%
mutate(value = Price, 
name = Department) %>%
pivot_wider(names_from = name, values_from = value, 
names_glue = '{name}_Demand', values_fill = 0)
#   CustomerID Department Price Sportswear_Demand Womenswear_Demand Menswear_Demand Kidswear_Demand
#        <int> <chr>      <dbl>             <dbl>             <dbl>           <dbl>           <dbl>
# 1    1050091 Sportswear  498.              498.                 0               0              0 
# 2    1555018 Womenswear  336                 0                336               0              0 
# 3     210239 Womenswear  698                 0                698               0              0 
# 4     507556 Sportswear  209               209                  0               0              0 
# 5    1708193 Sportswear  209               209                  0               0              0 
# 6    1295733 Menswear    209                 0                  0             209              0 
# 7    1213373 Sportswear  298               298                  0               0              0 
# 8     753471 Sportswear  209               209                  0               0              0 
# 9      82739 Menswear    349                 0                  0             349              0 
#10    1660995 Kidswear    425.                0                  0               0            425.

相关内容

  • 没有找到相关文章

最新更新