Conditional subsetting in R based on sums of adjacent columns (plyr package?)



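I'm looking for a more efficient way to create subsets in R. Given a dataset with rows = products and columns = time, I want to find the rows (products) where an item first started selling in week 1 and make those a subset. Then I want to do the same for week 2, and so on.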

set.seed(4)
d <- data.frame(
  product = 1:10,
  week1   = sample(0:1, 10, replace = TRUE),
  week2   = sample(0:3, 10, replace = TRUE),
  week3   = sample(0:5, 10, replace = TRUE),
  week4   = sample(0:5, 10, replace = TRUE),
  speed   = sample(100:200, 10),
  quality = sample(20:50, 10)
)

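The full data frame is d, so I need to know two things to find each subset: 1) sales in all previous weeks == 0, and 2) sales in the current week are nonzero.

No subsets should overlap, because they group products by when they first entered the market.

I found a poor man's way of doing this, but I know there must be a better way!

The inefficient way: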

subset3<-d[d$week3 >0 & d$week2==0 & d$week1==0 ,]
subset4<-d[d$week4 >0 & d$week3 ==0 & d$week2==0 & d$week1==0,]

Slightly more efficient, but still poor:

subset3<-d[d$week3 >0 & d$week2+d$week1==0 ,]
subset4<-d[d$week4 >0 & d$week3 + d$week2 + d$week1==0,]

I feel like I should be able to do something like this, but it doesn't work:

subset4<-d[d$week4 >0 & sum(d$week1:d$week3) ==0, ]

I don't think ddply or apply will work here, but maybe I'm wrong? The result I need is a subset of d, with all columns, like this:

subset3 =

product week1 week2 week3 week4 speed quality
   2     0     0     5     1   124      42
   3     0     0     3     5   155      45
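
The sum(d$week1:d$week3) attempt fails because `:` builds a sequence from only the first element of each operand (typically with a warning), and sum() then collapses everything to a single number instead of a per-row total. A minimal sketch of a row-wise version with rowSums follows. The data frame `toy` and the helper `first_sold_in` are made up for illustration and are not part of the original post; the sketch also assumes sales are never negative, so a zero sum means every earlier week was zero.

```r
# Hypothetical toy data: rows are products, week columns hold sales.
toy <- data.frame(
  product = 1:5,
  week1   = c(0, 1, 0, 0, 0),
  week2   = c(0, 2, 3, 0, 0),
  week3   = c(5, 0, 1, 2, 0),
  week4   = c(1, 4, 0, 3, 0),
  speed   = c(124, 155, 130, 101, 180)
)

# Hypothetical helper: products whose first nonzero sales fall in week k.
# Assumes columns are named week1, week2, ... and sales are non-negative.
first_sold_in <- function(df, k) {
  weeks <- paste0("week", seq_len(k))
  # Row-wise total of all weeks before week k (nothing to sum when k == 1).
  before <- if (k > 1) rowSums(df[, weeks[-k], drop = FALSE]) else 0
  df[df[[weeks[k]]] > 0 & before == 0, ]
}

subset3 <- first_sold_in(toy, 3)
subset3$product   # 1 4
```

Calling first_sold_in(toy, k) for k = 1, 2, ... gives non-overlapping subsets, since each product can have only one first week.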

You can write it like this:

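# search only the week columns; speed and quality are always positive
d$weekstart <- apply(d[, grep("^week[0-9]+$", names(d))], 1, function(x) which(x > 0)[1])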

This identifies the first week with nonzero sales for each product. You can then use this column to split the dataset, like so:

result <- split(d,d$weekstart)

You can access each subset like this:

result[[1]]

Change the 1 in the code above to the start week you want to access, much like going from subset1 to subset2.
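
A minimal self-contained sketch of this split-by-first-week idea, on made-up data rather than the poster's d. It selects the week columns by name so that non-sales columns cannot count as a first sale, and it looks groups up by name rather than by position, because result[[1]] is simply the first group present, which is week 1 only if some product actually started in week 1. The names `toy`, `week_cols`, and `by_start` are illustrative assumptions.

```r
# Hypothetical toy data (not the poster's d): sales per product and week.
toy <- data.frame(
  product = 1:5,
  week1   = c(0, 1, 0, 0, 0),
  week2   = c(0, 2, 3, 0, 0),
  week3   = c(5, 0, 1, 2, 0),
  week4   = c(1, 4, 0, 3, 0),
  speed   = c(124, 155, 130, 101, 180)
)

# Week columns only; assumes they are ordered week1, week2, ... so that the
# position of the first positive value equals the week number.
week_cols <- grep("^week[0-9]+$", names(toy))
toy$weekstart <- apply(toy[, week_cols], 1, function(x) which(x > 0)[1])

# split() drops rows whose weekstart is NA (products that never sold).
by_start <- split(toy, toy$weekstart)

names(by_start)           # "1" "2" "3"
by_start[["3"]]$product   # 1 4
```

Product 5 never sells in this toy data, so its weekstart is NA and it appears in none of the groups.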

I hope I understand what you're trying to do. Here is an attempt using the rle function. I apply it to each row (each product).

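# d here holds only the week columns (see the data shown further down)
ll <- apply(d, 1, function(x) {
  y <- rle(x)                             # runs of equal values along the row
  nn <- names(y$lengths[y$values == 0])   # label of each zero run (the week right after it)
  vv <- y$lengths[y$values == 0]          # length of each zero run
  if (length(nn) == 0)
    res <- data.frame(nbr = 0, goodweek = 'week1')   # no zero weeks at all
  else
    res <- data.frame(nbr = vv, goodweek = nn)
})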

do.call(rbind,ll)
       nbr goodweek
week3    2    week3  ## 2 bad weeks with 0 then week3 is good 0 0 value>0
week31   2    week3
3        0    week1
week4    1    week4
week2    1    week2
6        0    week1 ## all weeks are good
week41   1    week4
8        1          ## the last week is bad! I don't know what to return here!
9        0    week1
week21   1    week2

Here I use your d (week columns only):

d
   week1 week2 week3 week4
1      0     0     5     2
2      0     0     1     3
3      1     2     3     2
4      1     1     0     1
5      0     3     1     4
6      1     1     2     4
7      1     2     0     4
8      1     3     2     0
9      1     1     5     4
10     0     3     2     2
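
For readers unfamiliar with rle, here is a small sketch of what it returns for a single row, and how the leading run of zeros could be turned into a first-sale week. Unlike the code above, which reports every run of zeros, this sketch looks only at the leading run; that is an assumption about what "first started selling" means. The names `sales`, `runs`, `leading_zeros`, and `first_week` are illustrative.

```r
# One hypothetical product row: no sales in weeks 1-2, then sales.
sales <- c(week1 = 0, week2 = 0, week3 = 5, week4 = 2)

# unname() keeps the run-length result free of name bookkeeping.
runs <- rle(unname(sales))
runs$lengths   # 2 1 1
runs$values    # 0 5 2

# Length of the leading run of zeros, or 0 if the row starts with a sale.
leading_zeros <- if (runs$values[1] == 0) runs$lengths[1] else 0L

# The first-sale week is the one right after the leading zeros
# (NA if every week is zero, because the index runs past the last name).
first_week <- names(sales)[leading_zeros + 1]
first_week     # "week3"
```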
