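I have data on multiple products: their launch dates and sales. There are other variables as well, but these are the ones I work with. I want to add one column and one row per product to my dataset. The new variable/column gives the number of months since the product's launch, so launch 1 is the first month for each product, launch 2 is the second month, and so on. I also want to add one observation (row) for each product in which launch is 0 and sales is 0.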
months <- as.Date(c("2011-04-01", "2011-05-01", "2011-06-01",
                    "2012-10-01", "2012-11-01", "2012-12-01",
                    "2011-04-01", "2011-05-01", "2011-06-01",
                    "2013-06-01", "2013-07-01", "2013-08-01"))
product <- c("A", "A", "A",
             "B", "B", "B",
             "C", "C", "C",
             "D", "D", "D")
sales <- c(75, 78, 80,
           67, 65, 75,
           86, 87, 87,
           90, 92, 94)
# This is how the data looks right now
input_data <- data.frame(months, product, sales)
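Right now I can add the launch column by grouping by product and using row_number, which fills launch with 1, 2, 3, etc. in month order. However, I don't know how to add the extra observation for each product. At the moment I determine each product's entry date, create a data frame with launch and sales set to 0, and bind it to the dataset. That is tedious, and I'm sure it can be done more efficiently.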
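A minimal sketch of the bind-based workaround described above, not necessarily the exact code in use. The names `launch_rows` and `result` are hypothetical, and the sketch assumes the launch-0 row should be dated one month before each product's first observed month.

```r
library(dplyr)
library(lubridate)

input_data <- data.frame(
  months  = as.Date(c("2011-04-01", "2011-05-01", "2011-06-01",
                      "2012-10-01", "2012-11-01", "2012-12-01",
                      "2011-04-01", "2011-05-01", "2011-06-01",
                      "2013-06-01", "2013-07-01", "2013-08-01")),
  product = rep(c("A", "B", "C", "D"), each = 3),
  sales   = c(75, 78, 80, 67, 65, 75, 86, 87, 87, 90, 92, 94)
)

# One zero-sales row per product, dated one month before its first month
# (months(1) is the lubridate period; min(months) uses the column)
launch_rows <- input_data %>%
  group_by(product) %>%
  summarise(months = min(months) %m-% months(1),
            sales  = 0,
            .groups = "drop")

# Bind the extra rows, sort, and number the rows within each product from 0
result <- input_data %>%
  bind_rows(launch_rows) %>%
  arrange(product, months) %>%
  group_by(product) %>%
  mutate(launch = row_number() - 1) %>%
  ungroup()
```

This works for any number of months per product, but it takes two passes over the data and an explicit sort.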
# Expected outcome:
# I don't care much about the date in the additional row; it can remain NA. I added it here only to build the data frame.
months1 <- as.Date(c("2011-03-01", "2011-04-01", "2011-05-01", "2011-06-01",
                     "2012-09-01", "2012-10-01", "2012-11-01", "2012-12-01",
                     "2011-03-01", "2011-04-01", "2011-05-01", "2011-06-01",
                     "2013-05-01", "2013-06-01", "2013-07-01", "2013-08-01"))
launch <- c(0, 1, 2, 3,
            0, 1, 2, 3,
            0, 1, 2, 3,
            0, 1, 2, 3)
product1 <- c("A", "A", "A", "A",
              "B", "B", "B", "B",
              "C", "C", "C", "C",
              "D", "D", "D", "D")
sales1 <- c(0, 75, 78, 80,
            0, 67, 65, 75,
            0, 86, 87, 87,
            0, 90, 92, 94)
output_data <- data.frame(months1, launch, product1, sales1)
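We can use complete to expand the data after grouping by 'product'. For each product, the months are completed to run from one month before the first observed month through the last one, and sales in the new row is filled with 0: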
library(lubridate)
library(dplyr)
library(tidyr)
input_data %>%
  group_by(product) %>%
  complete(months = first(months) %m+% months(-1:2),
           fill = list(sales = 0)) %>%
  mutate(launch = row_number() - 1) %>%
  ungroup() %>%
  select(months, launch, product, sales)
Output:
# A tibble: 16 × 4
   months     launch product sales
   <date>      <dbl> <chr>   <dbl>
 1 2011-03-01      0 A           0
 2 2011-04-01      1 A          75
 3 2011-05-01      2 A          78
 4 2011-06-01      3 A          80
 5 2012-09-01      0 B           0
 6 2012-10-01      1 B          67
 7 2012-11-01      2 B          65
 8 2012-12-01      3 B          75
 9 2011-03-01      0 C           0
10 2011-04-01      1 C          86
11 2011-05-01      2 C          87
12 2011-06-01      3 C          87
13 2013-05-01      0 D           0
14 2013-06-01      1 D          90
15 2013-07-01      2 D          92
16 2013-08-01      3 D          94
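To see what the expression passed to complete evaluates to, here is a small sketch for a single product. The variable names `first_month` and `expanded` are only for illustration; the date is product A's first month from the example data.

```r
library(lubridate)

# First observed month of product A in the example data
first_month <- as.Date("2011-04-01")

# months(-1:2) builds four month offsets: -1, 0, 1 and 2.
# %m+% adds each offset to the date using month arithmetic.
expanded <- first_month %m+% months(-1:2)
expanded
# [1] "2011-03-01" "2011-04-01" "2011-05-01" "2011-06-01"
```

Note that the range -1:2 assumes exactly three observed months per product, as in the example; with a different number of months the upper bound would need to change accordingly.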