I have a dataset:
library(data.table)
df <- data.table(Date = c(seq.Date(from = as.Date('2022-01-01'), to = as.Date('2022-01-07'), by = 1),
                          seq.Date(from = as.Date('2022-01-01'), to = as.Date('2022-01-07'), by = 1)),
                 Product = c(rep('A', 7), rep('B', 7)),
                 Owner = c(c('X','X','Y','Y','Z','Z','Z'), c('M','M','M','M','N','O','O')))
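Product is my grouping variable here. I want to create a column that shows how many owners the product had before the owner in the current row.
This is what I mean: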
Date Product Owner BeforeOwnerCount
<date> <chr> <chr> <dbl>
1 2022-01-01 A X 0
2 2022-01-02 A X 0
3 2022-01-03 A Y 1
4 2022-01-04 A Y 1
5 2022-01-05 A Z 2
6 2022-01-06 A Z 2
7 2022-01-07 A Z 2
8 2022-01-01 B M 0
9 2022-01-02 B M 0
10 2022-01-03 B M 0
11 2022-01-04 B M 0
12 2022-01-05 B N 1
13 2022-01-06 B O 2
14 2022-01-07 B O 2
dplyr verbs are also welcome.
Thanks in advance.
This assumes the Date column is in chronological order within each product. (If it is not, key or sort by Date first.)
df[, BOC := rleid(Owner) - 1, by = Product]
Date Product Owner BOC
1: 2022-01-01 A X 0
2: 2022-01-02 A X 0
3: 2022-01-03 A Y 1
4: 2022-01-04 A Y 1
5: 2022-01-05 A Z 2
6: 2022-01-06 A Z 2
7: 2022-01-07 A Z 2
8: 2022-01-01 B M 0
9: 2022-01-02 B M 0
10: 2022-01-03 B M 0
11: 2022-01-04 B M 0
12: 2022-01-05 B N 1
13: 2022-01-06 B O 2
14: 2022-01-07 B O 2
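If the rows are not already ordered, a minimal sketch of sorting before counting might look like the following. The shuffled copy df2 is a hypothetical name introduced only to mimic unsorted input; setorder and rleid are the data.table functions already relied on above.

```r
library(data.table)

df <- data.table(
  Date    = rep(seq.Date(as.Date("2022-01-01"), as.Date("2022-01-07"), by = 1), 2),
  Product = rep(c("A", "B"), each = 7),
  Owner   = c("X", "X", "Y", "Y", "Z", "Z", "Z",
              "M", "M", "M", "M", "N", "O", "O")
)

# Hypothetical shuffled copy, standing in for data that arrives unsorted
set.seed(1)
df2 <- df[sample(.N)]

# Sort by Product, then Date (done by reference, no copy is made)
setorder(df2, Product, Date)

# rleid() numbers each run of consecutive identical values, starting at 1,
# so subtracting 1 gives the number of ownership changes seen so far
df2[, BOC := rleid(Owner) - 1L, by = Product]
```

Because rleid counts runs, an owner who returns after someone else starts a new run: the sequence X, Y, X would yield 0, 1, 2 rather than 0, 1, 0.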
Using dplyr and factor:
library(dplyr)
library(data.table)
setDF(df) %>%
  group_by(Product) %>%
  mutate(BeforeOwnerCount = as.numeric(as.factor(Owner)) - 1)
Output:
# A tibble: 14 × 4
# Groups: Product [2]
Date Product Owner BeforeOwnerCount
<date> <chr> <chr> <dbl>
1 2022-01-01 A X 0
2 2022-01-02 A X 0
3 2022-01-03 A Y 1
4 2022-01-04 A Y 1
5 2022-01-05 A Z 2
6 2022-01-06 A Z 2
7 2022-01-07 A Z 2
8 2022-01-01 B M 0
9 2022-01-02 B M 0
10 2022-01-03 B M 0
11 2022-01-04 B M 0
12 2022-01-05 B N 1
13 2022-01-06 B O 2
14 2022-01-07 B O 2
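One caveat worth noting: as.factor sorts its levels alphabetically, so the factor approach matches the desired output here only because the owners of each product happen to appear in alphabetical order. The sketch below uses a small hypothetical data frame (toy, with made-up owners) to show the difference, and compares it with numbering owners by first appearance via match and unique. It is an illustration under those assumptions, not a replacement for the answer above.

```r
library(dplyr)

# Hypothetical data: the first owner (Z) sorts after the second owner (Y)
toy <- data.frame(
  Product = c("A", "A", "A", "A"),
  Owner   = c("Z", "Z", "Y", "Y")
)

res <- toy %>%
  group_by(Product) %>%
  mutate(
    # Alphabetical factor levels: Y = 1, Z = 2, so Z is ranked after Y
    ByFactor    = as.numeric(as.factor(Owner)) - 1,
    # Position of each owner among the owners in order of first appearance
    ByFirstSeen = match(Owner, unique(Owner)) - 1
  ) %>%
  ungroup()
```

Here ByFactor comes out as 1, 1, 0, 0 while ByFirstSeen gives 0, 0, 1, 1. Note that ByFirstSeen counts distinct owners, so an owner who returns later gets their original number again, which differs from the run-based rleid result.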