I am fairly new to R and want to restructure my data, which currently looks like this:
sample <- data.frame("num" = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10))
sample$CustomerID <- c(1, 2, 2, 3, 3, 4, 4, 4, 4, 5)
sample$product <- c("Eggs", "Bread", "Coke", "Coke", "Eggs", "Apples", "Bread", "Cookies", "Coke", "Milk")
sample$quantity <- c(1, 1, 3, 2, 2, 1, 2, 1, 3, 1)
customerID | Product | Quantity
---|---|---
1 | Eggs | 1
2 | Bread | 1
2 | Coke | 3
3 | Coke | 2
3 | Eggs | 2
One option using dplyr and tidyr might look like this:
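library(dplyr, warn.conflicts = FALSE)
library(tidyr)
sample %>%
  select(-num) %>%
  # repeat each row `quantity` times; the quantity column is dropped
  tidyr::uncount(quantity) %>%
  # number the purchased items within each customer
  group_by(CustomerID) %>%
  mutate(prod_id = row_number()) %>%
  ungroup() %>%
  # spread the items into one column per position: prod1, prod2, ...
  pivot_wider(names_from = prod_id, values_from = product, names_prefix = "prod")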
#> # A tibble: 5 × 8
#>   CustomerID prod1  prod2 prod3 prod4   prod5 prod6 prod7
#>        <dbl> <chr>  <chr> <chr> <chr>   <chr> <chr> <chr>
#> 1          1 Eggs   <NA>  <NA>  <NA>    <NA>  <NA>  <NA>
#> 2          2 Bread  Coke  Coke  Coke    <NA>  <NA>  <NA>
#> 3          3 Coke   Coke  Eggs  Eggs    <NA>  <NA>  <NA>
#> 4          4 Apples Bread Bread Cookies Coke  Coke  Coke
#> 5          5 Milk   <NA>  <NA>  <NA>    <NA>  <NA>  <NA>
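
To see why the wide result has one column per purchased unit, it may help to look at the intermediate long table before `pivot_wider()` is applied. The sketch below uses a small made-up data frame (`toy` and `long` are hypothetical names, not part of the question's data) and only runs the `uncount()` and `row_number()` steps:

```r
library(dplyr, warn.conflicts = FALSE)
library(tidyr)

# Hypothetical toy input, smaller than the data in the question
toy <- data.frame(
  CustomerID = c(1, 2, 2),
  product    = c("Eggs", "Bread", "Coke"),
  quantity   = c(1, 1, 3)
)

long <- toy %>%
  # each row is repeated `quantity` times, so Coke appears three times
  uncount(quantity) %>%
  # prod_id restarts at 1 for every customer
  group_by(CustomerID) %>%
  mutate(prod_id = row_number()) %>%
  ungroup()

long
```

Each `prod_id` value later becomes one `prod*` column, so the number of columns is driven by the customer with the largest total quantity. If one column per distinct product is wanted instead, dropping the `uncount()` step would presumably be the place to start.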