How to summarise and subset a multi-level grouped data frame with dplyr in R



I have the following data in long format:

library(dplyr)  # tibble(), group_by(), summarise(), mutate(), %>%
library(tidyr)  # pivot_wider()

testdf <- tibble(
  name  = c(rep("john", 4), rep("joe", 2)),
  rep   = c(1, 1, 2, 2, 1, 1),
  field = rep(c("pet", "age"), 3),
  value = c("dog", "young", "cat", "old", "fish", "young")
)

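For each named person (John and Joe), I would like to summarise each of their pets.
For some reason, I cannot seem to handle the repeated events/pets in John's data.
If I filter for Joe only (who has just one pet), the code works.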

Many thanks for any help.

testdf %>%
  group_by(name, rep) %>%
  # filter(name == "joe") %>%  # when I filter only for Joe, the code works
  summarise(
    about = paste0(
      "The pet is a: ", .[field == "pet", "value"], " and it is ", .[field == "age", "value"]
    )
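  )

One option is to reshape the data to wide format with pivot_wider(), so that each pet gets its own row, and then build the sentence with mutate():

testdf %>%
  pivot_wider(id_cols = name:rep, names_from = field, values_from = value) %>%
  mutate(about = paste0("The pet is a: ", pet, " and it is ", age))

This yields:

  name    rep pet   age   about
  <chr> <dbl> <chr> <chr> <chr>
1 john      1 dog   young The pet is a: dog and it is young
2 john      2 cat   old   The pet is a: cat and it is old
3 joe       1 fish  young The pet is a: fish and it is young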
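
A possible explanation for why the original attempt breaks, offered as a sketch rather than a definitive diagnosis: inside summarise(), the magrittr placeholder `.` appears to refer to the whole table that was piped in, not to the current group, while bare column names such as field are evaluated per group. The snippet below compares nrow(.) with n() to make that visible. The column names rows_in_dot and rows_in_group are made up for illustration.

```r
library(dplyr)

# Same data as in the question
testdf <- tibble(
  name  = c(rep("john", 4), rep("joe", 2)),
  rep   = c(1, 1, 2, 2, 1, 1),
  field = rep(c("pet", "age"), 3),
  value = c("dog", "young", "cat", "old", "fish", "young")
)

sizes <- testdf %>%
  group_by(name, rep) %>%
  summarise(
    rows_in_dot   = nrow(.),  # `.` is the full piped-in table
    rows_in_group = n(),      # n() counts rows of the current group only
    .groups = "drop"
  )

print(sizes)
```

If that reading is right, it would also explain why filtering to Joe helps: the piped-in table then has only two rows, the same size as Joe's single group, so the logical index field == "pet" lines up with `.`.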

This can also be done with data.table, as follows:

library(data.table)

setDT(testdf)[
  , j = .(about = paste0("The pet is a ", .SD[field == "pet", value], " and it is ", .SD[field == "age", value])),
  by = .(name, rep)
]

   name rep                             about
1: john   1  The pet is a dog and it is young
2: john   2    The pet is a cat and it is old
3:  joe   1 The pet is a fish and it is young

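Your data is in long format and is not tidy: a single column holds several different fields. That is why langtang's answer "widens" (or "spreads") the data. (Using data.table may be even better, but I still find .SD hard to use.)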

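I prefer to do these things as simply as possible in dplyr. An alternative that does not spread the data is shown below and produces the same sentences, without data.table, whose .SD I still find hard to grasp. So, in 3 lines: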

testdf %>%
  group_by(name, rep) %>%
  summarise(about = paste("The pet is a", value[field == "pet"], "and it is", value[field == "age"]))

This yields:

  name    rep about
  <chr> <dbl> <chr>
1 joe       1 The pet is a fish and it is young
2 john      1 The pet is a dog and it is young
3 john      2 The pet is a cat and it is old
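
For readers who also find .SD awkward, the same column-subsetting idea seems to carry over to data.table directly, since columns can be referenced by name inside j for each group. A minimal sketch, assuming the data from the question; the names dt and res are only illustrative:

```r
library(data.table)

# Same data as in the question, built directly as a data.table
dt <- data.table(
  name  = c(rep("john", 4), rep("joe", 2)),
  rep   = c(1, 1, 2, 2, 1, 1),
  field = rep(c("pet", "age"), 3),
  value = c("dog", "young", "cat", "old", "fish", "young")
)

# Within each (name, rep) group, pick the value whose field matches
res <- dt[
  ,
  .(about = paste0(
    "The pet is a ", value[field == "pet"],
    " and it is ", value[field == "age"]
  )),
  by = .(name, rep)
]

print(res)
```

Like the dplyr version, this assumes each (name, rep) group has exactly one "pet" row and one "age" row.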