Picking the last value when there are multiple incorrect data points in R



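I have a data cleaning problem. Data collection happened three times, and sometimes the data was entered incorrectly. So if a student's data was collected more than once, the last data point needs to be copied over the earlier ones.

Here is my dataset: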

df <- data.frame(id = c(1,1,1, 2,2,2, 3,3,3, 4),
text = c("female","male","male", "female","female","female", "male","female","female", "male"),
time = c("first","second","third", "first","second","third", "first","second","third", "first"))

> df
id   text   time
1   1 female  first
2   1   male second
3   1   male  third
4   2 female  first
5   2 female second
6   2 female  third
7   3   male  first
8   3 female second
9   3 female  third
10  4   male  first

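So, because of input errors, the first and third students have inconsistent gender information. The last (third) data point needs to be copied over the rest of that student's rows.

The desired output is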

> df1
id   text   time
1   1   male  first
2   1   male second
3   1   male  third
4   2 female  first
5   2 female second
6   2 female  third
7   3 female  first
8   3 female second
9   3 female  third
10  4   male  first

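Any ideas? Thanks

We can use last to return the last value of 'text' within each id. That single value is recycled across the group when it updates the column in mutate. This relies on the rows of each id already being in collection order (first, second, third), as they are in the example.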

library(dplyr)
df <- df %>%
  group_by(id) %>%                # one group per student
  mutate(text = last(text)) %>%   # overwrite each row with the group's last value
  ungroup()
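For readers who cannot load dplyr, the same idea can be sketched in base R with ave(). This is only an illustration and not part of the answer above. It additionally assumes that rows might arrive unsorted, so it orders them by id and by an explicit first < second < third ranking before taking the last value. The names time_rank and df1 are chosen here for the example.

```r
# Sample data from the question; stringsAsFactors = FALSE keeps `text` as character
df <- data.frame(
  id   = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4),
  text = c("female", "male", "male", "female", "female", "female",
           "male", "female", "female", "male"),
  time = c("first", "second", "third", "first", "second", "third",
           "first", "second", "third", "first"),
  stringsAsFactors = FALSE
)

# Assumption: sort by id, then by collection round, so "last" means "latest"
time_rank <- match(df$time, c("first", "second", "third"))
df1 <- df[order(df$id, time_rank), ]
rownames(df1) <- NULL

# Within each id, replace every `text` with the last element of that group
df1$text <- ave(df1$text, df1$id, FUN = function(x) x[length(x)])

print(df1)
```

With the example data, ids 1 and 3 end up with a single consistent value, and id 4, which has only one row, is left unchanged.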

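If we want the second or third value instead, use nth and set its n argument to the minimum of 2 and the group size n(), which covers groups that have fewer than 2 elements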

df %>%
  group_by(id) %>%
  mutate(text = nth(text, min(c(2, n()))))
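The clamping in min(c(2, n())) can be illustrated in base R as well. The sketch below uses a hypothetical helper, nth_or_last, which is not a dplyr function. It picks element k of a vector and falls back to the last element when the vector is shorter than k. Without the clamp, indexing a one-element vector at position 2 would give NA.

```r
# Hypothetical helper (not part of dplyr): element k, or the last element
# when the vector has fewer than k entries
nth_or_last <- function(x, k) x[min(k, length(x))]

# Sample data from the question
df <- data.frame(
  id   = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4),
  text = c("female", "male", "male", "female", "female", "female",
           "male", "female", "female", "male"),
  stringsAsFactors = FALSE
)

# Second value per id; id 4 has a single row, so its only value is kept
second_text <- ave(df$text, df$id, FUN = function(x) nth_or_last(x, 2))

# Unclamped indexing on a one-element vector yields NA
unclamped <- c("male")[2]

print(second_text)
print(unclamped)
```

Clamping the index rather than filtering out short groups keeps every row in the result, which matters here because student 4 was only recorded once.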
