Conditionally converting columns to rows in R

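I am trying to combine several columns into one, taking their distinct values into account and placing any additional value in a separate row. Here is what my dataset looks like.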

df <- data.frame(
  id = c(1,2,3),
  role = c("A","B","C"),
  grade.1 = c(3,4,5),
  state.1 = c(1,NA,1),
  grade.2 = c(4,4,5),
  state.2 = c(1,1,NA),
  grade.3 = c(3,4,5),
  state.3 = c(1,1,NA))
> df
id role grade.1 state.1 grade.2 state.2 grade.3 state.3
1  1    A       3       1       4       1       3       1
2  2    B       4      NA       4       1       4       1
3  3    C       5       1       5      NA       5      NA

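I need to merge the grade.1, grade.2 and grade.3 columns into a single Grade column. I tried coalesce, but it loses information for id = 1, which has two different grades across the grade.* columns. The state.* mapping does not come out right either.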

library(dplyr)
df <- df %>%
  mutate(Grade = coalesce(grade.1, grade.2, grade.3))
> df
id role grade.1 state.1 grade.2 state.2 grade.3 state.3 Grade
1  1    A       3       1       4       1       3       1     3
2  2    B       4      NA       4       1       4       1     4
3  3    C       5       1       5      NA       5      NA     5
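
coalesce() returns, position by position, the first non-missing value, so for id = 1 it stops at grade.1 and never looks at the 4 in grade.2. As a rough check of which ids are affected, the number of distinct grades per row can be counted in base R. This is only a sketch: the helper names grade_cols and n_grades are made up for illustration and are not part of the question.

```r
df <- data.frame(
  id = c(1, 2, 3),
  role = c("A", "B", "C"),
  grade.1 = c(3, 4, 5), state.1 = c(1, NA, 1),
  grade.2 = c(4, 4, 5), state.2 = c(1, 1, NA),
  grade.3 = c(3, 4, 5), state.3 = c(1, 1, NA)
)

# Names of the grade.* columns (hypothetical helper variable)
grade_cols <- grep("^grade\\.", names(df), value = TRUE)

# Number of distinct grades in each row
n_grades <- apply(df[grade_cols], 1, function(x) length(unique(x)))

# ids that need more than one output row
multi_grade_ids <- df$id[n_grades > 1]
multi_grade_ids
```

Only ids listed in multi_grade_ids need an extra row in the result.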

What I want is an extra row for id = 1 that holds its second grade. My desired dataset is:

> df.2
id role Grade state.1 state.2 state.3
1  1    A     3       1      NA       1
2  1    A     4      NA       1      NA
3  2    B     4      NA       1       1
4  3    C     5       1      NA      NA

So when an id has more than one grade, each grade needs to go in its own row, and the state.* mapping should follow that grade. Any ideas? Thanks!

Here is one way:

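Get the data into long format.
For each id, set every value in each state column to NA except the one in the row whose position matches that column's number (row 1 for state.1, row 2 for state.2, and so on).
For each unique grade, take the first non-NA value of each state column.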

library(dplyr)
library(tidyr)
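df %>%
  # long format: one row per grade column, the state.* columns are repeated
  pivot_longer(cols = starts_with('grade'),
               values_to = 'grade', names_to = NULL) %>%
  group_by(id) %>%
  # keep state.k only in row k of each id, set the rest to NA
  mutate(across(starts_with('state'),
                ~replace(., -as.numeric(sub('state.', '', cur_column(), fixed = TRUE)), NA))) %>%
  group_by(id, role, grade) %>%
  # first non-NA state per grade (NA if there is none)
  summarise(across(starts_with('state'), ~.x[!is.na(.x)][1]), .groups = 'drop')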
#     id role  grade state.1 state.2 state.3
#  <dbl> <chr> <dbl>   <dbl>   <dbl>   <dbl>
#1     1 A         3       1      NA       1
#2     1 A         4      NA       1      NA
#3     2 B         4      NA       1       1
#4     3 C         5       1      NA      NA
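
For readers who would rather not depend on dplyr and tidyr, the same three steps can be sketched in base R. This is an illustrative alternative under the assumption that there are exactly three grade/state pairs, not the answer's own code. The names grade_cols, state_cols, pieces, long, first_non_na and res are made up for the example.

```r
df <- data.frame(
  id = c(1, 2, 3),
  role = c("A", "B", "C"),
  grade.1 = c(3, 4, 5), state.1 = c(1, NA, 1),
  grade.2 = c(4, 4, 5), state.2 = c(1, 1, NA),
  grade.3 = c(3, 4, 5), state.3 = c(1, 1, NA)
)

grade_cols <- paste0("grade.", 1:3)
state_cols <- paste0("state.", 1:3)

# Steps 1 and 2: build the long format one slot k at a time,
# keeping only state.k and leaving the other state columns as NA
pieces <- lapply(1:3, function(k) {
  out <- df[c("id", "role")]
  out$grade <- df[[grade_cols[k]]]
  out[state_cols] <- NA_real_
  out[[state_cols[k]]] <- df[[state_cols[k]]]
  out
})
long <- do.call(rbind, pieces)

# Step 3: collapse rows that share id/role/grade,
# taking the first non-NA value of each state column (NA if none)
first_non_na <- function(x) x[!is.na(x)][1]
res <- aggregate(long[state_cols],
                 by = long[c("id", "role", "grade")],
                 FUN = first_non_na)

# Sort explicitly rather than rely on aggregate()'s group ordering
res <- res[order(res$id, res$grade), ]
res
```

Building the long table slot by slot makes the "keep only state.k in row k" rule explicit, at the cost of hard-coding the number of grade/state pairs.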

I'm not sure whether this is what you are looking for, or whether it is the best approach, but here is what I have so far:

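library(data.table)
df <- as.data.table(df)  # the df[, .(...)] syntax below needs a data.table
# split into 3 tables, one per grade/state pair
df1 <- df[, .(id, role, grade = grade.1, state.1)]
df2 <- df[, .(id, role, grade = grade.2, state.2)]
df3 <- df[, .(id, role, grade = grade.3, state.3)]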
# set the keys to do joins
setkey(df1, id, role, grade)
setkey(df2, id, role, grade)
setkey(df3, id, role, grade)
# join the three tables in every order and stack the results
df_res <- rbind(
  df1[df2[df3]],
  df1[df3[df2]],
  df2[df3[df1]],
  df2[df1[df3]],
  df3[df1[df2]],
  df3[df2[df1]],
  fill = TRUE
)
unique(df_res)[order(id)]
    id role grade state.1 state.2 state.3
1:  1    A     3       1      NA       1
2:  1    A     4      NA       1      NA
3:  2    B     4      NA       1       1
4:  3    C     5       1      NA      NA
