I'm quite new to pivot_longer/pivot_wider and am having trouble reshaping this table into a longer format (reproducible data below):
ABC
  A_gazes_to   A_dur B_gazes_to   B_dur C_gazes_to   C_dur
1       <NA> 1694924          A  672705          A 1214402
2          B  608329       <NA> 1965078          B  178837
3          C  474406          C  126447       <NA> 1384342
This transformation is only a first step:
library(tidyverse)
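ABC %>%
  pivot_longer(cols = c(A_dur, B_dur, C_dur),
               names_to = "Speaker",
               values_to = "Duration") %>%
  arrange(Speaker) %>%
  # keep only the first character of the column name ("A_dur" -> "A");
  # the backreference must be escaped as "\\1" in an R string
  mutate(Speaker = sub("(.).*", "\\1", Speaker))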
# A tibble: 9 x 5
  A_gazes_to B_gazes_to C_gazes_to Speaker Duration
  <chr>      <chr>      <chr>      <chr>      <int>
1 NA         A          A          A        1694924
2 B          NA         B          A         608329
3 C          C          NA         A         474406
4 NA         A          A          B         672705
5 B          NA         B          B        1965078
6 C          C          NA         B         126447
7 NA         A          A          C        1214402
8 B          NA         B          C         178837
9 C          C          NA         C        1384342
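What I don't know how to do is collapse the three _gazes_to columns into a single column, so that each row keeps only the gaze target of its own Speaker, like this:
Expected result: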
# A tibble: 9 x 3
  Speaker Duration Gazes_to
  <chr>      <int> <chr>
1 A        1694924 NA
2 A         608329 B
3 A         474406 C
4 B         672705 A
5 B        1965078 NA
6 B         126447 C
7 C        1214402 A
8 C         178837 B
9 C        1384342 NA
Reproducible data:
ABC <- structure(list(A_gazes_to = c(NA, "B", "C"), A_dur = c(1694924L,
608329L, 474406L), B_gazes_to = c("A", NA, "C"), B_dur = c(672705L,
1965078L, 126447L), C_gazes_to = c("A", "B", NA), C_dur = c(1214402L,
178837L, 1384342L)), class = "data.frame", row.names = c(NA,
3L))
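You can use pivot_longer with names_pattern. The first capture group of the pattern becomes the Speaker column, and the special ".value" entry in names_to tells pivot_longer to use the second group (gazes_to, dur) as the names of the output columns: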
# ABC is the data frame from the question
tidyr::pivot_longer(ABC,
                    cols = everything(),
                    names_to = c('Speaker', '.value'),
                    names_pattern = '([A-Z]+)_(.*)')
#  Speaker gazes_to     dur
#  <chr>   <chr>      <int>
#1 A       NA       1694924
#2 B       A         672705
#3 C       A        1214402
#4 A       B         608329
#5 B       NA       1965078
#6 C       B         178837
#7 A       C         474406
#8 B       C         126447
#9 C       NA       1384342
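
The output above has the right content but not the row order or column names of the expected result in the question. As a minimal sketch (the renaming and sorting steps are my own additions, not part of the answer, and the data frame is rebuilt here with data.frame() only to keep the example self-contained), the result could be brought into that shape like this:

```r
library(tidyr)
library(dplyr)

# Same values as the reproducible data in the question
ABC <- data.frame(
  A_gazes_to = c(NA, "B", "C"), A_dur = c(1694924L, 608329L, 474406L),
  B_gazes_to = c("A", NA, "C"), B_dur = c(672705L, 1965078L, 126447L),
  C_gazes_to = c("A", "B", NA), C_dur = c(1214402L, 178837L, 1384342L),
  stringsAsFactors = FALSE
)

result <- ABC %>%
  # split "A_gazes_to" into Speaker = "A" and output column "gazes_to"
  pivot_longer(cols = everything(),
               names_to = c("Speaker", ".value"),
               names_pattern = "([A-Z]+)_(.*)") %>%
  # sort by speaker; rows within a speaker keep their original order
  arrange(Speaker) %>%
  # reorder and rename the columns to match the expected result
  select(Speaker, Duration = dur, Gazes_to = gazes_to)

result
```

The pattern '([A-Z]+)_(.*)' assumes every column name starts with an upper-case speaker label followed by an underscore; columns named differently would need to be excluded from cols or matched by an adjusted pattern.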