R - Creating dyadic (relational) data from monadic data



I have conflict data that looks like this:

conflict_ID country_code   SideA
1              1             1          
1              2             1 
1              3             0
2              4             1
2              5             0 

Now I want to turn it into dyadic conflict data like this (countries with SideA = 1 should go in country_code_1):

conflict_ID country_code_1 country_code_2 
1              1             3          
1              2             3 
2              4             5

Can someone point me in the right direction?

Here is a straightforward approach:

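library(dplyr)

# Side A countries become country_code_1; each is joined to the Side B
# countries of the same conflict, which become country_code_2
df %>%
  filter(SideA == 1) %>%
  select(conflict_ID, country_code_1 = country_code) %>%
  left_join(
    df %>%
      filter(SideA == 0) %>%
      select(conflict_ID, country_code_2 = country_code),
    by = "conflict_ID"
  )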
#   conflict_ID country_code_1 country_code_2
# 1           1              1              3
# 2           1              2              3
# 3           2              4              5
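
If you would rather avoid the dplyr dependency, the same self-join idea can be sketched in base R with `merge()`. This is only an illustration, not part of the answer above: the helper names `side_a`, `side_b` and `dyads` are made up for the example, and the data frame is rebuilt inline so the snippet runs on its own.

```r
# Sample data from the question, rebuilt so the snippet is self-contained
df <- data.frame(
  conflict_ID  = c(1, 1, 1, 2, 2),
  country_code = c(1, 2, 3, 4, 5),
  SideA        = c(1, 1, 0, 1, 0)
)

# Split into Side A and Side B, keeping only the join key and the code
side_a <- df[df$SideA == 1, c("conflict_ID", "country_code")]
side_b <- df[df$SideA == 0, c("conflict_ID", "country_code")]
names(side_a)[2] <- "country_code_1"
names(side_b)[2] <- "country_code_2"

# all.x = TRUE mirrors left_join(): Side A rows are kept even if the
# conflict has no Side B country
dyads <- merge(side_a, side_b, by = "conflict_ID", all.x = TRUE)
dyads
```

On the sample data this should give the same three dyads as the dplyr version, though the row order within a conflict may differ.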

Using this data:

df <- read.table(text = 'conflict_ID country_code   SideA
1              1             1
1              2             1
1              3             0
2              4             1
2              5             0', header = TRUE)

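This builds on the previous question you posted. You can generate all pairwise combinations of countries for each conflict_ID, then filter out the combinations where country_code_2 matches a country_code with SideA == 1.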

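library(dplyr)
library(tidyr)

# df is the data frame from the question.
# Note: in dplyr >= 1.1.0, summarise() returning several rows per group
# is deprecated; reframe() is the suggested replacement.
df %>%
  group_by(conflict_ID) %>%
  # every sorted pair of countries within a conflict, as a list column
  summarise(country_code = combn(country_code, 2, sort, simplify = FALSE),
            .groups = 'drop') %>%
  # spread each pair into country_code_1 and country_code_2
  unnest_wider(country_code, names_sep = '_') %>%
  # drop pairs whose second country is on Side A
  anti_join(filter(df, SideA == 1),
            by = c("conflict_ID", "country_code_2" = "country_code"))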
# # A tibble: 3 × 3
#   conflict_ID country_code_1 country_code_2
#         <int>          <int>          <int>
# 1           1              1              3
# 2           1              2              3
# 3           2              4              5
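
To see what the combination step produces before the filter is applied, here is a small base R sketch for conflict 1 only. The vectors `codes` and `side_a` are typed in by hand from the sample data, and the variable names are just for illustration.

```r
# Countries in conflict 1 of the question's sample data
codes  <- c(1, 2, 3)
side_a <- c(1, 2)  # codes with SideA == 1 (entered by hand for this sketch)

# All unordered pairs, each sorted ascending:
# a list holding c(1, 2), c(1, 3) and c(2, 3)
pairs <- combn(codes, 2, sort, simplify = FALSE)

# Keep only pairs whose second element is not a Side A country
keep  <- Filter(function(p) !(p[2] %in% side_a), pairs)

# Bind the surviving pairs into a two-column matrix
dyads <- do.call(rbind, keep)
colnames(dyads) <- c("country_code_1", "country_code_2")
dyads
```

Because each pair is sorted, this filter seems to rely on the layout of the sample data, where every SideA == 1 code is smaller than the single SideA == 0 code in its conflict. If your real country codes are not ordered that way, the join-based approach is likely the safer choice.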
