I have conflict data that looks like this:
conflict_ID country_code SideA
1 1 1
1 2 1
1 3 0
2 4 1
2 5 0
Now I want to turn it into dyadic conflict data like this (the country with SideA = 1 should be country_code_1):
conflict_ID country_code_1 country_code_2
1 1 3
1 2 3
2 4 5
Can anyone point me in the right direction?
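Here is a straightforward approach: take the SideA == 1 rows and join them to the SideA == 0 rows of the same conflict.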
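library(dplyr)

# Countries on side A (SideA == 1) become country_code_1
df %>%
  filter(SideA == 1) %>%
  select(conflict_ID, country_code_1 = country_code) %>%
  # Attach the opposing side (SideA == 0) of the same conflict as country_code_2
  left_join(
    df %>%
      filter(SideA == 0) %>%
      select(conflict_ID, country_code_2 = country_code),
    by = "conflict_ID"
  )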
# conflict_ID country_code_1 country_code_2
# 1 1 1 3
# 2 1 2 3
# 3 2 4 5
Using this data:
df <- read.table(text = 'conflict_ID country_code SideA
1 1 1
1 2 1
1 3 0
2 4 1
2 5 0', header = TRUE)
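For anyone who prefers not to load dplyr, the same self-join can be sketched in base R with `merge()`. This is only an illustration of the idea above, not part of the original answer; the intermediate names `side_a`, `side_b`, and `dyads` are made up for this example, and `all.x = TRUE` is used on the assumption that you want to mimic `left_join()` (keeping side A countries even when a conflict has no side B row).

```r
df <- read.table(text = "conflict_ID country_code SideA
1 1 1
1 2 1
1 3 0
2 4 1
2 5 0", header = TRUE)

# Split the data by side and rename the country column accordingly
side_a <- df[df$SideA == 1, c("conflict_ID", "country_code")]
side_b <- df[df$SideA == 0, c("conflict_ID", "country_code")]
names(side_a)[2] <- "country_code_1"
names(side_b)[2] <- "country_code_2"

# all.x = TRUE keeps every side A row, like left_join()
dyads <- merge(side_a, side_b, by = "conflict_ID", all.x = TRUE)

# Make the row order explicit rather than relying on merge()'s ordering
dyads <- dyads[order(dyads$conflict_ID, dyads$country_code_1), ]
rownames(dyads) <- NULL
print(dyads)
```

If a conflict has several SideA == 0 countries, both this sketch and the dplyr version produce one row per side A / side B pairing.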
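This builds on the previous question you posted. You can generate all pairwise combinations of countries for each conflict_ID, then filter out the combinations whose country_code_2 matches a country_code that has SideA == 1.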
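library(dplyr)
library(tidyr)

# mydf is the data frame from the question
mydf %>%
  group_by(conflict_ID) %>%
  # All sorted pairs of countries within each conflict, as a list column.
  # Note: dplyr >= 1.1.0 deprecates multi-row summarise() results in favour of reframe().
  summarise(country_code = combn(country_code, 2, sort, simplify = FALSE),
            .groups = 'drop') %>%
  # Spread each pair into country_code_1 and country_code_2
  unnest_wider(country_code, names_sep = '_') %>%
  # Drop pairs whose second member is itself on side A
  anti_join(filter(mydf, SideA == 1),
            by = c("conflict_ID", "country_code_2" = "country_code"))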
# # A tibble: 3 × 3
# conflict_ID country_code_1 country_code_2
# <int> <int> <int>
# 1 1 1 3
# 2 1 2 3
# 3 2 4 5
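One thing that may be worth checking before using the combination approach on real data: because each pair is sorted, it appears to rely on every SideA == 1 code being smaller than the SideA == 0 code within a conflict, and on there being a single SideA == 0 country per conflict (otherwise two side B countries would be paired with each other). A minimal base R sketch of such a check follows; `check_combn_assumptions()` is a hypothetical helper written for this example, not a dplyr or tidyr function.

```r
mydf <- read.table(text = "conflict_ID country_code SideA
1 1 1
1 2 1
1 3 0
2 4 1
2 5 0", header = TRUE)

# Hypothetical helper: for each conflict, TRUE when there is exactly one
# SideA == 0 country and all SideA == 1 codes are smaller than it.
check_combn_assumptions <- function(d) {
  sapply(split(d, d$conflict_ID), function(g) {
    a <- g$country_code[g$SideA == 1]
    b <- g$country_code[g$SideA == 0]
    length(b) == 1 && length(a) >= 1 && max(a) < b
  })
}

# Named logical vector, one entry per conflict_ID
print(check_combn_assumptions(mydf))

# A made-up conflict where the side B code is the smaller one
flipped <- data.frame(conflict_ID = c(3, 3),
                      country_code = c(7, 6),
                      SideA = c(1, 0))
print(check_combn_assumptions(flipped))
```

For conflicts that fail this check, the join on SideA shown in the other answer does not depend on the ordering of the codes and may be the safer choice.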