我有两个数据集要合并到R中,但我似乎无法实现。请参阅下面的数据集1(技术(和数据集2(选项(。数据集1需要将额外数据作为列添加到数据集2(例如选项1成本、选项1排放量等(。
但是,合并不是一对一的,因为不是数据集1中的所有行都需要数据集2中的数据。数据集1中的是/否值表明了这一点。
Dataset1
Technology Level Costs Emissions Investment Option 1 Option 2
Technology1 Low 0.9 0.3 0.4 Yes No
Technology1 Medium 0.1 0.8 0.0 Yes No
Technology1 High 0.5 0.6 0.7 Yes No
Technology2 Low 1.0 0.6 0.2 Yes Yes
Technology2 Medium 0.3 0.1 0.2 Yes Yes
Technology3 High 0.7 0.8 0.5 Yes Yes
Dataset2
Option Detail Costs Emissions Investment
Option1 Low 0.7 0.6 0.7
Option1 Medium 0.8 0.9 0.1
Option1 High 0.7 0.6 0.2
Option2 Low 0.8 0.8 0.7
Option2 Medium 0.2 0.6 0.5
Option2 High 0.1 0.8 0.0
Final result:
Technology Level Costs Emissions Investment Option 1 costs Option 1 Emissions Option 1 investment Option 2 costs Option 2 Emissions Option 2 investment
Technology 1 Low 0.5 0.3 0.2 0.5 0.8 0.1 0.0 0.0 0.0
Technology 1 Medium 0.2 1.0 0.1 0.2 0.7 0.6 0.0 0.0 0.0
Technology 1 High 0.5 0.2 0.1 1.0 0.4 0.3 0.0 0.0 0.0
Technology 2 Low 0.7 1.0 1.0 0.8 0.9 0.5 0.2 0.2 0.6
Technology 2 Medium 0.8 0.8 0.1 0.1 0.4 1.0 0.6 0.9 0.7
Technology 3 High 0.8 1.0 0.5 0.6 1.0 0.1 0.4 0.9 0.5
看来if/else在这里行得通,但我似乎不明白if-else在这儿行得通。
谢谢你的帮助!
代码
Dataset1 %>%
mutate(help1 = ifelse(`Option 1` == "Yes", "Option1", NA_character_),
help2 = ifelse(`Option 2` == "Yes", "Option2", NA_character_)) %>%
left_join(Dataset2, by = c("help1" = "Option", "Level" = "Detail"),
suffix = c("","_Option1")) %>%
left_join(Dataset2, by = c("help2" = "Option", "Level" = "Detail"),
suffix = c("","_Option2")) %>%
select(-help1, -help2)
输出
Technology Level Costs Emissions Investment Option 1 Option 2 Costs_Option1 Emissions_Option1
1 Technology1 Low 0.9 0.3 0.4 Yes No 0.7 0.6
2 Technology1 Medium 0.1 0.8 0.0 Yes No 0.8 0.9
3 Technology1 High 0.5 0.6 0.7 Yes No 0.7 0.6
4 Technology2 Low 1.0 0.6 0.2 Yes Yes 0.7 0.6
5 Technology2 Medium 0.3 0.1 0.2 Yes Yes 0.8 0.9
6 Technology3 High 0.7 0.8 0.5 Yes Yes 0.7 0.6
Investment_Option1 Costs_Option2 Emissions_Option2 Investment_Option2
1 0.7 NA NA NA
2 0.1 NA NA NA
3 0.2 NA NA NA
4 0.7 0.8 0.8 0.7
5 0.1 0.2 0.6 0.5
6 0.2 0.1 0.8 0.0
数据集
Dataset1 <- structure(list(Technology = structure(c(1L, 1L, 1L, 2L, 2L, 3L
), .Label = c("Technology1", "Technology2", "Technology3"), class = "factor"),
Level = structure(c(2L, 3L, 1L, 2L, 3L, 1L), .Label = c("High",
"Low", "Medium"), class = "factor"), Costs = c(0.9, 0.1,
0.5, 1, 0.3, 0.7), Emissions = c(0.3, 0.8, 0.6, 0.6, 0.1,
0.8), Investment = c(0.4, 0, 0.7, 0.2, 0.2, 0.5), `Option 1` = structure(c(1L,
1L, 1L, 1L, 1L, 1L), .Label = "Yes", class = "factor"), `Option 2` = structure(c(1L,
1L, 1L, 2L, 2L, 2L), .Label = c("No", "Yes"), class = "factor")), class = "data.frame", row.names = c(NA,
-6L))
Dataset2 <- structure(list(Option = structure(c(1L, 1L, 1L, 2L, 2L, 2L), .Label = c("Option1",
"Option2"), class = "factor"), Detail = structure(c(2L, 3L, 1L,
2L, 3L, 1L), .Label = c("High", "Low", "Medium"), class = "factor"),
Costs = c(0.7, 0.8, 0.7, 0.8, 0.2, 0.1), Emissions = c(0.6,
0.9, 0.6, 0.8, 0.6, 0.8), Investment = c(0.7, 0.1, 0.2, 0.7,
0.5, 0)), class = "data.frame", row.names = c(NA, -6L))