r - Dictionary-style conditional lookup between two data frames



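I have two data frames. DF1 is a list of homicides, each row with a date and a location. DF2 consists of the set of shared locations mentioned in DF1.

DF2 contains the latitude and longitude of each unique location, and I want to pull these out. Note: because DF2 holds shared locations, one of its rows may correspond to several homicides in DF1, so the two data frames differ in length.

Where a location in DF2 equals a location in DF1, I want to create latitude and longitude variables in DF1 (assuming location names match exactly between the two data frames). How can I extract from DF2 the latitude and longitude of the location that corresponds to a given homicide record in DF1?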

Small reproducible example:

DF1 (incident data frame):

| Incident  | Place  |
| --------  | -------|
| Incident 1| Place 1|
| Incident 2| Place 2|
| Incident 3| Place 2|
| Incident 4| Place 3|
| Incident 5| Place 1|
| Incident 6| Place 3|
| Incident 7| Place 2|

DF2 (dictionary-style lat/long lookup):

| Place  |Latitude |Longitude |
| -------| ------- | ---------|
| Place 1| A       | B        |
| Place 2| C       | D        |
| Place 3| E       | F        |
| Place 4| G       | H        |

DF3 (what I want):

| Incident | Latitude | Longitude |
| -------- | -------- | --------- |
|Incident 1| A        | B         |
|Incident 2| C        | D         |
|Incident 3| C        | D         |
|Incident 4| E        | F         |
|Incident 5| A        | B         |
|Incident 6| E        | F         |
|Incident 7| C        | D         |

I tried:

DF1$latitude <- DF2$latitude[which(DF2$location == DF1$location), ]

It returned the following error:

Error in DF2$latitude[which(DF2$location == DF1$location), ] : 
incorrect number of dimensions
In addition: Warning message:
In DF2$location == DF1$location :
longer object length is not a multiple of shorter object length

In response to a suggestion in the comments, I also tried:

DF2$latitude[which(DF2$location == DF1$location)]

However, I got the error:

Error in `$<-.data.frame`(`*tmp*`, latitude, value = numeric(0)) : 
replacement has 0 rows, data has 1220
In addition: Warning message:
In DF1$location == DF2$location :
longer object length is not a multiple of shorter object length

You can try dplyr's left_join(). The code below keeps every row of DF1 and, wherever a match is found on location, adds the variables from DF2.

library(dplyr)
DF3 <- left_join(DF1, DF2, by = "location")
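If dplyr is not available, the same lookup can be sketched in base R with match(). The attempts in the question fail because `==` compares the two location vectors element by element (recycling the shorter one, hence the warning) rather than looking each DF1 location up in DF2. The sketch below rebuilds the small example from the question; the column names `Incident`, `Place`, `Latitude`, and `Longitude` come from the example tables and are an assumption about the real data, which appears to use `location` and `latitude`.

```r
# Example data from the question (column names follow the example tables;
# the real data frames may use different names such as `location`)
DF1 <- data.frame(
  Incident = paste("Incident", 1:7),
  Place    = paste("Place", c(1, 2, 2, 3, 1, 3, 2)),
  stringsAsFactors = FALSE
)
DF2 <- data.frame(
  Place     = paste("Place", 1:4),
  Latitude  = c("A", "C", "E", "G"),
  Longitude = c("B", "D", "F", "H"),
  stringsAsFactors = FALSE
)

# match() returns, for each row of DF1, the row index of its place in DF2
# (NA where a place has no entry in DF2)
idx <- match(DF1$Place, DF2$Place)

# Use that index to pull the coordinates across, one per incident
DF3 <- data.frame(
  Incident  = DF1$Incident,
  Latitude  = DF2$Latitude[idx],
  Longitude = DF2$Longitude[idx],
  stringsAsFactors = FALSE
)
```

Like left_join(), this keeps all rows of DF1. Any incident whose place is missing from DF2 gets NA coordinates instead of raising an error.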
