R - Conditionally aggregate data frame rows that do not contain specific values



In df, I want to keep only the rows whose intersect_street matches a street name contained in streets, while adding the intersection_distance_meters of each removed row to the kept row above it.

df

> streets
[1] "FRONT ST" "2ND ST"   "3RD ST"   "4TH ST"  
> df
              intersection segment_key intersection_distance_meters intersect_street
1       ARCH ST & FRONT ST         1EW                           81         FRONT ST
2     ARCH ST & MASCHER ST         2EW                           60       MASCHER ST
3         ARCH ST & 2ND ST         3EW                           57           2ND ST
4 ARCH ST & LITTLE BOYS CT         4EW                           28   LITTLE BOYS CT
5       ARCH ST & BREAD ST         5EW                           83         BREAD ST
6         ARCH ST & 3RD ST         6EW                          135           3RD ST
7         ARCH ST & 4TH ST         7EW                          144           4TH ST

Desired output

              intersection segment_key intersection_distance_meters intersect_street
1       ARCH ST & FRONT ST         1EW                          141         FRONT ST
2         ARCH ST & 2ND ST         3EW                          168           2ND ST
3         ARCH ST & 3RD ST         6EW                          135           3RD ST
4         ARCH ST & 4TH ST         7EW                          144           4TH ST

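I have been using dplyr's lead() to add the next row's intersect_street and intersection_distance_meters as new columns and then summing them conditionally, but I run into trouble when several non-major intersections occur in a row (for example, rows 4 and 5 above).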

Data

df <- structure(list(intersection = c("ARCH ST & FRONT ST", "ARCH ST & MASCHER ST", 
"ARCH ST & 2ND ST", "ARCH ST & LITTLE BOYS CT", "ARCH ST & BREAD ST", 
"ARCH ST & 3RD ST", "ARCH ST & 4TH ST"), segment_key = c("1EW", 
"2EW", "3EW", "4EW", "5EW", "6EW", "7EW"), intersection_distance_meters = c(81, 
60, 57, 28, 83, 135, 144), intersect_street = c("FRONT ST", "MASCHER ST", 
"2ND ST", "LITTLE BOYS CT", "BREAD ST", "3RD ST", "4TH ST")), row.names = c(NA, 
7L), class = "data.frame")
streets <- c("FRONT ST", "2ND ST", "3RD ST", "4TH ST")

I think this is what you want. I created a few extra helper columns and left them in so that the logic is clear.

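library(dplyr)

df %>%
  mutate(
    keep = intersect_street %in% streets,  # TRUE where the street is in streets
    grouper = cumsum(keep)                 # each kept row starts a new group
  ) %>%
  group_by(grouper) %>%
  # total distance of the kept row plus the dropped rows that follow it
  mutate(total_intersection_dist = sum(intersection_distance_meters)) %>%
  slice(1)  # keep only the first (kept) row of each group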
# # A tibble: 4 x 7
# # Groups:   grouper [4]
#   intersection       segment_key intersection_distance_met~ intersect_street keep  grouper total_intersection_di~
#   <chr>              <chr>                            <dbl> <chr>            <lgl>   <int>                  <dbl>
# 1 ARCH ST & FRONT ST 1EW                                 81 FRONT ST         TRUE        1                    141
# 2 ARCH ST & 2ND ST   3EW                                 57 2ND ST           TRUE        2                    168
# 3 ARCH ST & 3RD ST   6EW                                135 3RD ST           TRUE        3                    135
# 4 ARCH ST & 4TH ST   7EW                                144 4TH ST           TRUE        4                    144
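
To reproduce the desired output exactly, the helper columns can be dropped and the group total written back into intersection_distance_meters. The following is only a sketch of that variant, not part of the original answer. The names grp and result are made up for illustration, and the filter(grp > 0) step is an assumption on my part: it discards any rows that come before the first street found in streets, a case the question does not cover.

```r
library(dplyr)

# Same data as in the question, rebuilt so the example runs on its own
df <- data.frame(
  intersection = c("ARCH ST & FRONT ST", "ARCH ST & MASCHER ST",
                   "ARCH ST & 2ND ST", "ARCH ST & LITTLE BOYS CT",
                   "ARCH ST & BREAD ST", "ARCH ST & 3RD ST",
                   "ARCH ST & 4TH ST"),
  segment_key = c("1EW", "2EW", "3EW", "4EW", "5EW", "6EW", "7EW"),
  intersection_distance_meters = c(81, 60, 57, 28, 83, 135, 144),
  intersect_street = c("FRONT ST", "MASCHER ST", "2ND ST", "LITTLE BOYS CT",
                       "BREAD ST", "3RD ST", "4TH ST"),
  stringsAsFactors = FALSE
)
streets <- c("FRONT ST", "2ND ST", "3RD ST", "4TH ST")

result <- df %>%
  # grp increases by one at every row whose street is in streets
  mutate(grp = cumsum(intersect_street %in% streets)) %>%
  # assumption: rows before the first matching street (grp == 0) are dropped
  filter(grp > 0) %>%
  group_by(grp) %>%
  # overwrite the distance with the total for the whole group
  mutate(intersection_distance_meters = sum(intersection_distance_meters)) %>%
  slice(1) %>%       # keep the matching row that opens each group
  ungroup() %>%
  select(-grp)       # remove the hypothetical helper column

result$intersection_distance_meters
# expected: 141 168 135 144
```

Because cumsum() only increments on matching rows, any number of consecutive non-matching rows falls into the group of the last matching row above them, which is the case that lead() handles poorly.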
