给定以下数据:
data <- data.frame("Street" = c("Example1", "Example2", "Example3", "Example4"),
"Number" = c("10-20a", "4b-8", NA, "14-16"))
Street Number
1 Example1 10-20a
2 Example2 4b-8
3 Example3 <NA>
4 Example4 14-16
我如何创建一个数据帧,每个单独的地址看起来像这样:
data_long <- data.frame("Street" = c(rep("Example1", 11), rep("Example2", 5),
"Example3", rep("Example4", 3)),
"Number" = c(10:19,"20a", "4b", 5:8, NA, c(14:16)))
Street Number
1 Example1 10
2 Example1 11
3 Example1 12
4 Example1 13
5 Example1 14
6 Example1 15
7 Example1 16
8 Example1 17
9 Example1 18
10 Example1 19
11 Example1 20a
12 Example2 4b
13 Example2 5
14 Example2 6
15 Example2 7
16 Example2 8
17 Example3 <NA>
18 Example4 14
19 Example4 15
20 Example4 16
我尝试将数字和字母分开,然后使用rowwise
和mutate
,但它不起作用,因为然后字母在每个地址旁边。
我们将非na元素filter
,提取一个或多个数字(d+)与字母([a-z])作为mutate中的'v1'列,循环使用map2之间提取的数字,获得序列(:)作为列表列,unnest (from tidyr)列表列扩展数据并根据case_when
中的'v1'列替换'Number'中的值并与NA
行绑定
library(dplyr)
library(stringr)
library(tidyr)
library(purrr)
data %>%
filter(complete.cases(Number)) %>%
mutate(v1 = str_extract(Number, "\d+[a-z]"),
Number = map2(readr::parse_number(Number),
as.numeric(str_extract(Number, "-(\d+)", group = 1)), `:`)) %>%
unnest(Number) %>%
mutate(Number = case_when(str_detect(v1, as.character(Number))
~ v1,
TRUE ~ as.character(Number)), v1 = NULL) %>%
bind_rows(data %>%
filter(is.na(Number))) %>%
arrange(Street)
与产出
# A tibble: 20 × 2
Street Number
<chr> <chr>
1 Example1 10
2 Example1 11
3 Example1 12
4 Example1 13
5 Example1 14
6 Example1 15
7 Example1 16
8 Example1 17
9 Example1 18
10 Example1 19
11 Example1 20a
12 Example2 4b
13 Example2 5
14 Example2 6
15 Example2 7
16 Example2 8
17 Example3 <NA>
18 Example4 14
19 Example4 15
20 Example4 16
这是在@akrun的帮助下生成的另一种方法(这里:如何展开行并在给定的开始和结束之间填充数字):
library(dplyr)
library(tidyr)
library(readr)
data %>%
separate_rows(Number, sep = "-") %>%
group_by(Street) %>%
mutate(Number1 = ifelse(is.na(Number), 0, parse_number(Number))) %>%
tidyr::complete(Number1 = full_seq(Number1, period = 1)) %>%
mutate(Number = coalesce(Number, as.character(Number1)), .keep="unused")
Street Number
<chr> <chr>
1 Example1 10
2 Example1 11
3 Example1 12
4 Example1 13
5 Example1 14
6 Example1 15
7 Example1 16
8 Example1 17
9 Example1 18
10 Example1 19
11 Example1 20a
12 Example2 4b
13 Example2 5
14 Example2 6
15 Example2 7
16 Example2 8
17 Example3 0
18 Example4 14
19 Example4 15
20 Example4 16