How to replace multiple values of a character variable with numbers in R, based on their starting letter



This is my dataset, with the following values:

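library(data.table)  # provides as.data.table()
library(dplyr)       # provides %>%, mutate() and case_when()
df <- as.data.table(c("hello", "name", "age", "hey", "apron", "street", "night", "soap"))
colnames(df) <- "V1"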
Output:
V1
1  2
2  4
3  1
4  2
5  1
6  3
7  4
8  3

This works for only one letter at a time, i.e. a, h, s, or n:

# Write the codes to a new column V2 so that V1 is kept, as in the output below
df %>%
  mutate(V2 = case_when(startsWith(V1, "a") ~ '1',
                        startsWith(V1, "h") ~ '2',
                        startsWith(V1, "s") ~ '3',
                        startsWith(V1, "n") ~ '4'))
V1 V2
1  hello  2
2   name  4
3    age  1
4    hey  2
5  apron  1
6 street  3
7  night  4
8   soap  3

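I want to replace multiple values at once. For example, I want to replace every word whose first letter is in the range a-h with 1. Here I get NA values instead: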

df %>%
  mutate(V2 = case_when(startsWith(df$V1, letters[1:8]) == TRUE ~ '1',
                        startsWith(df$V1, "s") == TRUE ~ '3',
                        startsWith(df$V1, "n") == TRUE ~ '4'))
V1   V2
1  hello <NA>
2   name    4
3    age <NA>
4    hey <NA>
5  apron <NA>
6 street    3
7  night    4
8   soap    3
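
The NA values appear to come from how `startsWith()` treats a vector of prefixes: it pairs each word with the prefix in the same position (recycling if needed) instead of testing each word against every prefix. The base R sketch below uses illustrative variable names (`words`, `elementwise`, `in_range`) that are not from the original post. It shows that no word starts with the letter in its own position, so the first `case_when()` branch never matches.

```r
# Same words as in the question, as a plain character vector
words <- c("hello", "name", "age", "hey", "apron", "street", "night", "soap")

# startsWith() pairs words[i] with letters[i]: "hello" vs "a", "name" vs "b", ...
elementwise <- startsWith(words, letters[1:8])
print(elementwise)

# Checking the first letter for membership in a:h tests every letter for each word
in_range <- substr(words, 1, 1) %in% letters[1:8]
print(in_range)
```

Rows that match neither this branch nor the "s" and "n" branches fall through `case_when()` without a match, which is why they come back as NA.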
library(dplyr)
df %>%
  mutate(V2 = case_when(substr(V1, 1, 1) %in% letters[1:8] ~ "1",
                        substr(V1, 1, 1) == "s" ~ "3",
                        substr(V1, 1, 1) == "n" ~ "4"))

Using a regex approach:

df %>%
  mutate(V2 = case_when(grepl("^[a-h]", V1) ~ "1",
                        grepl("^s", V1) ~ "3",
                        grepl("^n", V1) ~ "4"))
V1 V2
1  hello  1
2   name  4
3    age  1
4    hey  1
5  apron  1
6 street  3
7  night  4
8   soap  3
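
As a cross-check that does not depend on dplyr, the same mapping can be sketched in base R. The function name `code_by_first_letter` is hypothetical, and returning NA for words outside the three rules is an assumption that mirrors `case_when()` without a default branch.

```r
# Hypothetical helper: map each word to a code based on its first letter
code_by_first_letter <- function(x) {
  first <- substr(x, 1, 1)
  out <- rep(NA_character_, length(x))  # words matching no rule stay NA
  out[first %in% letters[1:8]] <- "1"   # first letter a through h
  out[first == "s"] <- "3"
  out[first == "n"] <- "4"
  out
}

words <- c("hello", "name", "age", "hey", "apron", "street", "night", "soap")
codes <- code_by_first_letter(words)
print(data.frame(V1 = words, V2 = codes))
```

Because the three letter sets do not overlap, the order of the assignments does not matter here. With overlapping rules, `case_when()` keeps the first match, whereas this sketch would keep the last one.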
