R: Remove all numeric characters from the right until a non-numeric character is reached

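I am trying to clean addresses in this format: 1804 E Osage Rd DERBY KS 670378863 or 55 Cabela Dr GARNER NC 27529. As shown, the ZIP code at the end of the address varies in length, and I want to remove that numeric part from the right side of the address. In Excel I could use =LEFT(A2, LEN(A2)-x), but that is still not good enough, because x is a fixed number and does not vary with the number of digit characters at the end of the string.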

How can I use R or a regex to remove all numeric characters from the right until a non-numeric character is reached?

The expected output looks like this:

raw_Address	clean_Address
1804 E Osage Rd DERBY KS 670378863	1804 E Osage Rd DERBY KS
55 Cabela Dr GARNER NC 27529	55 Cabela Dr GARNER NC

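We can use trimws from base R (the whitespace argument requires R 3.6.0 or later). The pattern matches one or more whitespace characters followed by one or more digits, which removes the digits at the right end. Note that backslashes must be doubled inside an R string literal.

df1$clean_Address <- trimws(df1$raw_Address, whitespace = "\\s+\\d+")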

-output

> df1
                         raw_Address            clean_Address
1 1804 E Osage Rd DERBY KS 670378863 1804 E Osage Rd DERBY KS
2       55 Cabela Dr GARNER NC 27529   55 Cabela Dr GARNER NC

data

df1 <- structure(list(raw_Address = c("1804 E Osage Rd DERBY KS 670378863", 
"55 Cabela Dr GARNER NC 27529")), row.names = c(NA, -2L), class = "data.frame")
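
For comparison, the same idea can also be sketched with base R's sub, anchoring the pattern at the end of the string with $ so that only a trailing run of digits is removed. This is a minimal sketch and not part of the original answer; the helper name strip_trailing_digits is hypothetical.

```r
# Hypothetical helper (not from the original answer): drop a trailing run of
# digits, plus any whitespace in front of it, from each string.
strip_trailing_digits <- function(x) {
  # \\s* : optional whitespace before the digits
  # \\d+ : one or more digits
  # $    : anchored at the end of the string
  sub("\\s*\\d+$", "", x)
}

raw_Address <- c("1804 E Osage Rd DERBY KS 670378863",
                 "55 Cabela Dr GARNER NC 27529")

strip_trailing_digits(raw_Address)
#> [1] "1804 E Osage Rd DERBY KS" "55 Cabela Dr GARNER NC"
```

Because the pattern is anchored at the end, the house number at the start of each address is left alone, and a string that does not end in digits is returned unchanged.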

Using {stringr}

raw_Address <- c("1804 E Osage Rd DERBY KS 670378863",
                 "55 Cabela Dr GARNER NC 27529")
library(stringr)
# replace a trailing whitespace character plus digits with an empty string
str_replace(raw_Address, "\\s\\d+$", "")
#or even more simply
str_remove(raw_Address, "\\s\\d+$")
#> [1] "1804 E Osage Rd DERBY KS" "55 Cabela Dr GARNER NC"

Created on 2022-03-18 by the reprex package (v2.0.1)
