I am trying to use R to replace text in one column based on two other columns in my data.
I have data with these columns:
Id City Street Street_Type Street_Category
1 Dallas State Route 315 Street Street
2 Dallas State Route 82 State Highways Street
3 SF State St Street Street
4 NY city Corss St Street Street
5 SD Steven Pkwy Street Street
6 LA Harlem Pkwy Parkway Parkway
I want my data to look like this:
Id City Street Street_Type Street_Category
1 Dallas State Route 315 Street State Highways
2 Dallas State Route 82 State Highways State Highways
3 SF State St Street Street
4 NY City Corss St Street Street
5 SD Steven Pkwy Street Parkway
6 LA Harlem Pkwy Parkway Parkway
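I want to change the existing column Street_Category as follows. If the Street column contains the text "State Route" and the Street_Type column contains the text "Street", replace the text in Street_Category with "State Highways". If the Street column contains the text "Pkwy" and the Street_Type column contains the text "Street", replace the text in Street_Category with "Parkway".
I have a large dataset with many different values that need to be replaced in the same way as in this example. How can I do this across the whole dataset? I also want the matching to be case-sensitive and specific. For example, I do not want the Street_Category of a street such as "State St" to change to "State Highways" just because it contains the word "State".
I used this code to create the Street_Type column, but it led to the wrong Street_Category classification: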
df$Street_Type <- g %>%
mutate(Street = case_when(
str_detect(Street,"St") ~ "Street",
str_detect(Street," State Route") ~ "State Highways",
str_detect(Street,"Route") ~ "State Highways",
str_detect(Street,"Pkwy") ~ "Parkway",
TRUE ~ "No type"
))
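But it gave me the first output shown above. I then tried this code, taken from an answer at this link, to replace values in a column based on two other columns: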
df[Street == " State Route" & Street_Type == "Street", Street_Category == "State Highways"]
df[Street == " Pkwy" & Street_Type == "Street", Street_Category == "Parkway"]
But I get this error message:
Error in `[.data.frame`(df, Street == " State Route" & Street_Type == :
object 'Street_Category' not found
What am I missing here? I would appreciate it if you could point out the mistake I am making.
You can try -
library(dplyr)
library(stringr)
df %>%
mutate(Street_Category = case_when(
str_detect(Street, 'State Route') & Street_Type == 'Street' ~ "State Highways",
str_detect(Street, 'Pkwy') & Street_Type == 'Street' ~ "Parkway",
TRUE ~ Street_Category))
# Id City Street Street_Type Street_Category
#1 1 Dallas State Route 315 Street State Highways
#2 2 Dallas State Route 82 State Highways Street
#3 3 SF State St Street Street
#4 4 NY city Corss St Street Street
#5 5 SD Steven Pkwy Street Parkway
#6 6 LA Harlem Pkwy Parkway Parkway
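As for why the attempt in the question fails, a likely reading is this: a plain data.frame does not look up bare column names such as Street or Street_Category inside `[ ]` (that style belongs to data.table), `==` compares values rather than assigning them, and `Street == " State Route"` tests for whole-string equality rather than for a substring. A minimal base R sketch of the same replacement, assuming the example data from the question (the helper names `is_route` and `is_pkwy` are made up for illustration):

```r
# Minimal copy of the example data (same values as in the question).
df <- data.frame(
  Id = 1:6,
  Street = c("State Route 315", "State Route 82", "State St",
             "Corss St", "Steven Pkwy", "Harlem Pkwy"),
  Street_Type = c("Street", "State Highways", "Street",
                  "Street", "Street", "Parkway"),
  Street_Category = c("Street", "Street", "Street",
                      "Street", "Street", "Parkway"),
  stringsAsFactors = FALSE
)

# grepl() tests for a substring; fixed = TRUE matches the text literally
# and case-sensitively. Columns are referenced explicitly with df$.
is_route <- grepl("State Route", df$Street, fixed = TRUE) & df$Street_Type == "Street"
is_pkwy  <- grepl("Pkwy", df$Street, fixed = TRUE) & df$Street_Type == "Street"

# Assign with <- to the selected rows only; all other rows keep their value.
df$Street_Category[is_route] <- "State Highways"
df$Street_Category[is_pkwy]  <- "Parkway"

df$Street_Category
```

Row 2 stays "Street" here, as in the dplyr output above, because its Street_Type is "State Highways" rather than "Street".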
It is easier to help if you provide the data in a reproducible format:
df <- structure(list(Id = 1:6, City = c("Dallas", "Dallas", "SF", "NY city",
"SD", "LA"), Street = c("State Route 315", "State Route 82",
"State St", "Corss St", "Steven Pkwy", "Harlem Pkwy"), Street_Type = c("Street",
"State Highways", "Street", "Street", "Street", "Parkway"), Street_Category = c("Street",
"Street", "Street", "Street", "Street", "Parkway")), row.names = c(NA, -6L), class = "data.frame")
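On the case-sensitivity point and the Street_Type code in the question: `case_when()` uses the first condition that matches, and `str_detect(Street, "St")` also matches "State Route", which is probably why those rows ended up as "Street". The sketch below, under the assumption that more specific patterns should win, puts them first and uses word boundaries for "St". The `streets` vector, including the lowercase "state route 9" entry, is made up for illustration.

```r
library(dplyr)
library(stringr)

# Hypothetical sample values; the last one is lowercase on purpose.
streets <- c("State Route 315", "State St", "Steven Pkwy", "state route 9")

street_type <- case_when(
  # Most specific pattern first; otherwise "St" would also catch "State Route".
  str_detect(streets, "State Route") ~ "State Highways",
  str_detect(streets, "Pkwy") ~ "Parkway",
  # \\b marks word boundaries, so "St" only matches as a whole word.
  str_detect(streets, "\\bSt\\b") ~ "Street",
  TRUE ~ "No type"
)
street_type

# str_detect() is case-sensitive by default. To opt in to case-insensitive
# matching, wrap the pattern in regex(..., ignore_case = TRUE).
any_case_route <- str_detect(streets, regex("state route", ignore_case = TRUE))
any_case_route
```

With the default case-sensitive matching, "state route 9" falls through to "No type", while the `regex(..., ignore_case = TRUE)` variant matches it.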