For the following data frame:
> df <- data.frame(Country = c("Republic of Ireland", "United Kingdom", "United States of America"))
# Country
# <chr>
# Republic of Ireland
# United Kingdom
# United States of America
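Is there a way to change the country names using a function (tidyverse style)? I would also like to be able to refer to a specific column of the data frame.
This is what I have done so far: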
# c("Old name", "new name")
name_change = list(c("Republic of Ireland", "Ireland"),
c("United Kingdom", "UK"),
c("Russia Moscow", "Russia"),
c("United States of America", "USA"))
name_change_func <- function(vec, data = c2, df_col = Country){
# Expecting vec c("Old name", "new name")
old_n <- vec[1]
new_n <- vec[2]
data %>%
mutate(!!df_col = gsub(old_n, new_n, !!df_col ))
}
map_df(name_change, ~name_change_func(.x)) %>%
group_by(Country) %>%
filter(row_number(Country) == 1)
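This doesn't work, but if we change !!df_col directly to Country, it works (sort of: we get duplicate names that need to be filtered out, and we aren't really changing the names so much as adding rows).
Is there a way to fix this, so that a function argument can be used as a column inside the function?
Bonus points if you know a better solution.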
You can use a named vector for the replacements and pass it to str_replace_all.
library(dplyr)
library(stringr)
#c("Old name" = "new name")
name_change = c("Republic of Ireland" = "Ireland",
"United Kingdom" = "UK",
"Russia Moscow" = "Russia",
"United States of America" = "USA")
df %>% mutate(new_country = str_replace_all(Country, name_change))
# Country new_country
#1 Republic of Ireland Ireland
#2 United Kingdom UK
#3 United States of America USA
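On the part of the question about using a function argument as a column: the expression !!df_col = ... fails because R's parser does not accept !! on the left-hand side of =. Below is a minimal sketch of one way around this, assuming dplyr 1.0 or later, where the embrace operator {{ }} and := are available. The helper name rename_values and its arguments are hypothetical and do not come from the original posts.

```r
library(dplyr)
library(stringr)

# Hypothetical helper (not from the original posts).
# `col` is passed unquoted; {{ col }} forwards it into mutate(),
# and := is needed because the left-hand side is not a plain name.
rename_values <- function(data, col, lookup) {
  data %>%
    mutate({{ col }} := str_replace_all({{ col }}, lookup))
}

df <- data.frame(Country = c("Republic of Ireland",
                             "United Kingdom",
                             "United States of America"))

# c("Old name" = "new name"); names are treated as regular expressions
name_change <- c("Republic of Ireland"      = "Ireland",
                 "United Kingdom"           = "UK",
                 "Russia Moscow"            = "Russia",
                 "United States of America" = "USA")

result <- rename_values(df, Country, name_change)
print(result)
```

This overwrites the column in place rather than adding rows, so the group_by/filter step from the question is no longer needed.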
An alternative is case_when from the tidyverse.
library(dplyr)
df <- data.frame(Country = c("Republic of Ireland", "United Kingdom", "United States of America"))
df <-
df %>%
dplyr::mutate(NewCountry =
case_when(
Country == "Republic of Ireland" ~ "Ireland",
Country == "United States of America" ~ "US",
Country == "United Kingdom" ~ "UK",
Country == "Russia Moscow" ~ "Russia"
)
)
# Country NewCountry
# 1 Republic of Ireland Ireland
# 2 United Kingdom UK
# 3 United States of America US
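One thing to watch with case_when: a row that matches none of the conditions gets NA. A minimal sketch of a fallback that keeps the original name instead, assuming Country is a character column (the default for data.frame in R 4.0 and later). The "France" row and the name df2 are illustrative additions, not part of the original data.

```r
library(dplyr)

# "France" is an illustrative value with no matching rule
df2 <- data.frame(Country = c("Republic of Ireland",
                              "United Kingdom",
                              "France"))

df2 <- df2 %>%
  mutate(NewCountry = case_when(
    Country == "Republic of Ireland" ~ "Ireland",
    Country == "United Kingdom"      ~ "UK",
    TRUE                             ~ Country  # fallback: keep the original name
  ))

print(df2)
```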