Converting code from base R to dplyr, specifically adding a variable



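I'm trying to add a new variable called "state_color" to the dataset "birth_data". I'd like to simplify my code with dplyr, but I'm not sure how to convert it. I know that in base R it looks like this: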

red <- c("AK","ID","KS","NE","ND","OK","UT","WY","TX","AL","MS","SC","MT","GA","MO","LA","TN","AK","KY","AZ","WV") 
purple <- c("NC","VA","FL","OH","CO","NV","IN","IA","NM")
blue <- c("NH","PA","CA","MI","IL","MA","DE","NJ","CT","VT","ME","WA","OR","WI","NY","MA","RI","HI","MN","DC")
#assigning states to their respective color
birth_data$state_color[birth_data$state %in% red] <- "red"
birth_data$state_color[birth_data$state %in% purple] <- "purple"
birth_data$state_color[birth_data$state %in% blue] <- "blue"
head(birth_data)

I tried to do the same thing with dplyr:

red <- c("AK","ID","KS","NE","ND","OK","UT","WY","TX","AL","MS","SC","MT","GA","MO","LA","TN","AK","KY","AZ","WV") 
purple <- c("NC","VA","FL","OH","CO","NV","IN","IA","NM")
blue <- c("NH","PA","CA","MI","IL","MA","DE","NJ","CT","VT","ME","WA","OR","WI","NY","MA","RI","HI","MN","DC")
#assigning states to their respective color
birth_data %>%
mutate(state_color <- c("red","purple","blue"))

But then I get this error:

Error: Column `state_color <- c("red", "purple", "blue")` must be length 1103629 (the number of rows) or one, not 3

What am I doing wrong?

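Start with the dataset, then use mutate to create the new column, and use case_when inside it. Add a final TRUE condition as the fallback for rows where none of the other cases match.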

red <- c("AK","ID","KS","NE","ND","OK","UT","WY","TX","AL","MS","SC","MT","GA","MO","LA","TN","AK","KY","AZ","WV") 
purple <- c("NC","VA","FL","OH","CO","NV","IN","IA","NM")
blue <- c("NH","PA","CA","MI","IL","MA","DE","NJ","CT","VT","ME","WA","OR","WI","NY","MA","RI","HI","MN","DC")
birth_data %>%
  mutate(state_color = case_when(
    state %in% red    ~ "red",
    state %in% purple ~ "purple",
    state %in% blue   ~ "blue",
    TRUE              ~ "no color"
  ))
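
For reference, here is a minimal self-contained sketch of the same case_when approach. The data frame `toy` and the shortened lookup vectors are made up for illustration, since birth_data itself isn't shown in the question. It also assigns the result back, because mutate() returns a new data frame rather than modifying the original.

```r
library(dplyr)

# Hypothetical stand-in for birth_data; only the state column matters here
toy <- data.frame(state = c("TX", "FL", "CA", "PR"), stringsAsFactors = FALSE)

# Shortened lookup vectors, for illustration only
red    <- c("TX", "AL")
purple <- c("FL", "OH")
blue   <- c("CA", "NY")

toy <- toy %>%
  mutate(state_color = case_when(
    state %in% red    ~ "red",
    state %in% purple ~ "purple",
    state %in% blue   ~ "blue",
    TRUE              ~ "no color"  # fallback when no condition matches
  ))
```

As for the original attempt: inside mutate() a new column is named with `=`, not `<-`, and the value must have length 1 or one element per row, which is why a length-3 vector is rejected.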

Try this:

birth_data %>% mutate(state_color = if_else(state %in% red, "red", if_else(state %in% purple, "purple", "blue")))
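
One thing to watch with the nested if_else form: any state that is in neither red nor purple ends up as "blue", including states that appear in none of the lists. A small sketch on made-up data (the `toy` data frame and the shortened vectors are assumptions, not taken from the question):

```r
library(dplyr)

# Hypothetical toy data; "PR" is in none of the lookup vectors
toy <- data.frame(state = c("TX", "FL", "CA", "PR"), stringsAsFactors = FALSE)

# Shortened lookup vectors, for illustration only
red    <- c("TX", "AL")
purple <- c("FL", "OH")

toy2 <- toy %>%
  mutate(state_color = if_else(state %in% red, "red",
                       if_else(state %in% purple, "purple", "blue")))

# Both "CA" and the unlisted "PR" fall through to the final "blue" branch
toy2
```

If unlisted states should be kept apart, an explicit check against blue with a separate fallback value would be needed.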

If you want to branch out from base R, you should also take a look at data.table:

library(data.table)
dt_states <- data.table(state = state.abb)
dt_states[state %in% red, state_color := 'red']
dt_states[state %in% blue, state_color := 'blue']
dt_states[state %in% purple, state_color := 'purple']
dt_states
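
Note that with this update-by-reference style, rows that match none of the vectors are left as NA in state_color. A minimal sketch, using a made-up table `dt` and shortened lookup vectors (both are assumptions for illustration), that fills those rows explicitly:

```r
library(data.table)

# Shortened, illustrative lookup vectors (not the full lists from the question)
red  <- c("TX", "AL")
blue <- c("CA", "NY")

# Hypothetical table; "PR" matches neither vector
dt <- data.table(state = c("TX", "CA", "PR"))
dt[state %in% red,  state_color := "red"]
dt[state %in% blue, state_color := "blue"]

# Rows that matched nothing are still NA; fill them explicitly if needed
dt[is.na(state_color), state_color := "no color"]
dt
```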
