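I'm trying to add a new variable called "state_color" to the dataset "birth_data". I'd like to simplify my code with dplyr, but I'm not sure how to convert it. I know that in base R it looks like this: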
red <- c("AK","ID","KS","NE","ND","OK","UT","WY","TX","AL","MS","SC","MT","GA","MO","LA","TN","AK","KY","AZ","WV")
purple <- c("NC","VA","FL","OH","CO","NV","IN","IA","NM")
blue <- c("NH","PA","CA","MI","IL","MA","DE","NJ","CT","VT","ME","WA","OR","WI","NY","MA","RI","HI","MN","DC")
#assigning states to their respective color
birth_data$state_color[birth_data$state %in% red] <- "red"
birth_data$state_color[birth_data$state %in% purple] <- "purple"
birth_data$state_color[birth_data$state %in% blue] <- "blue"
head(birth_data)
I tried to do the same thing with dplyr:
red <- c("AK","ID","KS","NE","ND","OK","UT","WY","TX","AL","MS","SC","MT","GA","MO","LA","TN","AK","KY","AZ","WV")
purple <- c("NC","VA","FL","OH","CO","NV","IN","IA","NM")
blue <- c("NH","PA","CA","MI","IL","MA","DE","NJ","CT","VT","ME","WA","OR","WI","NY","MA","RI","HI","MN","DC")
#assigning states to their respective color
birth_data %>%
mutate(state_color <- c("red","purple","blue"))
But then I get this error:
Error: Column `state_color <- c("red", "purple", "blue")` must be length 1103629 (the number of rows) or one, not 3
What am I doing wrong?
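Start from the dataset and use mutate() to create the new column, but build its values with case_when(). Inside mutate() a new column is named with `=`, not `<-`, and its value must have length 1 or one entry per row; c("red","purple","blue") has length 3, which is what the error is complaining about. The final TRUE branch is the fallback that applies when none of the earlier conditions match.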
library(dplyr)
red <- c("AK","ID","KS","NE","ND","OK","UT","WY","TX","AL","MS","SC","MT","GA","MO","LA","TN","AK","KY","AZ","WV")
purple <- c("NC","VA","FL","OH","CO","NV","IN","IA","NM")
blue <- c("NH","PA","CA","MI","IL","MA","DE","NJ","CT","VT","ME","WA","OR","WI","NY","MA","RI","HI","MN","DC")
birth_data %>%
  mutate(state_color = case_when(
    state %in% red    ~ "red",
    state %in% purple ~ "purple",
    state %in% blue   ~ "blue",
    TRUE              ~ "no color"
  ))
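To see how the fallback branch behaves, here is a minimal sketch on a made-up four-row data frame. The names `toy`, `toy_red`, `toy_purple` and `toy_blue` are assumptions for illustration only, not the real `birth_data` or the full vectors; "PR" is deliberately left out of every vector.

```r
library(dplyr)

# Hypothetical toy data standing in for birth_data
toy <- data.frame(state = c("TX", "FL", "CA", "PR"), stringsAsFactors = FALSE)

# Shortened, illustrative stand-ins for the red / purple / blue vectors
toy_red    <- c("TX", "AL")
toy_purple <- c("FL", "OH")
toy_blue   <- c("CA", "NY")

toy_colored <- toy %>%
  mutate(state_color = case_when(
    state %in% toy_red    ~ "red",
    state %in% toy_purple ~ "purple",
    state %in% toy_blue   ~ "blue",
    TRUE                  ~ "no color"  # reached only when nothing above matched
  ))

toy_colored
```

Note that the pipe only returns the modified data frame; to keep the new column you have to assign the result, as done with `toy_colored` here.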
Try this:
birth_data %>% mutate(state_color = if_else(state %in% red, "red", if_else(state %in% purple, "purple", "blue")))
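One thing to be aware of with a two-level nested `if_else()`: every row that is neither red nor purple ends up as "blue", including states that appear in none of the three vectors. A possible variant, sketched on a made-up data frame (`toy` and the shortened `toy_*` vectors are assumptions for illustration), tests the blue vector explicitly and leaves everything else as NA:

```r
library(dplyr)

# Hypothetical toy data; "PR" is in none of the vectors
toy <- data.frame(state = c("TX", "FL", "CA", "PR"), stringsAsFactors = FALSE)

toy_red    <- c("TX", "AL")
toy_purple <- c("FL", "OH")
toy_blue   <- c("CA", "NY")

toy_nested <- toy %>%
  mutate(state_color = if_else(state %in% toy_red, "red",
                       if_else(state %in% toy_purple, "purple",
                       # if_else() needs both branches to share a type,
                       # hence NA_character_ rather than a bare NA
                       if_else(state %in% toy_blue, "blue", NA_character_))))

toy_nested
```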
If you want to branch out from base R, you should also take a look at data.table:
library(data.table)
dt_states <- data.table(state = state.abb)
dt_states[state %in% red, state_color := 'red']
dt_states[state %in% blue, state_color := 'blue']
dt_states[state %in% purple, state_color := 'purple']
dt_states
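With the subset-assign approach, any state that is in none of the vectors is left as NA in `state_color`. If your data.table version provides `fcase()` (added in 1.13.0, as far as I know), the same logic can probably be written as a single call with an explicit default. This is only a sketch on made-up data; `toy_dt` and the shortened `toy_*` vectors are assumptions for illustration.

```r
library(data.table)

# Hypothetical toy data; "PR" is in none of the vectors
toy_dt <- data.table(state = c("TX", "FL", "CA", "PR"))

toy_red    <- c("TX", "AL")
toy_purple <- c("FL", "OH")
toy_blue   <- c("CA", "NY")

# fcase() takes condition/value pairs in order; `default` covers
# rows where no condition is TRUE
toy_dt[, state_color := fcase(
  state %in% toy_red,    "red",
  state %in% toy_purple, "purple",
  state %in% toy_blue,   "blue",
  default = "no color"
)]

toy_dt
```

Like the `:=` lines above, this modifies the table in place, so no reassignment is needed.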