I have a simple data frame:
dput(emp)
structure(list(name = structure(1L, .Label = "Alex", class = "factor"),
job = structure(1L, .Label = "", class = "factor"), Mgr = structure(1L, .Label = "", class = "factor"),
update = structure(18498, class = "Date")), class = "data.frame", row.names = c(NA,
-1L))
I want to convert all empty strings to NA.
The simplest way to do that would be:
emp[emp==""] <- NA
That would normally work, but I get an error from the date column:
Error in charToDate(x) :
character string is not in a standard unambiguous format
How can I convert all the other empty strings to NA without touching the date column? Note that the actual data frame has more than 30,000 rows.
Try converting the date variable to character, making the replacement, and then converting it back to a date:
#Format date
emp$update <- as.character(emp$update)
#Replace
emp[emp=='']<-NA
#Reformat date
emp$update <- as.Date(emp$update)
Output:
  name  job  Mgr     update
1 Alex <NA> <NA> 2020-08-24
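The error most likely arises because `emp == ""` compares every column with the string, and for the Date column R tries to parse `""` as a date. A minimal base R sketch that avoids the round trip is to restrict the replacement to factor and character columns, so the date column is never compared at all. The helper name `blank_to_na` is hypothetical, and note that this sketch leaves the affected columns as character rather than factor:

```r
# Rebuild the example data frame from the question.
emp <- data.frame(
  name   = factor("Alex"),
  job    = factor(""),
  Mgr    = factor(""),
  update = as.Date("2020-08-24")
)

# Flag only factor and character columns; Date columns are skipped.
is_text <- vapply(
  emp,
  function(col) is.factor(col) || is.character(col),
  logical(1)
)

# Hypothetical helper: turn "" into NA, returning a character vector.
blank_to_na <- function(col) {
  col <- as.character(col)
  col[which(col == "")] <- NA_character_
  col
}

# Replace the text columns in place; `update` is left untouched.
emp[is_text] <- lapply(emp[is_text], blank_to_na)
str(emp)
```

If factors are needed afterwards, wrapping the result of `blank_to_na` in `factor()` should restore them; the empty level is gone because those values are now NA.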
You can try type.convert, as shown below:
type.convert(emp,as.is = TRUE)
which gives
  name job Mgr     update
1 Alex  NA  NA 2020-08-24
You can try this with dplyr:
library(dplyr)
emp %>%
  mutate_at(vars(update), as.character) %>%
  na_if(., "")
As @Duck mentioned, you have to convert the date variable to character first.
Afterwards, if needed, you can convert it back to a date:
library(dplyr)
emp %>%
  mutate_at(vars(update), as.character) %>%
  na_if(., "") %>%
  mutate_at(vars(update), as.Date)
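Recent dplyr releases apply `na_if()` to vectors, so passing a whole data frame to it may fail there. A hedged alternative, assuming dplyr 1.0.0 or later for `across()` and `where()`, is to apply `na_if()` column by column to the text columns only, which also means the date column never needs converting. The name `result` is just for illustration:

```r
library(dplyr)

# Rebuild the example data frame from the question.
emp <- data.frame(
  name   = factor("Alex"),
  job    = factor(""),
  Mgr    = factor(""),
  update = as.Date("2020-08-24")
)

result <- emp %>%
  # Factors become character so "" can be compared as a plain string.
  mutate(across(where(is.factor), as.character)) %>%
  # Replace "" with NA in character columns only; `update` is not touched.
  mutate(across(where(is.character), ~ na_if(.x, "")))

str(result)
```

This sketch leaves `name`, `job`, and `Mgr` as character columns; adding `mutate(across(where(is.character), as.factor))` at the end should turn them back into factors if that matters.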
See if this works:
> library(dplyr)
> library(purrr)
> emp <- structure(list(name = structure(1L, .Label = "Alex", class = "factor"),
+ job = structure(1L, .Label = "", class = "factor"), Mgr = structure(1L, .Label = "", class = "factor"),
+ update = structure(18498, class = "Date")), class = "data.frame", row.names = c(NA,
+ -1L))
> emp
  name job Mgr     update
1 Alex         2020-08-24
> emp %>%
+   mutate(update = as.character(update)) %>%
+   map_df(~ gsub('^$', NA, .x)) %>%
+   mutate(update = as.Date(update)) %>%
+   mutate(across(1:3, as.factor))
# A tibble: 1 x 4
  name  job   Mgr   update    
  <fct> <fct> <fct> <date>    
1 Alex  NA    NA    2020-08-24