Converting blank values to NULL in R when the data frame has a date column



I have a simple data frame, shown here as the output of dput(emp):

structure(list(name = structure(1L, .Label = "Alex", class = "factor"), 
job = structure(1L, .Label = "", class = "factor"), Mgr = structure(1L, .Label = "", class = "factor"), 
update = structure(18498, class = "Date")), class = "data.frame", row.names = c(NA, 
-1L))

I want to convert all blank cells to NULL (NA in R).

The simplest way to do this is:

emp[emp==""] <- NA

This would normally work, but the date column triggers an error:

Error in charToDate(x) : 
character string is not in a standard unambiguous format

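How can I convert all the other blank cells to NULL without touching the date column? Note that the actual data frame has more than 30,000 rows.

Try converting the date variable to character, making the replacement, and then converting it back to Date: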

#Format date
emp$update <- as.character(emp$update)
#Replace
emp[emp=='']<-NA
#Reformat date
emp$update <- as.Date(emp$update)

Output:

  name  job  Mgr     update
1 Alex <NA> <NA> 2020-08-24
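
An alternative to the character round-trip is to leave the Date column alone and blank out only the remaining columns. The following is a minimal base R sketch, assuming a data frame shaped like the one in the question (rebuilt here with data.frame(), with a second row added purely for illustration); the helper name blank_to_na is hypothetical.

```r
# Rebuild a small example; the second row is an assumption added for illustration.
emp <- data.frame(
  name   = factor(c("Alex", "Sam")),
  job    = factor(c("", "Analyst")),
  Mgr    = factor(c("", "")),
  update = as.Date(c("2020-08-24", "2020-08-25"))
)

# Hypothetical helper: set empty strings to NA within a single column.
blank_to_na <- function(col) {
  col[!is.na(col) & col == ""] <- NA
  col
}

# Flag the columns that are NOT of class Date ...
not_date <- !vapply(emp, inherits, logical(1), what = "Date")

# ... and apply the replacement to those columns only.
emp[not_date] <- lapply(emp[not_date], blank_to_na)

print(emp)
```

Because nothing is ever compared against the Date column, no date parsing takes place. The empty string remains as an unused factor level; droplevels() can remove it if that matters.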

You can try type.convert, as shown below:

type.convert(emp,as.is = TRUE)

which gives

  name job Mgr     update
1 Alex  NA  NA 2020-08-24
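
One caveat is worth checking before relying on this. According to the type.convert documentation, blank fields count as missing only for logical, integer, numeric, or complex results. In the example above, job and Mgr contain nothing but blanks, so they become NA. A text column that mixes blanks with real values would presumably keep its empty strings unless "" is listed in na.strings. A small sketch with a made-up vector x (not from the question):

```r
# Hypothetical column mixing a blank with a real value.
x <- c("", "Analyst")

# Default na.strings = "NA": the result stays a character vector.
default_result <- type.convert(x, as.is = TRUE)

# Listing "" in na.strings asks type.convert to treat blanks as NA too.
blank_result <- type.convert(x, as.is = TRUE, na.strings = c("NA", ""))

# Compare which elements ended up missing in each case.
print(is.na(default_result))
print(is.na(blank_result))
```

If the real data has such mixed columns, passing na.strings = c("NA", "") to the type.convert call on the data frame may be the safer choice.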

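You can try it with dplyr:

library(dplyr)
# Note: calling na_if() on a whole data frame relies on older dplyr behavior
# and may not be accepted by newer releases.
emp %>%
  mutate_at(vars(update), as.character) %>%
  na_if(., "")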

As @Duck mentioned, you have to convert the date variable to character first.

Afterwards, if needed, you can convert it back to Date:

library(dplyr)
emp %>%
  mutate_at(vars(update), as.character) %>%
  na_if(., "") %>%
  mutate_at(vars(update), as.Date)
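
In dplyr 1.0.0 and later, mutate_at() is superseded by across(). Here is a hedged sketch of the same idea, assuming a recent dplyr and a data frame like the one in the question. It converts only the factor columns to character and applies na_if() to them, so the Date column is never touched and needs no round-trip. The name result is just for illustration.

```r
library(dplyr)

# Same shape as the data frame in the question.
emp <- data.frame(
  name   = factor("Alex"),
  job    = factor(""),
  Mgr    = factor(""),
  update = as.Date("2020-08-24")
)

# Convert factor columns to character, then turn "" into NA.
# The Date column is not selected by where(is.factor), so it stays as is.
result <- emp %>%
  mutate(across(where(is.factor), ~ na_if(as.character(.x), "")))

print(result)
```

Note that the selected columns come back as character rather than factor; wrap them in as.factor afterwards if factors are required.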

See if this works:

> library(dplyr)
> library(purrr)
> emp <- structure(list(name = structure(1L, .Label = "Alex", class = "factor"), 
+                      job = structure(1L, .Label = "", class = "factor"), Mgr = structure(1L, .Label = "", class = "factor"), 
+                      update = structure(18498, class = "Date")), class = "data.frame", row.names = c(NA, 
+                                                                                                      -1L))
> emp
  name job Mgr     update
1 Alex         2020-08-24
> emp %>% mutate(update = as.character(update)) %>% map_df(~gsub('^$',NA, .x)) %>% mutate(update = as.Date(update)) %>% mutate(across(1:3, as.factor))
# A tibble: 1 x 4
  name  job   Mgr   update    
  <fct> <fct> <fct> <date>    
1 Alex  NA    NA    2020-08-24
