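> I have a data frame with 12000 rows and 35 columns, with multiple NAs scattered across different rows and columns.
I would like to create an ifelse function to select them and change them to a value (such as "0" or "9999").
My problem is that is.na(dataframe) does not seem to work on the whole data frame, and I am not really keen on doing the selection for each individual column.
Is there a better way?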
library(dplyr)
data <- tibble(a = c(1, NA, 2), b = c(NA,1,2)) # let's create some data
data
# A tibble: 3 x 2
a b
<dbl> <dbl>
1 1 NA
2 NA 1
3 2 2
data[is.na(data)] <- 0
data
# A tibble: 3 x 2
a b
<dbl> <dbl>
1 1 0
2 0 1
3 2 2
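As a minimal sketch of why the one-line replacement works: is.na() applied to a data frame returns a logical matrix with one cell per value, and that matrix can be used directly as an index. The example below uses a plain base-R data.frame and the sentinel 9999 mentioned in the question; the names `df`, `mask`, `n_missing` and `per_column` are illustrative only.

```r
# Base R only; all names here are illustrative, not from the question.
df <- data.frame(a = c(1, NA, 2), b = c(NA, 1, 2))

mask <- is.na(df)            # logical matrix with the same shape as df
n_missing <- sum(mask)       # total number of missing cells
per_column <- colSums(mask)  # number of missing cells in each column

df[mask] <- 9999             # replace every missing cell in one step
```

Inspecting `per_column` first can be a cheap way to see which of the 35 columns are affected before overwriting anything.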
Or with NaN:
data <- tibble(a = c(1, NaN, 2), b = c(NaN,1,2))
data
# A tibble: 3 x 2
a b
<dbl> <dbl>
1 1 NaN
2 NaN 1
3 2 2
data[is.na(data)] <- 0 # still works the same
data
# A tibble: 3 x 2
a b
<dbl> <dbl>
1 1 0
2 0 1
3 2 2
If "NA" is a string:
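data <- tibble(a = c(1, "NA", 2), b = c("NA", 1, 2)) # mixing numbers with "NA" makes both columns character
data[data == "NA"] <- NA # first turn the "NA" strings into true NA
data <- mutate(data, across(everything(), as.numeric)) # convert the columns to numeric (across() assumes dplyr >= 1.0.0)
data[is.na(data)] <- 0 # now the same replacement works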
data
# A tibble: 3 x 2
a b
<dbl> <dbl>
1 1 0
2 0 1
3 2 2
dplyr solution:
For NA or NaN:
df <- tibble(a = c(1, NaN, 2), b = c(NA,1,2))
df %>%
replace(is.na(.), 0)
# A tibble: 3 x 2
a b
<dbl> <dbl>
1 1. 0.
2 0. 1.
3 2. 2.
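With 35 columns, some of them may not be numeric, and writing 0 into a character column is rarely what is wanted. The following is a sketch, assuming dplyr >= 1.0.0 (for across() and where()), that restricts the replacement to numeric columns. The extra `id` column and the name `cleaned` are made up for illustration.

```r
library(dplyr)  # assumes dplyr >= 1.0.0 for across() and where()

# Illustrative data: two numeric columns and one character column.
df <- tibble(a = c(1, NaN, 2), b = c(NA, 1, 2), id = c("x", NA, "z"))

# Replace NA/NaN with 0 in numeric columns only; `id` is left untouched.
cleaned <- df %>%
  mutate(across(where(is.numeric), ~ replace(.x, is.na(.x), 0)))
```

Here the missing value in `id` stays NA, so it can be handled separately with a value that suits a character column.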
For the strings "NA" or "NaN":
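df <- tibble(a = c(1, "NaN", 2), b = c("NA", 1, 2))
df %>%
  # replace the "NA"/"NaN" strings with "0", then convert every column to numeric
  # (across() assumes dplyr >= 1.0.0)
  mutate(across(everything(), ~ replace(.x, .x %in% c("NA", "NaN"), "0"))) %>%
  mutate(across(everything(), as.numeric))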
# A tibble: 3 x 2
a b
<dbl> <dbl>
1 1. 0.
2 0. 1.
3 2. 2.
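For a plain data.frame without dplyr, the same string cleanup can be sketched in base R. This assumes every column holds numbers stored as text; `tokens` is a hypothetical list of strings to treat as missing, not something from the original answer.

```r
# Base R sketch; `df` and `tokens` are illustrative names.
df <- data.frame(a = c("1", "NaN", "2"),
                 b = c("NA", "1", "2"),
                 stringsAsFactors = FALSE)

tokens <- c("NA", "NaN")  # strings that should count as missing

# df[] keeps the data.frame structure while every column is rewritten.
df[] <- lapply(df, function(col) {
  col[col %in% tokens] <- "0"  # swap the placeholder strings for "0"
  as.numeric(col)              # then convert the column to numeric
})
```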