R - Replace column values with NA based on a different column or row position with tidyverse



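Below is a smaller version of a larger tibble that I have. I would like to replace the values in reflectanceSfp and reflectanceDT with NA, based either on the values in the bandNumber column or on their row positions. I specifically want to solve this with tidyverse pipes and related packages.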

reflectanceSfp wavelength bandNumber reflectanceDT wavelength1
-0.0113          376       1.00      0.000148         377
-0.000592        381       2.00      0.00589          382
0.0158          386       3.00      0.0101           387
0.0200          391       4.00      0.0110           392
0.0240          396       5.00      0.0117           397
0.0265          401       6.00      0.0149           402

I have the following list of bad bands, which are the band numbers whose values I want to replace with NA:

badBands <- c(1:2,6)

I have tried something of this form, just to see what it would do:

m2 <- myData %>%
  mutate(reflectanceSfp = case_when(bandNumber == 1.00 ~ NA))

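Ultimately, though, I would like to use the badBands vector inside the pipe, and I am also trying to understand how modify_at and mutate_at are used.

I would like the resulting dataset to look like this: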

reflectanceSfp wavelength bandNumber reflectanceDT wavelength1
NA              376       1.00      0.000148         377
NA              381       2.00      0.00589          382
0.0158          386       3.00      0.0101           387
0.0200          391       4.00      0.0110           392
0.0240          396       5.00      0.0117           397
NA              401       6.00      0.0149           402

Below is the dput version of my table:

myData <- structure(list(reflectanceSfp = c(-0.011258, -0.000592, 0.015815, 
0.019991, 0.023965, 0.026547), wavelength = c(376.440002, 381.450012, 
386.459991, 391.470001, 396.470001, 401.480011), bandNumber = c(1, 
2, 3, 4, 5, 6), reflectanceDT = c(0.00014819, 0.00589207, 0.01012335, 
0.01101705, 0.01165185, 0.01486412), wavelength1 = c(376.6300049, 
381.6400147, 386.6499939, 391.6600037, 396.6600037, 401.6700134
)), .Names = c("reflectanceSfp", "wavelength", "bandNumber", 
"reflectanceDT", "wavelength1"), row.names = c(NA, -6L), class = c("tbl_df", 
"tbl", "data.frame"))

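Because the length of badBands is greater than 1, use %in% instead of ==. case_when is also type-sensitive (strictly so in dplyr versions before 1.1.0), so it is better to supply the NA of the correct type, i.e. NA_real_ for a double column. The TRUE ~ reflectanceSfp branch keeps the original value for every other row.

myData %>%
  mutate(reflectanceSfp = case_when(bandNumber %in% badBands ~ NA_real_,
                                    TRUE ~ reflectanceSfp))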
# A tibble: 6 x 5
#  reflectanceSfp wavelength bandNumber reflectanceDT wavelength1
#           <dbl>      <dbl>      <dbl>         <dbl>       <dbl>
#1        NA            376.          1      0.000148        377.
#2        NA            381.          2      0.00589         382.
#3         0.0158       386.          3      0.0101          387.
#4         0.0200       391.          4      0.0110          392.
#5         0.0240       396.          5      0.0117          397.
#6        NA            401.          6      0.0149          402.

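Alternatively, replace is simpler here: we only specify the replacement value for the positions where the logical condition holds, and no type matching is needed.

myData %>%
  mutate(reflectanceSfp = replace(reflectanceSfp,
                                  bandNumber %in% badBands, NA))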
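
The question also mentions reflectanceDT and mutate_at. As a minimal sketch, assuming dplyr 1.0.0 or later (where across() supersedes mutate_at), the same replace call can be applied to both columns at once. The data below is a trimmed copy of myData without the wavelength columns, and the name cleaned is only illustrative.

```r
library(dplyr)

# Trimmed copy of the question's data (wavelength columns omitted for brevity)
myData <- tibble(
  reflectanceSfp = c(-0.011258, -0.000592, 0.015815, 0.019991, 0.023965, 0.026547),
  bandNumber     = c(1, 2, 3, 4, 5, 6),
  reflectanceDT  = c(0.00014819, 0.00589207, 0.01012335, 0.01101705, 0.01165185, 0.01486412)
)
badBands <- c(1:2, 6)

# Apply the same replacement to every selected column;
# .x is the current column, bandNumber is looked up in the data frame
cleaned <- myData %>%
  mutate(across(c(reflectanceSfp, reflectanceDT),
                ~ replace(.x, bandNumber %in% badBands, NA)))

cleaned
```

With older dplyr versions, mutate_at(vars(reflectanceSfp, reflectanceDT), ...) with the same replace expression plays the same role. Note that the expected output in the question only blanks reflectanceSfp, so drop reflectanceDT from the selection if that is what is wanted.
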

ifelse works as well:

myData %>%
  mutate(reflectanceSfp = ifelse(bandNumber %in% badBands, NA, reflectanceSfp))
# A tibble: 6 x 5
#  reflectanceSfp wavelength bandNumber reflectanceDT wavelength1
#           <dbl>      <dbl>      <dbl>         <dbl>       <dbl>
#1        NA            376.         1.      0.000148        377.
#2        NA            381.         2.      0.00589         382.
#3         0.0158       386.         3.      0.0101          387.
#4         0.0200       391.         4.      0.0110          392.
#5         0.0240       396.         5.      0.0117          397.
#6        NA            401.         6.      0.0149          402.
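
The replacement function is.na<- can also be called directly. Here badBands is used as a vector of row positions, which gives the desired result only because the band numbers coincide with the row positions in this data:

myData %>%
  mutate(reflectanceSfp = `is.na<-`(reflectanceSfp, badBands))
# A tibble: 6 x 5
#  reflectanceSfp wavelength bandNumber reflectanceDT wavelength1
#           <dbl>      <dbl>      <dbl>         <dbl>       <dbl>
#1        NA            376.         1.      0.000148        377.
#2        NA            381.         2.      0.00589         382.
#3         0.0158       386.         3.      0.0101          387.
#4         0.0200       391.         4.      0.0110          392.
#5         0.0240       396.         5.      0.0117          397.
#6        NA            401.         6.      0.0149          402.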
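
Matching on bandNumber values and matching on row positions only agree when the two line up. The sketch below uses a hypothetical subset (subsetData, not part of the question) in which bands 1 and 2 have already been dropped, to show how the two approaches differ; row_number() is the dplyr helper that returns each row's position.

```r
library(dplyr)

# Hypothetical subset: bands 1 and 2 are gone, so row 1 holds band 3
subsetData <- tibble(
  reflectanceSfp = c(0.015815, 0.019991, 0.023965, 0.026547),
  bandNumber     = c(3, 4, 5, 6)
)
badBands <- c(1:2, 6)

# Match on the values in bandNumber: only band 6 (row 4) becomes NA
byValue <- subsetData %>%
  mutate(reflectanceSfp = replace(reflectanceSfp, bandNumber %in% badBands, NA))

# Match on row position: rows 1 and 2 become NA, and there is no row 6
byPosition <- subsetData %>%
  mutate(reflectanceSfp = replace(reflectanceSfp, row_number() %in% badBands, NA))
```

For data keyed by band number, as in the question, the value-based form is the safer default because it does not depend on row order.
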
