Still empty values after removing NA and empty (" ") values

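I am working with a data frame. I removed the NAs in one column with: df <- df[-c(which(is.na(df$column))),]

I also removed the empty values in this column with: df <- df[df$column != "", ]

Both commands ran fine. However, I still have three empty values. I wondered whether the value is whitespace, but df <- df[df$column != "\\s+",] does not remove these values. Does anyone know what the values in these empty cells are?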

EDIT: example DF

dput(df)
structure(list(feature = c("Sorge", "Planung", "genervt", "Liebe", 
"Neugier", "überreagieren", "Blockade", "Registrieren", "Wärme", 
"Barriere", "Wärme", "Glück", "müde", "Neugier", "Selbsthass", 
"anstrengend", "Leidenschaft", "Selbstzufriedenheit", "Selbstzfriedenheit", 
"Turteltaube", "vermeiden", "Ruhe", "Enttäuschung", "bildlich", 
"Entspannung", "Bescheidenheit", "Überraschung", "ungeduldig", 
"verstecken", "Planung", " ", " Angst", " Zukunft", " verwundert", 
" Berührung"), word = c("Vul", "Neif", "Wumeizauch", "Häugnung", 
"Wupforau", "Bismirbiel", "Enkmitas", "Mege", "Weforshank", "Plüpp", 
"Skibt", "Namistell", "Zimerhubst", "Struk", "Mölauzegt", "Bingsemöl", 
"Iberletsch", "Troff", "Odef", "Faube", "Wunicher", "Bisknirgo", 
"Ferandsor", "Zwelde", "Herklögen", "Preier", "Muschürdur", "Ismiprämpf", 
"Glühm", "Rugliebast", "Muschürdur", "Vul", "Neif", "Wumeizauch", 
"Häugnung"), code = c("emo", "neu", "neu", "emo", "neu", "neu", 
"neu", "neu", "emo", "emo", "emo", "emo", "neu", "emo", "emo", 
"emo", "emo", "emo", "emo", "emo", "neu", "neu", "emo", "neu", 
"neu", "neu", "neu", "neu", "emo", "neu", "neu", "emo", "neu", 
"neu", "emo"), trials_fp.thisIndex = c(0L, 1L, 10L, 11L, 12L, 
13L, 14L, 15L, 16L, 17L, 18L, 19L, 2L, 20L, 21L, 22L, 23L, 24L, 
25L, 26L, 27L, 28L, 29L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 6L, 0L, 
1L, 10L, 11L), condition = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("Control", 
"Imagery"), class = "factor"), fpt_rt = c(35.666, 50.282, 42.298, 
63.651, 44.298, 48.083, 59.149, 38.318, 29.734, 69.368, 46.867, 
43.374, 34.367, 42.766, 36.517, 45.999, 34.138, 32.934, 40.366, 
64.555, 44.933, 76.487, 66.467, 48.583, 34.918, 39.918, 37.388, 
42.915, 44.482, 35.151, 37.388, 35.666, 50.282, 42.298, 63.651
)), row.names = c(NA, -35L), class = c("data.table", "data.frame"
), .internal.selfref = <pointer: 0x00000263bedd1ef0>)

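The column I am talking about is "feature". Row 31 is one of the values causing the problem.

Answer: the value is a single space, so df <- df[df$feature != " ", ] removes it.

Thanks to the dput command, I found it!!!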
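For anyone hitting the same problem, here is a minimal sketch of how such values can be spotted without reading the whole dput output. The vector x is a made-up stand-in for df$feature, not the asker's data, and blank_idx is a name introduced only for this illustration.

```r
# Hypothetical stand-in for df$feature: one empty string, one single space,
# one value with a leading space, and one NA.
x <- c("Sorge", "", " ", " Angst", NA)

# nchar() shows the difference between "" (0 characters) and " " (1 character).
nchar(x)

# Indices of non-missing values that become empty once whitespace is trimmed.
blank_idx <- which(!is.na(x) & trimws(x) == "")
blank_idx

# Printing a character vector shows the quotes, so the whitespace is visible.
print(x[blank_idx])
```

Comparing with "" alone only catches the first of the two blank entries, which is why a value consisting of a single space survives that filter.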

df <- df[-c(which(is.na(df$column))),] removes the entire row from the data frame. That may be what you want, but I mention it here for clarity.
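One caveat with the negative which() idiom, sketched on a plain base R data.frame: when the column contains no NA at all, which() returns an empty index and the negative subset drops every row. The data frame toy and the names dropped and kept are hypothetical and exist only for this example.

```r
# Hypothetical toy data frame; `column` has no NA values at all.
toy <- data.frame(column = c("a", "b", "c"), value = 1:3)

# which() returns integer(0) here, and negative indexing with integer(0)
# selects zero rows instead of keeping all of them.
dropped <- toy[-c(which(is.na(toy$column))), ]
nrow(dropped)  # 0

# A logical index keeps every non-NA row and is safe when nothing is missing.
kept <- toy[!is.na(toy$column), ]
nrow(kept)  # 3
```

The logical form !is.na(...) gives the same result as the which() version whenever NAs are present, so it is usually the safer default.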

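df <- df[column != "\\s+",] only works if there is an additional variable named column. In your code, column is a column of the data.frame, so it has to be referenced as df$column.

column != "\\s+" does not perform a regular-expression match; it compares each value with the literal string. \s+ is a regular expression that, AFAIK, matches one or more whitespace characters. To use regular expressions in R, use e.g.

grepl("\\s+", column)

which returns TRUE where the pattern matches. In your case, negate it. For example:

df[!grepl("\\s+", df$column), ]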
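Note that an unanchored whitespace pattern matches any value that contains whitespace anywhere, so the filter above would also drop entries such as " Angst" from the example data. Here is a hedged sketch, assuming the goal is to drop only values that are empty or whitespace-only and to keep the rest. The vector feature is a made-up excerpt modelled on the dput output, and contains_ws, only_ws and cleaned are illustrative names.

```r
# Made-up excerpt modelled on the `feature` column from the question.
feature <- c("Planung", " ", " Angst", "", NA)

# Unanchored: TRUE for every value that contains whitespace anywhere.
contains_ws <- grepl("\\s+", feature)

# Anchored: TRUE only for values that are empty or consist solely of whitespace.
only_ws <- grepl("^\\s*$", feature)

# Keep values that are neither NA nor blank, then trim leading/trailing spaces.
cleaned <- trimws(feature[!is.na(feature) & !only_ws])
cleaned
```

Whether the leading spaces in values like " Angst" should be trimmed or treated as distinct entries depends on the data; trimws() is used here only on the assumption that they are stray whitespace.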
