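I am working with a data frame. I removed the NAs in one column with: df <- df[-c(which(is.na(df$column))),]
In addition, I removed the empty values in this column with: df <- df[df$column != "", ]
Both commands ran fine. However, I still have three empty values left. I wondered whether these values are whitespace, but df <- df[df$column != "\s+",]
does not remove them. Does anyone know what the values in these seemingly empty cells are?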
EDIT: example DF
dput(df)
structure(list(feature = c("Sorge", "Planung", "genervt", "Liebe",
"Neugier", "überreagieren", "Blockade", "Registrieren", "Wärme",
"Barriere", "Wärme", "Glück", "müde", "Neugier", "Selbsthass",
"anstrengend", "Leidenschaft", "Selbstzufriedenheit", "Selbstzfriedenheit",
"Turteltaube", "vermeiden", "Ruhe", "Enttäuschung", "bildlich",
"Entspannung", "Bescheidenheit", "Überraschung", "ungeduldig",
"verstecken", "Planung", " ", " Angst", " Zukunft", " verwundert",
" Berührung"), word = c("Vul", "Neif", "Wumeizauch", "Häugnung",
"Wupforau", "Bismirbiel", "Enkmitas", "Mege", "Weforshank", "Plüpp",
"Skibt", "Namistell", "Zimerhubst", "Struk", "Mölauzegt", "Bingsemöl",
"Iberletsch", "Troff", "Odef", "Faube", "Wunicher", "Bisknirgo",
"Ferandsor", "Zwelde", "Herklögen", "Preier", "Muschürdur", "Ismiprämpf",
"Glühm", "Rugliebast", "Muschürdur", "Vul", "Neif", "Wumeizauch",
"Häugnung"), code = c("emo", "neu", "neu", "emo", "neu", "neu",
"neu", "neu", "emo", "emo", "emo", "emo", "neu", "emo", "emo",
"emo", "emo", "emo", "emo", "emo", "neu", "neu", "emo", "neu",
"neu", "neu", "neu", "neu", "emo", "neu", "neu", "emo", "neu",
"neu", "emo"), trials_fp.thisIndex = c(0L, 1L, 10L, 11L, 12L,
13L, 14L, 15L, 16L, 17L, 18L, 19L, 2L, 20L, 21L, 22L, 23L, 24L,
25L, 26L, 27L, 28L, 29L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 6L, 0L,
1L, 10L, 11L), condition = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("Control",
"Imagery"), class = "factor"), fpt_rt = c(35.666, 50.282, 42.298,
63.651, 44.298, 48.083, 59.149, 38.318, 29.734, 69.368, 46.867,
43.374, 34.367, 42.766, 36.517, 45.999, 34.138, 32.934, 40.366,
64.555, 44.933, 76.487, 66.467, 48.583, 34.918, 39.918, 37.388,
42.915, 44.482, 35.151, 37.388, 35.666, 50.282, 42.298, 63.651
)), row.names = c(NA, -35L), class = c("data.table", "data.frame"
), .internal.selfref = <pointer: 0x00000263bedd1ef0>)
The column I am talking about is "feature". Row 31 is one of the values causing the problem.
Answer: df[df$feature != " ", ]
Thanks to the dput command I found it!
df <- df[-c(which(is.na(df$column))),]
removes the entire row from the data frame. That may be what you want, but I mention it here for clarity.
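As a side note on that row removal, here is a minimal sketch on made-up toy data (the data frames and the `value` column are hypothetical, not from the question). It compares the `-which()` form with plain logical indexing, because the `-which()` form behaves unexpectedly when the column happens to contain no NA at all.

```r
# Hypothetical toy data; `column` mirrors the placeholder name used above.
df <- data.frame(column = c("a", NA, "b"), value = 1:3)

# Logical indexing drops every row where `column` is NA.
cleaned <- df[!is.na(df$column), ]

# Pitfall of the -which() form: with no NA present, which() returns
# integer(0), and indexing with -integer(0) selects zero rows.
no_na <- data.frame(column = c("a", "b"), value = 1:2)
emptied <- no_na[-which(is.na(no_na$column)), ]
kept <- no_na[!is.na(no_na$column), ]
```

With logical indexing the NA-free data frame keeps all its rows, so that form is probably the safer default.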
df <- df[column != "\s+",]
only works if a separate variable named column exists. In your code, column does not refer to the column in the data.frame.
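To illustrate the scoping point, a small sketch on a hypothetical one-column data frame. It assumes a plain data.frame and that no standalone object called `column` exists in the session; other classes such as data.table evaluate the expression inside `[` differently.

```r
# Hypothetical data; assumes no object named `column` exists in the workspace.
df <- data.frame(column = c("a", " ", "b"))

# The bare name is looked up outside the data frame, so this fails.
bare <- tryCatch(
  df[column != " ", , drop = FALSE],
  error = function(e) "error"
)

# Qualifying the name with df$ refers to the column itself.
qualified <- df[df$column != " ", , drop = FALSE]
```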
column != "\s+"
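does not perform a regular expression match; it compares each value with the literal string. \s+ is a regular expression that, AFAIK, matches one or more whitespace characters. To use a regular expression in R, use for example
grepl("\\s+", column)
which returns TRUE wherever the pattern matches (note the doubled backslash that R string literals require). In your case, negate it. For example:
df[!grepl("\\s+", df$column), ]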
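One caveat, sketched on a hypothetical vector modelled on the sample data: a pattern like `\\s+` matches any value that contains whitespace, so negating it would also drop entries such as " Angst". If the goal is to remove only empty or whitespace-only values, an anchored pattern like `^\\s*$` may be closer to what is wanted, and `trimws()` can clean the leading spaces that remain. Treat this as one possible approach, not the only one.

```r
# Hypothetical values modelled on the `feature` column from the question.
feature <- c("Planung", " ", " Angst", "", NA)

# TRUE wherever the value contains at least one whitespace character.
has_space <- grepl("\\s+", feature)

# TRUE only for empty or whitespace-only values.
only_space <- grepl("^\\s*$", feature)

# Keep values that are neither NA nor blank, then trim surrounding spaces.
keep <- !is.na(feature) & !only_space
cleaned <- trimws(feature[keep])
```

Applied to the data frame, the same filter would read df[!is.na(df$feature) & !grepl("^\\s*$", df$feature), ].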