R - Keep only the rows that contain a minimum number of each value

Using the following data, I want to remove every row that contains one or fewer 1s, or one or fewer 2s. The data set contains only 1s and 2s.

mydata
  X1 X2 X3 X4 X5 X6 X7
1  1  2  2  1  1  2  2
2  2  2  2  1  2  2  2
3  1  1  1  1  2  2  2
4  2  1  2  1  2  2  1
5  2  1  1  1  1  1  1
6  1  1  1  1  1  1  1
7  2  2  2  2  2  2  2

Rows 2, 5, 6, and 7 should be removed, because:

sum(mydata[2,]=="1") # row 2 contains only one 1
sum(mydata[5,]=="2") # row 5 contains only one 2
sum(mydata[6,]=="2") # row 6 contains no 2s
sum(mydata[7,]=="1") # row 7 contains no 1s

Thanks for your help.

mydata[rowSums(mydata == 1) > 1 & rowSums(mydata == 2) > 1, ]
#  X1 X2 X3 X4 X5 X6 X7
#1  1  2  2  1  1  2  2
#3  1  1  1  1  2  2  2
#4  2  1  2  1  2  2  1
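
To see why the rowSums filter keeps rows 1, 3, and 4, it can help to look at the per-row counts directly. The following is a minimal sketch that rebuilds the example data; the names ones, twos, keep, and counts are only illustrative and do not come from the original answer.

```r
# Rebuild the example data so this snippet runs on its own.
mydata <- data.frame(
  X1 = c(1L, 2L, 1L, 2L, 2L, 1L, 2L),
  X2 = c(2L, 2L, 1L, 1L, 1L, 1L, 2L),
  X3 = c(2L, 2L, 1L, 2L, 1L, 1L, 2L),
  X4 = c(1L, 1L, 1L, 1L, 1L, 1L, 2L),
  X5 = c(1L, 2L, 2L, 2L, 1L, 1L, 2L),
  X6 = c(2L, 2L, 2L, 2L, 1L, 1L, 2L),
  X7 = c(2L, 2L, 2L, 1L, 1L, 1L, 2L)
)

# mydata == 1 is a logical matrix; rowSums() counts the TRUEs in each row.
ones <- rowSums(mydata == 1)  # number of 1s per row
twos <- rowSums(mydata == 2)  # number of 2s per row

# A row is kept only if it has more than one 1 AND more than one 2.
keep <- ones > 1 & twos > 1

# Side-by-side view of the counts and the resulting decision (illustrative).
counts <- data.frame(ones, twos, keep)
print(counts)

filtered <- mydata[keep, ]
```

Rows 2 and 5 fail because one of the two values appears only once, and rows 6 and 7 fail because one of the values is missing entirely.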

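Another option is to loop over the rows, build a table for each row, and check that every element has a frequency greater than 1 (useful in case a row can contain more than two unique values).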

mydata[apply(mydata, 1, function(x) all(table(factor(x, levels = 1:2)) >1)),]
#  X1 X2 X3 X4 X5 X6 X7
#1  1  2  2  1  1  2  2
#3  1  1  1  1  2  2  2
#4  2  1  2  1  2  2  1
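
If the set of values or the threshold may change, the same idea can be wrapped in a small helper. This is only a sketch: keep_rows_with_min_count and its arguments values and min_count are hypothetical names introduced here, not part of either answer, and it assumes the requirement is "each listed value occurs at least min_count times in the row".

```r
# Rebuild the example data (same values as in the question).
mydata <- as.data.frame(matrix(
  c(1, 2, 2, 1, 1, 2, 2,
    2, 2, 2, 1, 2, 2, 2,
    1, 1, 1, 1, 2, 2, 2,
    2, 1, 2, 1, 2, 2, 1,
    2, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1, 1, 1, 1,
    2, 2, 2, 2, 2, 2, 2),
  ncol = 7, byrow = TRUE,
  dimnames = list(NULL, paste0("X", 1:7))
))

# Hypothetical helper: keep rows in which every value in `values`
# occurs at least `min_count` times.
keep_rows_with_min_count <- function(df, values, min_count = 2) {
  # One logical vector per value: does the row contain it often enough?
  checks <- lapply(values, function(v) {
    rowSums(df == v, na.rm = TRUE) >= min_count
  })
  # A row is kept only if all checks are TRUE for it.
  keep <- Reduce(`&`, checks)
  df[keep, , drop = FALSE]
}

result <- keep_rows_with_min_count(mydata, values = 1:2, min_count = 2)
print(result)
```

With values = 1:2 and min_count = 2 this is equivalent to the "greater than 1" condition used above, and it stays vectorized over rows instead of calling table once per row.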

Data

mydata <- structure(list(X1 = c(1L, 2L, 1L, 2L, 2L, 1L, 2L), X2 = c(2L, 
2L, 1L, 1L, 1L, 1L, 2L), X3 = c(2L, 2L, 1L, 2L, 1L, 1L, 2L), 
X4 = c(1L, 1L, 1L, 1L, 1L, 1L, 2L), X5 = c(1L, 2L, 2L, 2L, 
1L, 1L, 2L), X6 = c(2L, 2L, 2L, 2L, 1L, 1L, 2L), X7 = c(2L, 
2L, 2L, 1L, 1L, 1L, 2L)), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6", "7"))
