How do I delete rows of a data.table based on other rows in the same data.table? Reproducible example:
To create the data table, run this:
library(data.table)
DT <- data.table(Subject=c("A","A","B","B"), Test=c("TEST_A","TEST_A","TEST_A","TEST_A"), Folder=c("D1","Screen","D1","Screen"), Date=as.Date(c("2001-10-22","2001-10-23","2001-10-23","2001-10-25")))
DT[3, Date := NA]
DT
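This is a small example of a very large dataset. Here is my logic: for each Subject, delete the row where TEST_A with Folder "D1" has no Date but TEST_A with Folder "Screen" does have a date. I know I could just write an if statement, but for readability I am trying to keep this in data.table only.
Try this: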
DT[, .SD[ !(any(is.na(Date) & Test == "TEST_A" & Folder == "D1") &&
any(!is.na(Date) & Test == "TEST_B" & Folder == "Screen")), ], by = Subject]
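Your sample data does not contain the condition you specified, so this has no effect there, but it should work with more representative data.
Here is some sample data with two subjects: one whose D1 is valid (a date is provided) and another whose D1 is invalid (no date).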
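To make the "no effect" point concrete, here is a minimal sketch that runs the same expression on the question's original data. The names `DT0` and `res0` are hypothetical, introduced only for this illustration. Because every row there has Test == "TEST_A", the `Test == "TEST_B"` check is never TRUE, so the scalar passed to `.SD[...]` is TRUE for both subjects and every row is kept.

```r
library(data.table)

# The question's original data: every row has Test == "TEST_A"
DT0 <- data.table(Subject = c("A","A","B","B"),
                  Test    = rep("TEST_A", 4),
                  Folder  = c("D1","Screen","D1","Screen"),
                  Date    = as.Date(c("2001-10-22","2001-10-23","2001-10-23","2001-10-25")))
DT0[3, Date := NA]

# Same filter as above; the second any() is FALSE in both groups,
# so !(... && ...) is TRUE and .SD[TRUE, ] returns the whole group
res0 <- DT0[, .SD[ !(any(is.na(Date) & Test == "TEST_A" & Folder == "D1") &&
                     any(!is.na(Date) & Test == "TEST_B" & Folder == "Screen")), ],
            by = Subject]

nrow(res0) == nrow(DT0)
```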
DT <- data.table(Subject=c("A","A","B","B"), Test=c("TEST_A","TEST_B","TEST_A","TEST_B"), Folder=c("D1","Screen","D1","Screen"), Date=as.Date(c("2001-10-22","2001-10-23","2001-10-23","2001-10-25")))
DT[3, Date := NA]
DT
# Subject Test Folder Date
# <char> <char> <char> <Date>
# 1: A TEST_A D1 2001-10-22
# 2: A TEST_B Screen 2001-10-23
# 3: B TEST_A D1 <NA>
# 4: B TEST_B Screen 2001-10-25
And the code from above (unchanged):
DT[, .SD[ !(any(is.na(Date) & Test == "TEST_A" & Folder == "D1") &&
any(!is.na(Date) & Test == "TEST_B" & Folder == "Screen")), ],
by = Subject]
# Subject Test Folder Date
# <char> <char> <char> <Date>
# 1: A TEST_A D1 2001-10-22
# 2: A TEST_B Screen 2001-10-23
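Note that this drops every row for Subject B, because `.SD[FALSE, ]` returns an empty group. If the intent is instead to drop only the undated D1 row and keep the subject's other rows, one possible variant is sketched below. It assumes that reading of the question, and `has_screen` and `res` are hypothetical names used only for illustration.

```r
library(data.table)

DT <- data.table(Subject = c("A","A","B","B"),
                 Test    = c("TEST_A","TEST_B","TEST_A","TEST_B"),
                 Folder  = c("D1","Screen","D1","Screen"),
                 Date    = as.Date(c("2001-10-22","2001-10-23","2001-10-23","2001-10-25")))
DT[3, Date := NA]

# Per subject: is there a dated TEST_B/Screen row? (one value, recycled over the group)
DT[, has_screen := any(!is.na(Date) & Test == "TEST_B" & Folder == "Screen"), by = Subject]

# Drop only the undated TEST_A/D1 rows of subjects that have a dated Screen row
res <- DT[!(has_screen & is.na(Date) & Test == "TEST_A" & Folder == "D1")]

# Remove the helper column from both tables
res[, has_screen := NULL]
DT[, has_screen := NULL]
res
```

With the sample data above, this keeps both rows for Subject A and the Screen row for Subject B, and removes only B's D1 row.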