Conditionally remove rows in an R data.table based on elements in other columns, with multiple conditions



How can I delete rows of a data.table based on other rows in the same data.table? Reproducible example:

To create the data table, run this:

library(data.table)
DT <- data.table(Subject=c("A","A","B","B"), Test=c("TEST_A","TEST_A","TEST_A","TEST_A"), Folder=c("D1","Screen","D1","Screen"), Date=as.Date(c("2001-10-22","2001-10-23","2001-10-23","2001-10-25")))
DT[3, Date := NA]
DT

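This is a small example from a very large dataset. Here is my logic: for each Subject, remove TEST_A & Folder "D1" when it has no Date but TEST_A & Folder "Screen" does have a Date. I know I could just write an if statement, but for readability I am trying to keep this in data.table only.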

Try this:

DT[, .SD[ !(any(is.na(Date) & Test == "TEST_A" & Folder == "D1") && 
any(!is.na(Date) & Test == "TEST_B" & Folder == "Screen")), ], by = Subject]

Your sample data does not contain the conditions you specified, so this does nothing here, but it should work with more representative data.


Here is some sample data with two subjects: one whose D1 is valid (a date is provided) and another whose D1 is invalid (no date).

DT <- data.table(Subject=c("A","A","B","B"), Test=c("TEST_A","TEST_B","TEST_A","TEST_B"), Folder=c("D1","Screen","D1","Screen"), Date=as.Date(c("2001-10-22","2001-10-23","2001-10-23","2001-10-25")))
DT[3, Date := NA]
DT
#    Subject   Test Folder       Date
#     <char> <char> <char>     <Date>
# 1:       A TEST_A     D1 2001-10-22
# 2:       A TEST_B Screen 2001-10-23
# 3:       B TEST_A     D1       <NA>
# 4:       B TEST_B Screen 2001-10-25

And the code above (unchanged):

DT[, .SD[ !(any(is.na(Date) & Test == "TEST_A" & Folder == "D1") && 
any(!is.na(Date) & Test == "TEST_B" & Folder == "Screen")), ],
by = Subject]
#    Subject   Test Folder       Date
#     <char> <char> <char>     <Date>
# 1:       A TEST_A     D1 2001-10-22
# 2:       A TEST_B Screen 2001-10-23
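
Note that the filter above evaluates to a single TRUE/FALSE per Subject, so it drops every row of a subject that meets the condition (both of B's rows are gone). If the intent is to drop only the undated D1 row and keep the subject's Screen row, a row-level variant might look like the sketch below. This is one reading of the question, not the answer's original code; `res` is just an illustrative name, and "TEST_B" would need to be swapped for "TEST_A" if the Screen rows carry the same test name, as in the question's data.

```r
library(data.table)

# Same sample data as above, with Subject B's D1 date missing
DT <- data.table(
  Subject = c("A", "A", "B", "B"),
  Test    = c("TEST_A", "TEST_B", "TEST_A", "TEST_B"),
  Folder  = c("D1", "Screen", "D1", "Screen"),
  Date    = as.Date(c("2001-10-22", "2001-10-23", NA, "2001-10-25"))
)

# Row-level condition (one value per row) combined with a per-subject
# any(...) (one value per group, recycled across the group's rows).
# Only the undated TEST_A/D1 row is dropped, and only when the same
# subject has a dated TEST_B/Screen row.
res <- DT[, .SD[!(is.na(Date) & Test == "TEST_A" & Folder == "D1" &
                  any(!is.na(Date) & Test == "TEST_B" & Folder == "Screen"))],
          by = Subject]
res
```

With this data, both of A's rows are kept and B keeps only its Screen row.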
