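I have a data.table DT, and some of its column names follow a pattern. How can I write something like the following concisely (in one line of code)?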
DT[pat1>0, pat1:=1]
DT[pat2>0, pat2:=1]
DT[pat3>0, pat3:=1]
Toy data:
require(data.table)
set.seed(1)
DT <- data.table(id=rnorm(5), pat1=sample(0:3, 5, replace=TRUE), pat2=sample(0:3, 5, replace=TRUE), pat3=sample(0:3, 5, replace=TRUE))
DT
## id pat1 pat2 pat3
## 1: -0.6264538 0 1 3
## 2: 0.1836433 0 2 0
## 3: -0.8356286 2 3 2
## 4: 1.5952808 1 1 0
## 5: 0.3295078 3 3 1
DT[pat1>0, pat1:=1]
DT[pat2>0, pat2:=1]
DT[pat3>0, pat3:=1]
DT
## id pat1 pat2 pat3
## 1: -0.6264538 0 1 1
## 2: 0.1836433 0 1 0
## 3: -0.8356286 1 1 1
## 4: 1.5952808 1 1 0
## 5: 0.3295078 1 1 1
If it has to be on one line, this loop does it:
for (j in paste0('pat',1:3)) DT[get(j) > 0, (j) := 1L]
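Since the question says the column names follow a pattern, the names can also be matched rather than spelled out. This is a minimal sketch, assuming the pattern is "starts with pat"; the helper name cols, the integer id column, and the hand-typed table (the pat* values from the printed example above) are illustrative only.

```r
library(data.table)

# Toy table typed in by hand; the pat* values mirror the printed example
DT <- data.table(id   = 1:5,
                 pat1 = c(0L, 0L, 2L, 1L, 3L),
                 pat2 = c(1L, 2L, 3L, 1L, 3L),
                 pat3 = c(3L, 0L, 2L, 0L, 1L))

# Assumption: every target column name starts with "pat"
cols <- grep("^pat", names(DT), value = TRUE)

# Same one-line loop as above, driven by the matched names;
# 1L keeps the columns integer
for (j in cols) DT[get(j) > 0, (j) := 1L]

DT
```

Anchoring the regular expression with ^ avoids picking up unrelated columns that merely contain "pat" somewhere in their name.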
You could also try set():
indx <- grep('pat', names(DT))
for (j in indx) set(DT, i = which(DT[[j]] > 0), j = j, value = 1L)
DT
# id pat1 pat2 pat3
#1: -0.6264538 0 1 1
#2: 0.1836433 0 1 0
#3: -0.8356286 1 1 1
#4: 1.5952808 1 1 0
#5: 0.3295078 1 1 1
Also, as @Frank commented, the "indx" object is not needed, since the grep() call can be used directly in the for loop.
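A sketch of what that comment suggests, with grep() placed directly in the loop header. The table is typed in by hand (the pat* values from the printed example, with a placeholder integer id column) so the snippet runs on its own.

```r
library(data.table)

# Toy table typed in by hand; the pat* values mirror the printed example
DT <- data.table(id   = 1:5,
                 pat1 = c(0L, 0L, 2L, 1L, 3L),
                 pat2 = c(1L, 2L, 3L, 1L, 3L),
                 pat3 = c(3L, 0L, 2L, 0L, 1L))

# grep() returns the positions of the matching columns;
# set() then updates only the rows where the value is > 0, by reference
for (j in grep("pat", names(DT))) set(DT, i = which(DT[[j]] > 0), j = j, value = 1L)

DT
```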
Try:
DT[, paste0("pat", 1:3) := lapply(.SD, function(x) as.integer(x > 0)),
.SDcols = paste0("pat", 1:3)]
# id pat1 pat2 pat3
#1: -0.6264538 0 1 1
#2: 0.1836433 0 1 0
#3: -0.8356286 1 1 1
#4: 1.5952808 1 1 0
#5: 0.3295078 1 1 1
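One caveat: as.integer(x > 0) recodes every value, so it matches the original DT[patN > 0, patN := 1] calls only when the columns have no negative values (true for the toy data, which is drawn from 0:3). If negatives can occur and should be left untouched, a variant with base R's replace() keeps the conditional behaviour. This is a sketch on a hypothetical table containing negatives; the name cols and the data are illustrative.

```r
library(data.table)

# Hypothetical table with negative values in the pat* columns
DT <- data.table(id   = 1:4,
                 pat1 = c(-2L, 0L, 2L, 1L),
                 pat2 = c(3L, -1L, 0L, 1L))

cols <- grep("^pat", names(DT), value = TRUE)

# replace() overwrites only the positive entries with 1L;
# zeros and negatives are kept as they are
DT[, (cols) := lapply(.SD, function(x) replace(x, x > 0, 1L)), .SDcols = cols]

DT
```

With as.integer(x > 0) the -2 and -1 above would have become 0.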