How to make triplets of columns and apply a function to them



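I have a data frame whose columns are named as in mycols. You can see that mycols contains sets of SQN, WES and WGS columns. I want to make triplets of the SQN, WES and WGS columns, in the same order as they appear in mycols (I do not want to include the columns ending in (Pile:up) or :AD). As you can see, the SQN, WES and WGS columns in each set share the same suffix. In other words, I want to group the SQN, WES and WGS columns that have the same suffix. I also have a function called myfunc, and I want to apply it to each triplet formed this way.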

mycols <- c("SQN:IDH2:G515T:R172M", "WES:IDH2:G515T:R172M",
            "WES:IDH2:G515T:R172M:AD:(Pile:up)", "WGS:IDH2:G515T:R172M",
            "SQN:JAK1:A1432T:T478S", "WES:JAK1:A1432T:T478S",
            "WES:JAK1:A1432T:T478S:AD:(pile:up)", "WGS:JAK1:A1432T:T478S",
            "SQN:JAK1:T1868C:V623A", "WES:JAK1:T1868C:V623A",
            "WES:JAK1:T1868C:V623A:AD", "WES:JAK1:T1868C:V623A:AD:(Pile:up)",
            "WGS:JAK1:T1868C:V623A")

Expected result:

triplet1
"SQN:IDH2:G515T:R172M", "WES:IDH2:G515T:R172M", "WGS:IDH2:G515T:R172M"
triplet2
"SQN:JAK1:A1432T:T478S", "WES:JAK1:A1432T:T478S", "WGS:JAK1:A1432T:T478S"
triplet3
"SQN:JAK1:T1868C:V623A", "WES:JAK1:T1868C:V623A", "WGS:JAK1:T1868C:V623A"

That way I can simply call my function on triplet1, triplet2, triplet3, ...

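We can use grepl to get a logical index (i1) of the strings that contain neither 'Pile'/'pile' nor 'AD' at the end, and subset mycols with i1. Then use sub to create a grouping variable by removing the prefix up to and including the first ':', and split mycols1 by that variable.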

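# TRUE for names that contain no 'pile' (any case) and do not end in 'AD'
i1 <- !grepl('pile|AD$', mycols, ignore.case = TRUE)
mycols1 <- mycols[i1]
# drop the leading 'SQN:'/'WES:'/'WGS:' prefix and group by the remaining suffix
split(mycols1, sub('[^:]+:', '', mycols1))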
#$`IDH2:G515T:R172M`
#[1] "SQN:IDH2:G515T:R172M" "WES:IDH2:G515T:R172M" "WGS:IDH2:G515T:R172M"
#$`JAK1:A1432T:T478S`
#[1] "SQN:JAK1:A1432T:T478S" "WES:JAK1:A1432T:T478S" "WGS:JAK1:A1432T:T478S"
#$`JAK1:T1868C:V623A`
#[1] "SQN:JAK1:T1868C:V623A" "WES:JAK1:T1868C:V623A" "WGS:JAK1:T1868C:V623A"
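Once the names are split into triplets, the function can be applied to each group with lapply. The following is only a minimal sketch: the question does not show the data frame or myfunc, so df (a small numeric data frame), the row-wise mean used as myfunc, and the shortened copy of mycols are all assumptions made for illustration.

```r
# Shortened copy of mycols from the question (first two variants only)
mycols <- c("SQN:IDH2:G515T:R172M", "WES:IDH2:G515T:R172M",
            "WES:IDH2:G515T:R172M:AD:(Pile:up)", "WGS:IDH2:G515T:R172M",
            "SQN:JAK1:A1432T:T478S", "WES:JAK1:A1432T:T478S",
            "WES:JAK1:A1432T:T478S:AD:(pile:up)", "WGS:JAK1:A1432T:T478S")

# Hypothetical data frame: 3 rows, one numeric column per name in mycols
df <- as.data.frame(matrix(seq_len(3 * length(mycols)), nrow = 3))
names(df) <- mycols

# Hypothetical stand-in for myfunc: row-wise mean over the triplet's columns
myfunc <- function(d) rowMeans(d)

# Build the triplets exactly as in the answer above
keep <- !grepl('pile|AD$', mycols, ignore.case = TRUE)
triplets <- split(mycols[keep], sub('[^:]+:', '', mycols[keep]))

# Apply myfunc to the columns of each triplet; the result is a named list
# with one element per suffix
results <- lapply(triplets, function(cols) myfunc(df[, cols, drop = FALSE]))
```

Using drop = FALSE keeps each subset a data frame even if a group ends up with a single column, so myfunc always receives the same kind of object.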
