I have a data frame:
example <- data.frame(obs_val= c(20,15,3,7,5), patient = c("pt1","pt2","pt3","pt4","pt5"))
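where each row, or "patient", is a unique observation.
My goal is to generate a data frame that subtracts each patient's observed value (obs_val) from every other patient's obs_val. The subtraction should run over pairs of distinct patients, so that, for example, pt1 never has its own obs_val subtracted from itself. Ideally, the final data frame would look like this: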
pt1-pt2 pt1-pt3 pt1-pt4 pt1-pt5 pt2-pt3 pt2-pt4 ...
obs_val_diff 5 17 13 15 12 8 ...
Any suggestions on how to approach this, or on reshaping the final data frame?
Another option is to use combn to get all the combinations and then map the subtraction over them.
library(tidyverse)
data.frame(t(combn(example$patient, 2))) |>
  mutate(obs_val_diff = map2_dbl(X1, X2, ~ example[example$patient == .x, "obs_val"] -
                                   example[example$patient == .y, "obs_val"])) |>
  unite(test, X1, X2, sep = "-") |>
  pivot_wider(names_from = test, values_from = obs_val_diff)
#> # A tibble: 1 x 10
#> `pt1-pt2` `pt1-pt3` `pt1-pt4` pt1-pt~1 pt2-p~2 pt2-p~3 pt2-p~4 pt3-p~5 pt3-p~6
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 5 17 13 15 12 8 10 -4 -2
#> # ... with 1 more variable: `pt4-pt5` <dbl>, and abbreviated variable names
#> # 1: `pt1-pt5`, 2: `pt2-pt3`, 3: `pt2-pt4`, 4: `pt2-pt5`, 5: `pt3-pt4`,
#> # 6: `pt3-pt5`
Or in base R:
# Requires R >= 4.1 for the \(x) lambda shorthand and the native pipe |>
apply(t(combn(example$patient, 2)), 1,
      \(x) -diff(example[example$patient %in% x, "obs_val"])) |>
  (\(v) matrix(v, ncol = length(v)))() |>
  as.data.frame() |>
  `colnames<-`(apply(t(combn(example$patient, 2)), 1,
                     \(x) paste(x, collapse = "-")))
#> pt1-pt2 pt1-pt3 pt1-pt4 pt1-pt5 pt2-pt3 pt2-pt4 pt2-pt5 pt3-pt4 pt3-pt5
#> 1 5 17 13 15 12 8 10 -4 -2
#> pt4-pt5
#> 1 2
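A shorter base R variant is possible because combn() accepts a FUN argument that is applied to each pair. The sketch below assumes the same example data as in the question; the names diffs, labels and result are only illustrative, and the row name obs_val_diff is taken from the desired output in the question.

```r
# Example data from the question
example <- data.frame(obs_val = c(20, 15, 3, 7, 5),
                      patient = c("pt1", "pt2", "pt3", "pt4", "pt5"))

# Apply a function to every pair: first value minus second value
diffs <- combn(example$obs_val, 2, FUN = function(v) v[1] - v[2])

# Build the matching "ptA-ptB" labels in the same pair order
labels <- combn(example$patient, 2, FUN = paste, collapse = "-")

# One-row data frame; names are set afterwards so the hyphens are kept
result <- as.data.frame(t(diffs))
names(result) <- labels
rownames(result) <- "obs_val_diff"
```

Because both combn() calls enumerate pairs in the same order, the labels line up with the differences without any lookup by patient name.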
You could simply join the data frame to itself, drop the rows where a patient is matched with itself, and keep the difference:
library(data.table)
library(magrittr)
setDT(example)
example[, id := 1][example, on = .(id), allow.cartesian = TRUE] %>%
  .[patient != i.patient] %>%
  .[, .(p1 = i.patient, p2 = patient, p1_minus_p2 = i.obs_val - obs_val)]
Output:
p1 p2 p1_minus_p2
1: pt1 pt2 5
2: pt1 pt3 17
3: pt1 pt4 13
4: pt1 pt5 15
5: pt2 pt1 -5
6: pt2 pt3 12
7: pt2 pt4 8
8: pt2 pt5 10
9: pt3 pt1 -17
10: pt3 pt2 -12
11: pt3 pt4 -4
12: pt3 pt5 -2
13: pt4 pt1 -13
14: pt4 pt2 -8
15: pt4 pt3 4
16: pt4 pt5 2
17: pt5 pt1 -15
18: pt5 pt2 -10
19: pt5 pt3 2
20: pt5 pt4 -2
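The self-join above returns both orderings of every pair (20 rows), whereas the question asks for each pair once, in wide form. A minimal sketch of one way to narrow it down, assuming the same example data: keep only the rows where the first patient label sorts before the second, then transpose. The names pairs and wide are illustrative. Note that the comparison is on strings, so with labels such as pt10 each pair would still appear exactly once, but the direction of the subtraction would follow alphabetical rather than numeric order.

```r
library(data.table)

# Example data from the question, with a constant key for the cross join
example <- data.table(obs_val = c(20, 15, 3, 7, 5),
                      patient = c("pt1", "pt2", "pt3", "pt4", "pt5"))
example[, id := 1]

# Self-join, then keep each unordered pair once (i.patient sorts before patient)
pairs <- example[example, on = .(id), allow.cartesian = TRUE][
  i.patient < patient,
  .(pair = paste(i.patient, patient, sep = "-"),
    obs_val_diff = i.obs_val - obs_val)]

# Reshape the 10 differences into a single wide row
wide <- as.data.frame(t(pairs$obs_val_diff))
names(wide) <- pairs$pair
```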