I have a table (table1) whose correlation coefficients correspond to the variables shown below:
var = c("A","B","C","D","E")
cor = c(0.7,0.3,0.5,0.1,0.9)
table1 = tibble(var,cor)
# A tibble: 5 × 2
  var     cor
  <chr> <dbl>
1 A       0.7
2 B       0.3
3 C       0.5
4 D       0.1
5 E       0.9
I have a vector of interest:
y=c(1,2,3,4)
and a new table (table2), shown below:
A = c(1,2,NA,4)
B =c(5,6,7,8)
C=c(NA,10,11,12)
D=c(13,14,15,16)
table2 = tibble(A,B,C,D);table2
# A tibble: 4 × 4
      A     B     C     D
  <dbl> <dbl> <dbl> <dbl>
1     1     5    NA    13
2     2     6    10    14
3    NA     7    11    15
4     4     8    12    16
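I want to compute the covariance of the vector y with every column of table2, but only where the corresponding correlation in table1 is greater than 0.3; if it is less than 0.3, the result should be 0. So I want to search table1 for correlations > 0.3, which gives A and C (E also qualifies, but table2 has no column E). How can I do this in R using base R or the dplyr package?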
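The selection step described above can be sketched in base R as follows. This is only an illustration of which variables qualify, not a full solution; the names `above` and `selected` are made up for this example, and the tables are rebuilt as plain data frames so the snippet has no package dependencies.

```r
# Data from the question, rebuilt with base R data frames.
table1 <- data.frame(var = c("A", "B", "C", "D", "E"),
                     cor = c(0.7, 0.3, 0.5, 0.1, 0.9),
                     stringsAsFactors = FALSE)
table2 <- data.frame(A = c(1, 2, NA, 4), B = c(5, 6, 7, 8),
                     C = c(NA, 10, 11, 12), D = c(13, 14, 15, 16))

# Variables whose correlation is strictly greater than 0.3 (hypothetical name).
above <- table1$var[table1$cor > 0.3]

# Keep only those that actually exist as columns of table2 (hypothetical name).
selected <- intersect(above, names(table2))
selected
# "A" "C"
```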
This should work for you:
library(dplyr)
library(tidyr)
var = c("A","B","C","D","E")
cor = c(0.7,0.3,0.5,0.1,0.9)
table1 = tibble(var,cor)
A = c(1,2,NA,4)
B =c(5,6,7,8)
C=c(NA,10,11,12)
D=c(13,14,15,16)
table2 = tibble(A,B,C,D)
table2
y=c(1,2,3,4)
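table3 <- table2 %>%
  # covariance of each column with y, using only complete pairs
  summarise(across(.cols = everything(), .fns = ~ cov(.x, y = y, use = "complete.obs"))) %>%
  # reshape to one row per variable
  pivot_longer(cols = everything(), names_to = "var", values_to = "covar") %>%
  # attach the correlations from table1 and keep those above 0.3
  merge(table1) %>%
  filter(cor > 0.3)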
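Note that the final filter step drops the variables that do not pass the threshold, whereas the question asked for 0 in those cases. A minimal base R sketch that keeps every column of table2 and returns 0 where the correlation is not above 0.3 might look like this. It assumes that a correlation of exactly 0.3 (variable B) counts as not passing, which the question leaves open, and the names `covar`, `cors` and `result` are chosen for this example only.

```r
# Data from the question, rebuilt with base R data frames.
table1 <- data.frame(var = c("A", "B", "C", "D", "E"),
                     cor = c(0.7, 0.3, 0.5, 0.1, 0.9),
                     stringsAsFactors = FALSE)
table2 <- data.frame(A = c(1, 2, NA, 4), B = c(5, 6, 7, 8),
                     C = c(NA, 10, 11, 12), D = c(13, 14, 15, 16))
y <- c(1, 2, 3, 4)

# Covariance of y with each column of table2, dropping incomplete pairs.
covar <- sapply(table2, function(col) cov(col, y, use = "complete.obs"))

# Look up each column's correlation in table1 (NA if the variable is absent there).
cors <- table1$cor[match(names(covar), table1$var)]

# Start from the covariances and zero out the columns that do not pass.
# ASSUMPTION: cor == 0.3 and variables missing from table1 are set to 0.
result <- covar
result[is.na(cors) | cors <= 0.3] <- 0
result
# A is about 2.33, B is 0, C is 1, D is 0
```

The result is a named numeric vector with one entry per column of table2; wrapping it in `data.frame(var = names(result), covar = result)` would give a table shaped like the dplyr output.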