我有以下的决策规则:
RELIABILITY LEVEL DESCRIPTION
LEVEL I Multiple regression
LEVEL II Multiple regression + mechanisms specified (all interest variables)
LEVEL III Multiple regression + mechanisms specified (all interest + control vars)
前三列是应该使用dplyr复制第四列的数据。
整个表(模型)的可靠性水平应该是相同的…我想用dplyr编码它。
这是我迄今为止的尝试…正如你所看到的,我不能让整个模型都一样
library(tidyverse)
library(readxl)
library(effectsize)
df <- read_excel("https://github.com/timverlaan/relia/blob/59d2cbc5d7830c41542c5f65449d5f324d6013ad/relia.xlsx")
df1 <- df %>%
group_by(study, table, function_var) %>%
mutate(count_vars = n()) %>%
ungroup %>%
group_by(study, table, function_var, mechanism_described) %>%
mutate(count_int = case_when(
function_var == 'interest' & mechanism_described == 'yes' ~ n()
)) %>%
mutate(count_con = case_when(
function_var == 'control' & mechanism_described == 'yes' ~ n()
)) %>%
mutate(reliable_int = case_when(
function_var == 'interest' & count_vars/count_int == 1 ~ 1)) %>%
mutate(reliable_con = case_when(
function_var == 'control' & count_vars/count_con == 1 ~ 1)) %>%
# group_by(study, source) %>%
mutate(reliable = case_when(
reliable_int != 1 ~ 1,
reliable_int == 1 ~ 2,
reliable_int + reliable_con == 2 ~ 3)) %>%
# ungroup() %>%
选定的代码为:
library(tidyverse)
library(readxl)
df <- read_excel("C:/Users/relia.xlxs")
df <- df %>% select(-reliability_score)
test<-df %>% group_by(study,model,function_var) %>%
summarise(count_yes=sum(mechanism_described=="yes"),n=n(),frac=count_yes/n) %>%
mutate(frac_control=frac[function_var=="control"],
frac_interest=frac[function_var=="interest"]) %>%
mutate(reliability = case_when(
frac_control == 1 & frac_interest != 1 ~ -99,
frac_control != 1 & frac_interest != 1 ~ 2,
frac_interest == 1 & frac_control != 1 ~ 3,
frac_interest ==1 & frac_control == 1 ~ 4)) %>% group_by(study,model) %>% summarise(reliability=mean(reliability))
df_reliability<-left_join(df,test)
View(df_reliability)
然而,我更愿意在一个dplyr管道中完成这一切。如果有人有解决办法,我很乐意听听……