R - Converting a wide table of proportions into a table of proportions with confidence intervals



The table currently looks like this:

10 0002 .77点 0034.3498150357.548

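The code below generates some data similar to your own, estimates proportions and confidence intervals, and then builds a table in which each cell holds both the point estimate of the proportion and its interval.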

Generate example data

library(dplyr)
library(tibble)
library(purrr)
library(tidyr)
set.seed(123)
# Generate example data
N <- 100
df <- tibble(
  edu = sample(1:3, N, replace = TRUE),
  region = sample(c("north", "south"), N, replace = TRUE),
  outcome = sample(0:1, N, replace = TRUE)
)
df %>% head(10)
#> # A tibble: 10 x 3
#>      edu region outcome
#>    <int> <chr>    <int>
#>  1     3 north        1
#>  2     3 south        1
#>  3     3 south        0
#>  4     2 north        0
#>  5     3 north        1
#>  6     2 north        0
#>  7     2 south        0
#>  8     2 north        1
#>  9     3 south        1
#> 10     1 south        0

Define the helper function prop.test.info()

# Function that takes a 1-row data.frame, conducts a one-sample proportion
# test, and extracts inferential quantities in a data.frame
prop.test.info <- function(df_row) {
  # Conduct the one-sample proportion test. prop.test() is called directly
  # rather than through %>%, because the pipe would also pass df_row itself
  # as an extra positional argument.
  result <- prop.test(
    x = df_row$successes,
    n = df_row$sample_size,
    p = df_row$h0,
    alternative = "two.sided",
    conf.level = 0.95
  )

  # Return the CI bounds and the point estimate in a data.frame
  data.frame(
    ci_low = result$conf.int[1],
    ci_high = result$conf.int[2],
    prop = unname(result$estimate)
  )
}
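
To make clearer what the helper extracts, here is a minimal sketch of a single prop.test() call outside any data.frame. The counts (4 successes out of 13 trials) are illustrative assumptions, not values taken from the example data. By default prop.test() reports a Wilson score interval with continuity correction.

```r
# One group: 4 successes out of 13 trials (illustrative numbers only)
res <- prop.test(x = 4, n = 13, p = 0.5, conf.level = 0.95)

# Point estimate of the proportion (successes / trials)
point_est <- unname(res$estimate)

# Lower and upper bounds of the 95% confidence interval
ci <- as.numeric(res$conf.int)

print(point_est)
print(ci)
```

These are the three quantities that prop.test.info() collects into ci_low, ci_high, and prop.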

Estimate proportions and confidence intervals for the data.frame

# Calculate sample sizes and number of successes per group
prop_df <- df %>%
  group_by(edu, region) %>%
  summarize(
    sample_size = n(),
    successes = sum(outcome)
  ) %>%
  ungroup()
# Add column for null hypothesis of 0.5
prop_df <- prop_df %>% mutate(h0 = 0.5, id = row_number())
# Conduct tests, add inferential quantities, round values
out <- prop_df %>%
  split(.$id) %>%
  map(~ prop.test.info(.x) %>%
    bind_cols(.x)) %>%
  bind_rows() %>%
  mutate(across(where(is.numeric), ~ round(.x, 2)))
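
As an alternative to splitting the data.frame by id, the same quantities can be computed column-wise. The following is only a sketch under assumptions: the `counts` table and its values are hypothetical stand-ins with the same column names as prop_df, and the null value of 0.5 is hard-coded.

```r
library(dplyr)
library(purrr)
library(tibble)

# Hypothetical summary table with the same columns as prop_df
counts <- tibble(
  edu = c(1, 1),
  region = c("north", "south"),
  sample_size = c(13L, 17L),
  successes = c(4L, 7L)
)

with_ci <- counts %>%
  mutate(
    # One prop.test() result per row, stored in a list column
    test = map2(successes, sample_size, ~ prop.test(x = .x, n = .y, p = 0.5)),
    prop = map_dbl(test, ~ unname(.x$estimate)),
    ci_low = map_dbl(test, ~ .x$conf.int[1]),
    ci_high = map_dbl(test, ~ .x$conf.int[2])
  ) %>%
  select(-test)

print(with_ci)
```

This avoids the id column and the split/bind_rows round trip, at the cost of temporarily keeping the test objects in a list column.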

Create cell values

# Combine the point estimate and CI into one character variable, and drop
# all variables not needed for the table
tabvars <- out %>%
  mutate(est = paste0(prop, " (", ci_low, " - ", ci_high, ")")) %>%
  select(edu, region, est)
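
One formatting detail: paste0() applied to rounded numbers drops trailing zeros, so 0.5 appears as "0.5" rather than "0.50". If a fixed number of decimals matters, sprintf() is one option. The helper name fmt_est below is hypothetical and not part of the original answer.

```r
# Hypothetical helper: format estimate and interval with exactly two decimals
fmt_est <- function(prop, lo, hi) {
  sprintf("%.2f (%.2f - %.2f)", prop, lo, hi)
}

cell <- fmt_est(0.5, 0.3, 0.7)
print(cell)
#> [1] "0.50 (0.30 - 0.70)"
```

Because sprintf() is vectorized, fmt_est(prop, ci_low, ci_high) could replace the paste0() call inside mutate().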

Produce the table

tabvars %>%
  pivot_wider(names_from = region, values_from = est)
#> # A tibble: 3 x 3
#>     edu north              south             
#>   <dbl> <chr>              <chr>             
#> 1     1 0.31 (0.12 - 0.59) 0.41 (0.19 - 0.67)
#> 2     2 0.36 (0.12 - 0.68) 0.43 (0.23 - 0.66)
#> 3     3 0.59 (0.33 - 0.81) 0.56 (0.31 - 0.78)
