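So I've been handed a problem that asks me to build a response variable for a logistic regression model from proportion data on a particular outcome, plus some dummy-variable inputs (such as gender). The response needs to be binary, a simple yes or no, and the model should give me the probability of YES occurring.
The data looks a bit like this (this is not the exact data; I just put it together based on the layout of the original):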
Yes | No | Total | dummy1(1,2) | dummy2(1,2) | dummy3(1,2,3)
---|---|---|---|---|---
5 | 30 | 35 | 1 | 2 | 3
6 | 7 | 13 | 1 | 1 | 1
4 | 20 | 24 | 2 | 2 | 3
25 | 129 | 154 | 2 | 1 | 2
13 | 42 | 65 | 1 | 1 | 2
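# Packages that provide %>%, mutate(), case_when() and pivot_longer()
library(dplyr)
library(tidyr)

# Preparing data: df is the tibble listed under "Data:" below.
# Reshape to one row per (group, response) pair, with the count in response_value.
df1 <- df %>%
  pivot_longer(
    cols = c(Yes, No),
    names_to = "response_name",
    values_to = "response_value"
  ) %>%
  # Recode the response as numeric 1 (Yes) / 0 (No)
  mutate(response_name = case_when(response_name == "Yes" ~ "1",
                                   response_name == "No" ~ "0"),
         response_name = as.numeric(response_name))

xtabs(response_value ~ ., df1)

# Logistic regression on the 0/1 response, weighting each row by its count
fit <- glm(response_name ~ `dummy1(1,2)`, weights = response_value, data = df1, family = binomial)
summary(fit)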
Output:
Call:
glm(formula = response_name ~ `dummy1(1,2)`, family = binomial,
data = df1, weights = response_value)
Deviance Residuals:
Min 1Q Median 3Q Max
-6.7736 -3.6590 0.9414 4.0899 9.5249
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.7461 0.5084 -1.468 0.142
`dummy1(1,2)` -0.4453 0.3091 -1.441 0.150
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 272.12 on 9 degrees of freedom
Residual deviance: 270.07 on 8 degrees of freedom
AIC: 274.07
Number of Fisher Scoring iterations: 5
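Since the goal is the probability of YES, the coefficients above still need to go through the inverse logit. The following is a minimal sketch of that arithmetic, assuming the two estimates are copied by hand from the output and that `dummy1(1,2)` enters the model as a plain number (1 or 2), as it does in the call above. The names `b0`, `b1`, `p_yes_dummy1_1` and `p_yes_dummy1_2` are illustrative only.

```r
# Coefficients copied from the summary(fit) output above
b0 <- -0.7461  # (Intercept)
b1 <- -0.4453  # slope for `dummy1(1,2)`, treated as a numeric predictor

# Inverse logit: plogis(x) = 1 / (1 + exp(-x))
p_yes_dummy1_1 <- plogis(b0 + b1 * 1)  # probability of Yes when dummy1 = 1
p_yes_dummy1_2 <- plogis(b0 + b1 * 2)  # probability of Yes when dummy1 = 2

print(c(dummy1_1 = p_yes_dummy1_1, dummy1_2 = p_yes_dummy1_2))
```

Calling `predict(fit, type = "response")` on the fitted model should apply the same transformation to every row of the data used for fitting.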
Data:
df <- tibble::tribble(
~Yes, ~No, ~Total, ~`dummy1(1,2)`, ~`dummy2(1,2)`, ~`dummy3(1,2,3)`,
5L, 30L, 35L, 1L, 2L, 3L,
6L, 7L, 13L, 1L, 1L, 1L,
4L, 20L, 24L, 2L, 2L, 3L,
25L, 129L, 154L, 2L, 1L, 2L,
13L, 42L, 65L, 1L, 1L, 2L
)
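For comparison, here is a hedged sketch of one alternative way to fit proportion data like this: `glm()` with a binomial family also accepts a two-column response of (successes, failures), so the Yes/No counts can be used directly without reshaping or weights. This is not necessarily what the assignment expects. The data frame `dat` and the shortened column name `dummy1` are assumptions made for the example, and `dummy1` is treated as a factor here rather than as a number.

```r
# Same example counts as above, rebuilt as a base-R data frame
# (dummy1 is a shortened, illustrative name for `dummy1(1,2)`)
dat <- data.frame(
  Yes    = c(5, 6, 4, 25, 13),
  No     = c(30, 7, 20, 129, 42),
  dummy1 = factor(c(1, 1, 2, 2, 1))
)

# Grouped binomial fit: the response is a two-column matrix of
# (number of Yes, number of No) per row
fit_grouped <- glm(cbind(Yes, No) ~ dummy1, family = binomial, data = dat)

# Predicted probability of Yes for each level of dummy1
new_levels <- data.frame(dummy1 = factor(c(1, 2), levels = levels(dat$dummy1)))
p_yes <- predict(fit_grouped, newdata = new_levels, type = "response")
print(p_yes)
```

With a single two-level predictor, the fitted probabilities should match the pooled proportions of Yes within each level (24/103 for level 1 and 29/178 for level 2 in this made-up data), which makes for a quick sanity check. The sketch uses the `No` column rather than `Total - Yes` because the last example row's Total does not equal Yes + No.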