Logistic regression error with factors in R

I am trying to run a logistic regression with the following code:

model <- glm (Participation ~ Gender + Race + Ethnicity + Education + Comorbidities + WLProgram + LoseWeight + EverLoseWeight + PastYearLW + Age + BMI, data = LogisticData, family = binomial)

summary(model)

I keep getting this error:

Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :  contrasts can be applied only to factors with 2 or more levels

After looking through the forums, I checked which variables are factors:

str(LogisticData)
'data.frame':   994 obs. of  13 variables:
$ outcome       : Factor w/ 2 levels "No","Yes": 1 1 2 2 1 2 2 1 2 2 ...
$ Gender        : Factor w/ 3 levels "Male","Female",..: 1 2 2 1 2 1 1 1 1 
$ Race          : Factor w/ 3 levels "White","Black",..: 1 1 1 3 1 1 1 1 1 1 
$ Ethnicity     : Factor w/ 2 levels "Hispanic/Latino",..: 2 2 2 2 2 2 2 2 2 
$ Education     : Factor w/ 2 levels "Below Bachelors",..: 1 1 1 2 1 1 1 2 1 
$ Comorbidities : Factor w/ 2 levels "No","Yes": 1 1 2 1 1 1 2 2 1 1 ...
$ WLProgram     : Factor w/ 2 levels "No","Yes": NA 1 2 2 1 1 1 NA 1 1 ...
$ LoseWeight    : Factor w/ 2 levels "Yes","No": 2 1 1 1 1 1 1 2 1 1 ...
$ PastYearLW    : Factor w/ 2 levels "Yes","No": NA 2 1 1 1 2 1 NA 1 1 ...
$ EverLoseWeight: Factor w/ 2 levels "Yes","No": 2 1 1 1 1 1 1 2 1 1 ...
$ Age           : int  29 35 69 32 21 45 40 62 59 58 ...
$ Participation : Factor w/ 2 levels "Yes","No": 2 2 1 1 1 1 1 2 1 2 ...
$ BMI           : num  25.7 33.8 26.4 32.3 27.5 ...

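All of the factors appear to have 2 or more levels.

I also tried omitting the NAs, which still gave me the same error.

I want all of the variables in the regression, but I can't work out why it won't run.

When I run: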

newdata <- droplevels(na.omit(LogisticData))
> str(newdata)
'data.frame':   840 obs. of  13 variables:
$ outcome       : Factor w/ 2 levels "No","Yes": 1 2 2 1 2 2 2 2 2 2 ...
$ Gender        : Factor w/ 3 levels "Male","Female",..: 2 2 1 2 1 1 1 2 1 
$ Race          : Factor w/ 3 levels "White","Black",..: 1 1 3 1 1 1 1 1 3 
$ Ethnicity     : Factor w/ 2 levels "Hispanic/Latino",..: 2 2 2 2 2 2 2 2 
$ Education     : Factor w/ 2 levels "Below Bachelors",..: 1 1 2 1 1 1 1 1 
$ Comorbidities : Factor w/ 2 levels "No","Yes": 1 2 1 1 1 2 1 1 1 2 ...
$ WLProgram     : Factor w/ 2 levels "No","Yes": 1 2 2 1 1 1 1 1 1 1 ...
$ LoseWeight    : Factor w/ 1 level "Yes": 1 1 1 1 1 1 1 1 1 1 ...
$ PastYearLW    : Factor w/ 2 levels "Yes","No": 2 1 1 1 2 1 1 1 1 2 ...
$ EverLoseWeight: Factor w/ 1 level "Yes": 1 1 1 1 1 1 1 1 1 1 ...
$ Age           : int  35 69 32 21 45 40 59 58 23 32 ...
$ Participation : Factor w/ 2 levels "Yes","No": 2 1 1 1 1 1 1 2 2 1 ...
$ BMI           : num  33.8 26.4 32.3 27.5 45.4 ...
- attr(*, "na.action")=Class 'omit'  Named int [1:154] 1 8 13 14 21 24 25 46 55 58 ...
.. ..- attr(*, "names")= chr [1:154] "1" "8" "13" "14" ...

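This doesn't make sense to me, because in the first str(LogisticData) you can see that EverLoseWeight clearly has two levels: both Yes and No are listed, and both 1 and 2 appear in the values. How do I fix this error?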

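Given your update, it looks like you have at least two options.

1: Drop the factors that are left with only one level once the NAs are removed (i.e. LoseWeight and EverLoseWeight).

2: Treat NA as an extra level. Something like: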

a = as.factor(c(1,1,NA,2))
b = as.factor(c(1,1,2,1))
# 0 is an unused factor level for a
x = data.frame(a, b)
levels(x$a) = c(levels(x$a), 0)
x$a[is.na(x$a)] = 0

But this may not deal with any singularity problems, which may also be what caused the single-level factors in the first place.
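Option 1 could be sketched roughly as follows. This is only an illustration on made-up data: `toy`, `cc`, `single_level` and `predictors` are hypothetical names, and the toy data frame merely imitates the pattern in your `str()` output, where one factor loses a level as soon as the incomplete rows are dropped.

```r
# Toy data standing in for LogisticData (hypothetical values, not the real data).
n <- 40
toy <- data.frame(
  Participation  = factor(rep(c("Yes", "No"), length.out = n)),
  Age            = 21:60,
  WLProgram      = factor(rep(c("No", "No", "Yes", "Yes"), length.out = n)),
  EverLoseWeight = factor(rep(c("Yes", "No"), each = n / 2),
                          levels = c("Yes", "No"))
)
# Assumption for the demo: WLProgram is missing exactly where
# EverLoseWeight is "No", so na.omit() removes every "No" row.
toy$WLProgram[toy$EverLoseWeight == "No"] <- NA

# Keep complete cases only and drop factor levels that no longer occur.
cc <- droplevels(na.omit(toy))

# Find the factor columns that are left with fewer than two levels.
is_single <- vapply(cc, function(col) is.factor(col) && nlevels(col) < 2,
                    logical(1))
single_level <- names(cc)[is_single]
print(single_level)

# Build the formula from the remaining predictors and fit the model.
predictors <- setdiff(names(cc), c("Participation", single_level))
fit <- glm(reformulate(predictors, response = "Participation"),
           data = cc, family = binomial)
summary(fit)
```

Building the formula with `reformulate()` avoids retyping the long predictor list by hand, but writing the formula out without LoseWeight and EverLoseWeight would work just as well.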

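Try running summary on the original data and make sure every level actually has values. I would have posted this as a comment, but I don't have the reputation points :(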
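As a rough sketch of that check, again on made-up data (`toy`, `complete` and `by_completeness` are hypothetical names): `glm()` drops incomplete rows by default, so it helps to tabulate each factor against row completeness rather than look at `summary()` alone. A level whose count under `complete = TRUE` is zero will disappear when the model is fitted.

```r
# Toy data (hypothetical): b is "No" only on rows where a is missing.
toy <- data.frame(
  y = factor(c("Yes", "No", "Yes", "No", "Yes", "No", "Yes", "No")),
  a = factor(c("No", "Yes", "No", "Yes", NA, NA, NA, NA)),
  b = factor(c("Yes", "Yes", "Yes", "Yes", "No", "No", "No", "No"))
)

# Per-column overview, including the NA counts.
summary(toy)

# TRUE for rows that have no missing values in any column.
complete <- complete.cases(toy)

# For every factor, count each level separately for complete
# and incomplete rows.
by_completeness <- lapply(
  Filter(is.factor, toy),
  function(col) table(level = col, complete = complete, useNA = "ifany")
)
print(by_completeness$b)

# Fitting on the raw data fails, because the default NA handling
# leaves b with a single level; try() captures the error.
res <- try(glm(y ~ a + b, data = toy, family = binomial), silent = TRUE)
print(inherits(res, "try-error"))
```

In this toy case every "No" in `b` sits on an incomplete row, which is the same situation your second `str()` output shows for LoseWeight and EverLoseWeight.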
