ValueError: Must pass 2-d input when trying to return confidence intervals from statsmodels MNLogit



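The following code should fit an MNLogit model and return the confidence intervals. It prints the summary successfully, and the confidence intervals are visible there, but when I try to retrieve them with conf_int() I get ValueError: Must pass 2-d input.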

import pandas as pd
import statsmodels.api as sm
tmp = pd.read_csv('http://surveyanalysis.org/images/8/82/TrickedLogitMaxDiffExample.csv')
model = sm.MNLogit.from_formula('Choice ~ 1+B+C+D+E+F', tmp, missing='drop')
res = model.fit(method='ncg')
print(res.summary())
res.conf_int()
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-255-a332500964e4> in <module>()
6 res = model.fit(method='ncg')
7 print(res.summary())
----> 8 res.conf_int()
~/anaconda/lib/python3.6/site-packages/statsmodels/base/wrapper.py in wrapper(self, *args, **kwargs)
93             obj = data.wrap_output(func(results, *args, **kwargs), how[0], how[1:])
94         elif how:
---> 95             obj = data.wrap_output(func(results, *args, **kwargs), how)
96         return obj
97 
~/anaconda/lib/python3.6/site-packages/statsmodels/base/data.py in wrap_output(self, obj, how, names)
405     def wrap_output(self, obj, how='columns', names=None):
406         if how == 'columns':
--> 407             return self.attach_columns(obj)
408         elif how == 'rows':
409             return self.attach_rows(obj)
~/anaconda/lib/python3.6/site-packages/statsmodels/base/data.py in attach_columns(self, result)
522             return Series(result, index=self.param_names)
523         else:  # for e.g., confidence intervals
--> 524             return DataFrame(result, index=self.param_names)
525 
526     def attach_columns_eq(self, result):
~/anaconda/lib/python3.6/site-packages/pandas/core/frame.py in __init__(self, data, index, columns, dtype, copy)
304             else:
305                 mgr = self._init_ndarray(data, index, columns, dtype=dtype,
--> 306                                          copy=copy)
307         elif isinstance(data, (list, types.GeneratorType)):
308             if isinstance(data, types.GeneratorType):
~/anaconda/lib/python3.6/site-packages/pandas/core/frame.py in _init_ndarray(self, values, index, columns, dtype, copy)
461         # by definition an array here
462         # the dtypes will be coerced to a single dtype
--> 463         values = _prep_ndarray(values, copy=copy)
464 
465         if dtype is not None:
~/anaconda/lib/python3.6/site-packages/pandas/core/frame.py in _prep_ndarray(values, copy)
5686     return arrays, arr_columns
5687 
-> 5688 
5689 def _list_to_arrays(data, columns, coerce_float=False, dtype=None):
5690     if len(data) > 0 and isinstance(data[0], tuple):
ValueError: Must pass 2-d input

This is a bug that has already been fixed; the fix is scheduled for release in the 0.10 milestone: https://github.com/statsmodels/statsmodels/issues/3651#issuecomment-300511723

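In the meantime there is a workaround. You can get the plain array version of the confidence intervals from the underlying results object: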

res = model.fit(method='ncg')
res._results.conf_int()

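This is not ideal, because the array carries no labels, so you cannot see which CI belongs to which parameter. However, if you compare it with the res.summary() report, you can see that the CIs appear in the same order, which helps.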
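If matching the rows against the summary by eye gets tedious, the labels can be attached by hand. The sketch below is only an illustration: `label_conf_int` is a hypothetical helper, not part of statsmodels, and it assumes the array is laid out as (equation, parameter, [lower, upper]), in the same order as the summary. The numbers in the toy input are made up, so check the layout against your own res.summary() before relying on it.

```python
def label_conf_int(ci, equation_names, param_names):
    """Pair each (lower, upper) interval with an equation and parameter name.

    Assumption: ci[i][j] is the (lower, upper) pair for parameter j of
    equation i, i.e. the same order in which res.summary() lists them.
    """
    if len(ci) != len(equation_names):
        raise ValueError("number of equations does not match equation_names")
    rows = []
    for eq_name, eq_block in zip(equation_names, ci):
        if len(eq_block) != len(param_names):
            raise ValueError("number of parameters does not match param_names")
        for param_name, (lower, upper) in zip(param_names, eq_block):
            rows.append((eq_name, param_name, float(lower), float(upper)))
    return rows


# Toy input with made-up numbers: 2 equations, 2 parameters each.
toy_ci = [
    [(-0.5, 0.1), (0.2, 0.9)],
    [(-1.0, -0.3), (0.0, 0.4)],
]
for row in label_conf_int(toy_ci, ["Choice=1", "Choice=2"], ["Intercept", "B"]):
    print(row)
```

With the real model you would pass res._results.conf_int() as `ci` and supply the names yourself, for example the regressor names from res.model.exog_names and whatever labels you use for the non-reference outcome categories. The returned rows can be handed to pd.DataFrame with columns such as equation, parameter, lower and upper.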

Thanks to josef-pkt: https://github.com/statsmodels/statsmodels/issues/3883
