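I am trying to use CoxTimeVaryingFitter on my dataset, but there seems to be a type problem in baseline_cumulative_hazard_.
I tried cutting the model down to a single feature to isolate the problem, but I still cannot fit the dataset below. Is the problem in my data or in the model? Thanks.
Code: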
from lifelines import CoxTimeVaryingFitter
import autograd.numpy as np
ctv = CoxTimeVaryingFitter()
comp = 'comp_comp1' #start with comp1
event = 'failure_'+comp.split("_")[1]
cols = [
    'start', 'stop',
    'machineID',
    'age',
    event,
    'volt_24_ma', 'rotate_24_ma', 'vibration_24_ma', 'pressure_24_ma',
]
ctv.fit(df_X_train[cols].dropna(),
        id_col='machineID',
        event_col=event,
        start_col='start',
        stop_col='stop',
        show_progress=True,
        fit_options={'step_size': 0.25})
ctv.print_summary()
ctv.plot()
Data types
Time series data
Error: loop of ufunc does not support argument 0 of type numpy.float64 which has no callable exp method
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
AttributeError: 'numpy.float64' object has no attribute 'exp'
The above exception was the direct cause of the following exception:
TypeError Traceback (most recent call last)
<command-1950361299690996> in <module>
23 ]
24
---> 25 ctv.fit(df_X_train[cols].dropna(),
26 id_col='machineID',
27 event_col=event,
/databricks/python/lib/python3.8/site-packages/lifelines/fitters/cox_time_varying_fitter.py in fit(self, df, event_col, start_col, stop_col, weights_col, id_col, show_progress, robust, strata, initial_point, formula, fit_options)
237 self.confidence_intervals_ = self._compute_confidence_intervals()
238 self.baseline_cumulative_hazard_ = self._compute_cumulative_baseline_hazard(df, events, start, stop, weights)
--> 239 self.baseline_survival_ = self._compute_baseline_survival()
240 self.event_observed = events
241 self.start_stop_and_events = pd.DataFrame({"event": events, "start": start, "stop": stop})
/databricks/python/lib/python3.8/site-packages/lifelines/fitters/cox_time_varying_fitter.py in _compute_baseline_survival(self)
815
816 def _compute_baseline_survival(self):
--> 817 survival_df = np.exp(-self.baseline_cumulative_hazard_)
818 survival_df.columns = ["baseline survival"]
819 return survival_df
/databricks/python/lib/python3.8/site-packages/pandas/core/generic.py in __array_ufunc__(self, ufunc, method, *inputs, **kwargs)
1934 self, ufunc: Callable, method: str, *inputs: Any, **kwargs: Any
1935 ):
-> 1936 return arraylike.array_ufunc(self, ufunc, method, *inputs, **kwargs)
1937
1938 # ideally we would define this to avoid the getattr checks, but
/databricks/python/lib/python3.8/site-packages/pandas/core/arraylike.py in array_ufunc(self, ufunc, method, *inputs, **kwargs)
364 # take this path if there are no kwargs
365 mgr = inputs[0]._mgr
--> 366 result = mgr.apply(getattr(ufunc, method))
367 else:
368 # otherwise specific ufunc methods (eg np.<ufunc>.accumulate(..))
/databricks/python/lib/python3.8/site-packages/pandas/core/internals/managers.py in apply(self, f, align_keys, ignore_failures, **kwargs)
423 try:
424 if callable(f):
--> 425 applied = b.apply(f, **kwargs)
426 else:
427 applied = getattr(b, f)(**kwargs)
/databricks/python/lib/python3.8/site-packages/pandas/core/internals/blocks.py in apply(self, func, **kwargs)
376 """
377 with np.errstate(all="ignore"):
--> 378 result = func(self.values, **kwargs)
379
380 return self._split_op_result(result)
TypeError: loop of ufunc does not support argument 0 of type numpy.float64 which has no callable exp method
It looks like it is trying to apply np.exp to a DataFrame (or Series, or array) with object dtype.
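One way to check this on the input side is to look at the dtypes of the columns passed to fit and convert any object columns to numeric ones first. The sketch below is only an illustration: coerce_object_columns is a hypothetical helper, and the small frame stands in for df_X_train with made-up values. Whether this alone resolves the lifelines error is an assumption, since the object dtype could also enter through another column.

```python
import pandas as pd


def coerce_object_columns(df, columns):
    """Return a copy of df with the listed object-dtype columns made numeric.

    Hypothetical helper, not part of lifelines or pandas.
    """
    out = df.copy()
    for col in columns:
        if out[col].dtype == object:
            # errors="raise" surfaces values that cannot be parsed as numbers
            out[col] = pd.to_numeric(out[col], errors="raise")
    return out


# Made-up stand-in for df_X_train: two columns stored as object dtype
df = pd.DataFrame({
    "start": [0, 1],
    "stop": [1, 2],
    "age": pd.Series([18, 18], dtype=object),
    "volt_24_ma": pd.Series([170.1, 169.8], dtype=object),
})

print(df.dtypes)     # shows which columns are object
clean = coerce_object_columns(df, df.columns)
print(clean.dtypes)  # all columns are numeric now
```

If df_X_train[cols].dtypes shows any object columns, converting them before calling ctv.fit would be the first thing to try.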
From another question, I have a simple pandas Series:
In [120]: a
Out[120]:
0 1
1 3
2 5
3 7
4 9
dtype: int64
With an int dtype, I can apply np.exp and get a Series with float dtype:
In [121]: np.exp(a)
Out[121]:
0 2.718282
1 20.085537
2 148.413159
3 1096.633158
4 8103.083928
dtype: float64
But if I convert the Series to object dtype, I get your error:
In [122]: np.exp(a.astype(object))
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
AttributeError: 'int' object has no attribute 'exp'
The above exception was the direct cause of the following exception:
TypeError Traceback (most recent call last)
Input In [122], in <cell line: 1>()
----> 1 np.exp(a.astype(object))
File ~\anaconda3\lib\site-packages\pandas\core\generic.py:2101, in NDFrame.__array_ufunc__(self, ufunc, method, *inputs, **kwargs)
2097 @final
2098 def __array_ufunc__(
2099 self, ufunc: np.ufunc, method: str, *inputs: Any, **kwargs: Any
2100 ):
-> 2101 return arraylike.array_ufunc(self, ufunc, method, *inputs, **kwargs)
File ~\anaconda3\lib\site-packages\pandas\core\arraylike.py:397, in array_ufunc(self, ufunc, method, *inputs, **kwargs)
394 elif self.ndim == 1:
395 # ufunc(series, ...)
396 inputs = tuple(extract_array(x, extract_numpy=True) for x in inputs)
--> 397 result = getattr(ufunc, method)(*inputs, **kwargs)
398 else:
399 # ufunc(dataframe)
400 if method == "__call__" and not kwargs:
401 # for np.<ufunc>(..) calls
402 # kwargs cannot necessarily be handled block-by-block, so only
403 # take this path if there are no kwargs
TypeError: loop of ufunc does not support argument 0 of type int which has no callable exp method
If a were a DataFrame instead of a Series, the traceback would match yours even more closely.
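As a rough sketch of that DataFrame case: the frame below is a made-up stand-in for baseline_cumulative_hazard_ (the column name and values are assumptions, not taken from lifelines). It shows np.exp failing on object dtype and succeeding once the frame is cast to float.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for a cumulative hazard table
hazard = pd.DataFrame({"baseline hazard": [0.1, 0.3, 0.5]})
obj_hazard = hazard.astype(object)  # same values, stored as Python objects

try:
    np.exp(obj_hazard)
    raised = False
except TypeError as err:
    # NumPy has no exp loop for object arrays whose elements lack an exp method
    raised = True
    print(err)

# Casting back to a float dtype lets the ufunc run normally
survival = np.exp(-obj_hazard.astype(float))
print(survival.dtypes)
```

So the thing to find is where an object dtype enters the computation, most likely one of the columns passed to fit.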