pandas在linux上出错,但在windows上工作



所以,我运行了我的"ml";在我本地的windows机器上运行模型,一切都很顺利,只需要48小时就可以完全运行每个进程,自然我会要求公司提供更多的处理能力来缩短时间,他们给了我一个linux模拟服务器来运行我的模型,但出于某种原因,熊猫给了我下一个错误:

Traceback (most recent call last):
File "/ANACONDATA/Prediccion_de_Fallas/03_Modelos_y_Scripts/testv14.py", line 894, in <module>
dlist[xx][namer2] = np.where((dlist[xx].too_soon == 0),dlist[xx][column].shift(24) , 0)
File "/opt/anaconda3/lib/python3.9/site-packages/pandas/core/frame.py", line 3643, in __setitem__
self._setitem_array(key, value)
File "/opt/anaconda3/lib/python3.9/site-packages/pandas/core/frame.py", line 3702, in _setitem_array
self._iset_not_inplace(key, value)
File "/opt/anaconda3/lib/python3.9/site-packages/pandas/core/frame.py", line 3721, in _iset_not_inplace
raise ValueError("Columns must be same length as key")

这是我失败的代码(在windows上运行正常(,尝试使用pandas 1.3.5和1.4.2,结果相同

features=['AN_Utility_Supply_Press','AN_LPC_ASV_Position',
'AN_Eng_Brg_3Y_Gap',... 200 something list of features]


dlist = {}
turbo= np.unique(dfx2['SAP'])
for xx in (turbo):
dlist[xx]=dfx2.loc[(dfx2['SAP'] == xx)]

for column in dlist[xx][features]:
namer2=[column+'_'+'Lag']
fails here------>dlist[xx][namer2] = np.where((dlist[xx].too_soon == 0),dlist[xx][column].shift(24) , 0)
#        namer3=[column+'_'+'Lchg'+"24"]
#        dlist[xx][namer3] = np.where((dlist[xx].too_soon == 0),(dlist[xx][column]-dlist[xx][column].shift(24)) , 0)
namer4=[column+'_'+'mean']
dlist[xx][namer4] = np.where((dlist[xx].too_soon == 0),(dlist[xx][column].rolling(min_periods=1, window=feature_window).mean()),  dlist[xx][column])        
namer5=[column+'_'+'max']
dlist[xx][namer5] = np.where((dlist[xx].too_soon == 0),(dlist[xx][column].rolling(min_periods=1, window=feature_window).max()),  dlist[xx][column])         


dfx2 = pd.concat(dlist)
dfx2.reset_index(drop=True)
dfx2=dfx2.droplevel(level=0)

我是不是错过了什么?,为什么会发生这种情况?

好的,花了一些时间才弄清楚,我尝试了更多版本的panda,直到它起作用,版本是1.2.4,对发生的事情没有真正的解释。

相关内容

最新更新