Expanding a datetime column to include rows for every hour - pandas



I have a dataset containing forecast data that is updated at irregular intervals over a 42-hour period. Here is a sample:

import pandas as pd

df_old = pd.DataFrame({'IssueDatetime': ['2010-01-01 09:00:00', '2010-01-01 09:00:00', '2010-01-01 09:00:00','2010-01-01 09:00:00','2010-01-01 09:00:00'],
'endtime':['2010-01-03 03:00:00','2010-01-03 03:00:00','2010-01-03 03:00:00','2010-01-03 03:00:00','2010-01-03 03:00:00'],
'Regions': ['EAST COAST-CAPE ST FRANCIS AND SOUTH', 'EAST COAST-CAPE ST FRANCIS AND SOUTH', 'EAST COAST-CAPE ST FRANCIS AND SOUTH','NORTHEAST COAST','NORTHEAST COAST'],
'forecastTime': ['2010-01-01 09:00:00','2010-01-01 15:00:00','2010-01-01 19:00:00','2010-01-01 09:00:00','2010-01-01 12:00:00'],
'forecast_Dir':[150,180,45,45,45],
'windSpeed':[20,90,35,45,15]})

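The problem is the gaps in time between df['forecastTime'] and df['endtime']. I tried grouping and resampling the data with my limited pandas knowledge, but because the dates are duplicated I could not get a DatetimeIndex to work.

Ultimately, my goal is to expand the dataframe so that every hour between the original forecast hours gets its own row, all the way to the end time.

Example of the desired output: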

df_new = pd.DataFrame({'IssueDatetime': [ '2010-01-01 09:00:00', '2010-01-01 09:00:00', '2010-01-01 09:00:00', '2010-01-01 09:00:00', '2010-01-01 09:00:00', '2010-01-01 09:00:00','2010-01-01 09:00:00'],
'endtime':['2010-01-03 03:00:00','2010-01-03 03:00:00','2010-01-03 03:00:00','2010-01-03 03:00:00','2010-01-03 03:00:00','2010-01-03 03:00:00','2010-01-03 03:00:00'],
'Regions': ['EAST COAST-CAPE ST FRANCIS AND SOUTH', 'EAST COAST-CAPE ST FRANCIS AND SOUTH','EAST COAST-CAPE ST FRANCIS AND SOUTH','EAST COAST-CAPE ST FRANCIS AND SOUTH','EAST COAST-CAPE ST FRANCIS AND SOUTH','EAST COAST-CAPE ST FRANCIS AND SOUTH','EAST COAST-CAPE ST FRANCIS AND SOUTH'],
'forecastTime': ['2010-01-01 09:00:00','2010-01-01 10:00:00','2010-01-01 11:00:00','2010-01-01 12:00:00','2010-01-01 13:00:00','2010-01-01 14:00:00','2010-01-01 15:00:00'],
'forecast_Dir':[150,150,150,150,150,150,180],
'windSpeed':[20,20,20,20,20,20,90]})

Note that for the first region, each hour between df['forecastTime']='2010-01-01 09:00:00' and df['forecastTime']='2010-01-01 15:00:00' should be its own row. Essentially, I want to upsample to fill in the missing hours.


Edit: the original dataframe

IssueDatetime             endtime  
0     2013-01-01 09:00:00 2013-01-03 03:00:00   
1     2013-01-01 09:00:00 2013-01-03 03:00:00   
2     2013-01-01 09:00:00 2013-01-03 03:00:00   
3     2013-01-01 09:00:00 2013-01-03 03:00:00   
4     2013-01-01 09:00:00 2013-01-03 03:00:00   
...                   ...                 ...   
53585 2016-12-30 09:00:00 2017-01-01 03:00:00   
53586 2016-12-30 09:00:00 2017-01-01 03:00:00   
53587 2016-12-30 09:00:00 2017-01-01 03:00:00   
53588 2016-12-30 09:00:00 2017-01-01 03:00:00   
53589 2016-12-30 09:00:00 2017-01-01 03:00:00   
Regions        forecastTime  
0                                SOUTH COAST 2013-01-01 09:00:00   
1                                SOUTH COAST 2013-01-01 18:00:00   
2                                SOUTH COAST 2013-01-02 06:00:00   
3                                SOUTH COAST 2013-01-02 13:00:00   
4      EAST COAST-CAPE ST  FRANCIS AND SOUTH 2013-01-01 09:00:00   
...                                      ...                 ...   
53585               SOUTHWESTERN GRAND BANKS 2016-12-30 18:00:00   
53586               SOUTHWESTERN GRAND BANKS 2016-12-31 09:00:00   
53587               SOUTHWESTERN GRAND BANKS 2016-12-31 15:00:00   
53588               SOUTHWESTERN GRAND BANKS 2016-12-31 18:00:00   
53589               SOUTHWESTERN GRAND BANKS 2017-01-01 00:00:00   
forecastHour forecast_Dir forecast_WindSpeed_low  
0               0.0          270                     35   
1               9.0          270                     25   
2              21.0          225                     15   
3              28.0          270                     35   
4               0.0          270                     35   
...             ...          ...                    ...   
53585           9.0          135                     40   
53586          24.0          135                     40   
53587          30.0          135                     40   
53588          33.0          315                     25   
53589          39.0          315                     25   
forecast_WindSpeed_gust forecast_WindSpeed_high  
0                        None                    None   
1                        None                    None   
2                        None                    None   
3                        None                    None   
4                        None                    None   
...                       ...                     ...   
53585                    None                      50   
53586                    None                     50    
53587                    None                      50   
53588                    None                      35   
53589                    None                    None   
forecast_WindSpeed_exception_1_type forecast_Dir_exception_1  
0                                     NaN                      NaN   
1                                     NaN                      NaN   
2                                     NaN                      NaN   
3                                     NaN                      NaN   
4                                     NaN                      NaN   
...                                   ...                      ...   
53585                                 NaN                      NaN   
53586          OVER NORTHWESTERN SECTIONS                      315   
53587                                 NaN                      NaN   
53588                                 NaN                      NaN   
53589                                 NaN                      NaN   
forecast_WindSpeed_low_exception_1 forecast_WindSpeed_high_exception_1  
0                                    NaN                                 NaN  
1                                    NaN                                 NaN  
2                                    NaN                                 NaN  
3                                    NaN                                 NaN  
4                                    NaN                                 NaN  
...                                  ...                                 ...  
53585                                NaN                                 NaN  
53586                                 25                                None  
53587                                NaN                                 NaN  
53588                                NaN                                 NaN  
53589                                NaN                                 NaN  

IIUC, you first need to convert the 'forecastTime' column to datetime and set it as the index so it can be resampled. Then group by 'Regions', resample hourly, and fill the NaN values with ffill:

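df_old['forecastTime'] = pd.to_datetime(df_old['forecastTime'])
# Index by forecastTime, then resample each region to hourly rows and forward-fill.
# Note: pandas 2.2+ deprecates the 'H' alias in favour of 'h'.
df_new = (df_old.set_index('forecastTime')
          .groupby('Regions', as_index=False)
          .resample('H').ffill().droplevel(0).reset_index())
print(df_new.head())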

To avoid the error that duplicate dates can cause (a non-unique index), you can try the following:

# Set the index inside each group, so it only has to be unique within one region.
df_new = (df_old.groupby('Regions', as_index=False)
          .apply(lambda x: x.set_index('forecastTime')
                 .resample('H').ffill()).droplevel(0).reset_index())

Output:

         forecastTime        IssueDatetime              endtime                               Regions  forecast_Dir  windSpeed
0 2010-01-01 09:00:00  2010-01-01 09:00:00  2010-01-03 03:00:00  EAST COAST-CAPE ST FRANCIS AND SOUTH           150         20
1 2010-01-01 10:00:00  2010-01-01 09:00:00  2010-01-03 03:00:00  EAST COAST-CAPE ST FRANCIS AND SOUTH           150         20
2 2010-01-01 11:00:00  2010-01-01 09:00:00  2010-01-03 03:00:00  EAST COAST-CAPE ST FRANCIS AND SOUTH           150         20
3 2010-01-01 12:00:00  2010-01-01 09:00:00  2010-01-03 03:00:00  EAST COAST-CAPE ST FRANCIS AND SOUTH           150         20
4 2010-01-01 13:00:00  2010-01-01 09:00:00  2010-01-03 03:00:00  EAST COAST-CAPE ST FRANCIS AND SOUTH           150         20
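
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea: resample each region separately and forward-fill. It is not the exact code above. It loops over the groups explicitly and passes `pd.Timedelta(hours=1)` as the rule, which are my own choices to sidestep the 'H' vs 'h' alias difference between pandas versions. The sample frame is the question's data with the IssueDatetime and endtime columns left out for brevity.

```python
import pandas as pd

# Sample data from the question (IssueDatetime and endtime omitted for brevity).
df_old = pd.DataFrame({
    'Regions': ['EAST COAST-CAPE ST FRANCIS AND SOUTH'] * 3 + ['NORTHEAST COAST'] * 2,
    'forecastTime': ['2010-01-01 09:00:00', '2010-01-01 15:00:00', '2010-01-01 19:00:00',
                     '2010-01-01 09:00:00', '2010-01-01 12:00:00'],
    'forecast_Dir': [150, 180, 45, 45, 45],
    'windSpeed': [20, 90, 35, 45, 15],
})
df_old['forecastTime'] = pd.to_datetime(df_old['forecastTime'])

# A Timedelta rule works regardless of which hourly alias the pandas version prefers.
HOURLY = pd.Timedelta(hours=1)

pieces = []
for region, group in df_old.groupby('Regions', sort=False):
    # Within one region the timestamps are unique here, so upsampling is safe.
    hourly = group.set_index('forecastTime').resample(HOURLY).ffill()
    pieces.append(hourly.reset_index())

df_new = pd.concat(pieces, ignore_index=True)
print(df_new.head(8))
```

Each region is only filled between its own first and last forecastTime, so this does not yet extend the rows out to endtime.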

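I have partially answered my own question... I was still getting a MultiIndex error, so I added one millisecond per row to each repeated forecast time to make them unique: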

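# Convert first so the timedelta offset is added to datetimes, not strings.
df_old['forecastTime'] = pd.to_datetime(df_old['forecastTime'])
# Offset each repeated forecastTime by 0, 1, 2, ... ms so every timestamp is unique.
df_old['forecastTime'] = df_old['forecastTime'] + pd.to_timedelta(df_old.groupby('forecastTime').cumcount(), unit='ms')

df_new = (df_old.set_index('forecastTime')
          .groupby('Regions', as_index=False)
          .resample('H').ffill().droplevel(0).reset_index())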
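
Before shifting timestamps, it may be worth checking where the duplicates actually come from. The sketch below uses a hypothetical four-row frame (not the real data) in which one region has two issue times with overlapping forecast windows. That is my guess at what happens in the full dataset. If it holds, the duplicates disappear once IssueDatetime is part of the grouping key, and no millisecond offset is needed.

```python
import pandas as pd

# Hypothetical miniature of the full data: two issue times for one region
# whose forecast windows overlap at 18:00.
df = pd.DataFrame({
    'IssueDatetime': pd.to_datetime(['2013-01-01 09:00', '2013-01-01 09:00',
                                     '2013-01-01 15:00', '2013-01-01 15:00']),
    'Regions': ['SOUTH COAST'] * 4,
    'forecastTime': pd.to_datetime(['2013-01-01 09:00', '2013-01-01 18:00',
                                    '2013-01-01 15:00', '2013-01-01 18:00']),
})

# Rows whose forecastTime repeats within the same region.
dup_by_region = df.duplicated(['Regions', 'forecastTime'], keep=False)
# Rows whose forecastTime repeats within the same issue time and region.
dup_by_issue = df.duplicated(['IssueDatetime', 'Regions', 'forecastTime'], keep=False)

print(dup_by_region.sum())  # 2
print(dup_by_issue.sum())   # 0
```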

Now I need to add an end time. Any suggestions on how to stop the .ffill() 42 hours after each forecast's issue time?
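
One possible way to do this, as a sketch and not a tested solution for the full dataset: build the hourly range explicitly for each forecast and reindex onto it, so the forward-fill stops at endtime. This assumes forecastTime is unique within each (IssueDatetime, Regions) group and that endtime is constant within a group, as it is in the sample, where endtime is IssueDatetime plus 42 hours. The helper name `expand_to_endtime` is made up for illustration.

```python
import pandas as pd

df_old = pd.DataFrame({
    'IssueDatetime': ['2010-01-01 09:00:00'] * 5,
    'endtime': ['2010-01-03 03:00:00'] * 5,
    'Regions': ['EAST COAST-CAPE ST FRANCIS AND SOUTH'] * 3 + ['NORTHEAST COAST'] * 2,
    'forecastTime': ['2010-01-01 09:00:00', '2010-01-01 15:00:00', '2010-01-01 19:00:00',
                     '2010-01-01 09:00:00', '2010-01-01 12:00:00'],
    'forecast_Dir': [150, 180, 45, 45, 45],
    'windSpeed': [20, 90, 35, 45, 15],
})
for col in ('IssueDatetime', 'endtime', 'forecastTime'):
    df_old[col] = pd.to_datetime(df_old[col])


def expand_to_endtime(group):
    """Hourly rows from the first forecastTime through endtime, forward-filled."""
    group = group.set_index('forecastTime').sort_index()
    # Assumption: endtime is the same for every row of the group.
    hours = pd.date_range(group.index.min(), group['endtime'].iloc[0],
                          freq=pd.Timedelta(hours=1), name='forecastTime')
    # reindex needs a unique, sorted index; ffill carries the last forecast forward.
    return group.reindex(hours, method='ffill').reset_index()


pieces = [expand_to_endtime(g)
          for _, g in df_old.groupby(['IssueDatetime', 'Regions'], sort=False)]
df_new = pd.concat(pieces, ignore_index=True)
print(df_new.tail())
```

If there is no endtime column, `group['IssueDatetime'].iloc[0] + pd.Timedelta(hours=42)` could be used as the end of the range instead.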
