How do I parse date values for sklearn's linear regression



I am working with the following pandas DataFrame index, groupedCrimes.index:

DatetimeIndex(['2014-06-30', '2014-07-31', '2014-08-31', '2014-09-30',
               '2014-10-31', '2014-11-30', '2014-12-31', '2015-01-31',
               '2015-02-28', '2015-03-31', '2015-04-30', '2015-05-31',
               '2015-06-30', '2015-07-31', '2015-08-31', '2015-09-30',
               '2015-10-31', '2015-11-30', '2015-12-31', '2016-01-31',
               '2016-02-29', '2016-03-31', '2016-04-30', '2016-05-31',
               '2016-06-30', '2016-07-31', '2016-08-31', '2016-09-30',
               '2016-10-31', '2016-11-30', '2016-12-31', '2017-01-31',
               '2017-02-28', '2017-03-31', '2017-04-30', '2017-05-31'],
              dtype='datetime64[ns]', name='Month', freq='M')

I am converting it from the datetime64[ns] type:

#I change the dates to be integers, I am not sure this is the best way    
groupedCrimes.index = pd.to_datetime(groupedCrimes.index)  
groupedCrimes.index = (groupedCrimes.index - groupedCrimes.index.min())  / np.timedelta64(1,'D')

This converts it to the following:

[[0.00000000e+00]
 [3.58796296e-13]
 [7.17592593e-13]
 [1.06481481e-12]
 [1.42361111e-12]
 [1.77083333e-12]
 [2.12962963e-12]
 [2.48842593e-12]
 [2.81250000e-12]
 [3.17129630e-12]
 [3.51851852e-12]
 [3.87731481e-12]
 [4.22453704e-12]
 [4.58333333e-12]
 [4.94212963e-12]
 [5.28935185e-12]
 [5.64814815e-12]
 [5.99537037e-12]
 [6.35416667e-12]
 [6.71296296e-12]
 [7.04861111e-12]
 [7.40740741e-12]
 [7.75462963e-12]
 [8.11342593e-12]
 [8.46064815e-12]
 [8.81944444e-12]
 [9.17824074e-12]
 [9.52546296e-12]
 [9.88425926e-12]
 [1.02314815e-11]
 [1.05902778e-11]
 [1.09490741e-11]
 [1.12731481e-11]
 [1.16319444e-11]
 [1.19791667e-11]
 [1.23379630e-11]]

For example, I can make a prediction for one of these values, treating it as a date:

[in] model.predict(3.58796296e-13)
[out] array([5990.81354452])

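How can I:

  1. Convert these numbers back into dates, so that I know which date I am predicting for?
  2. Convert future dates into this format, so that I can predict for future dates?

Is there a better way to convert and work with the dates?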

How about simply converting the datetimes to the number of days since 1970-01-01?

In [386]: df
Out[386]:
                 val
2014-06-30  0.156202
2014-07-31  0.416251
2014-08-31  0.649295
2014-09-30  0.402265
2014-10-31  0.983870
2014-11-30  0.773942
2014-12-31  0.327271
2015-01-31  0.813580
2015-02-28  0.292830
2015-03-31  0.848269
...              ...
2016-08-31  0.595301
2016-09-30  0.171903
2016-10-31  0.355610
2016-11-30  0.477474
2016-12-31  0.517182
2017-01-31  0.891583
2017-02-28  0.591066
2017-03-31  0.799293
2017-04-30  0.225473
2017-05-31  0.444644
[36 rows x 1 columns]
In [387]: df.index = (df.index - pd.to_datetime('1970-01-01')).days
In [388]: df
Out[388]:
            val
16251  0.156202
16282  0.416251
16313  0.649295
16343  0.402265
16374  0.983870
16404  0.773942
16435  0.327271
16466  0.813580
16494  0.292830
16525  0.848269
...         ...
17044  0.595301
17074  0.171903
17105  0.355610
17135  0.477474
17166  0.517182
17197  0.891583
17225  0.591066
17256  0.799293
17286  0.225473
17317  0.444644
[36 rows x 1 columns]

Converting it back:

In [392]: pd.to_datetime(df.index, unit='D')
Out[392]:
DatetimeIndex(['2014-06-30', '2014-07-31', '2014-08-31', '2014-09-30', '2014-10-31', '2014-11-30', '2014-12-31',
               '2015-01-31', '2015-02-28', '2015-03-31', '2015-04-30', '2015-05-31', '2015-06-30', '2015-07-31',
               '2015-08-31', '2015-09-30', '2015-10-31', '2015-11-30', '2015-12-31', '2016-01-31', '2016-02-29',
               '2016-03-31', '2016-04-30', '2016-05-31', '2016-06-30', '2016-07-31', '2016-08-31', '2016-09-30',
               '2016-10-31', '2016-11-30', '2016-12-31', '2017-01-31', '2017-02-28', '2017-03-31', '2017-04-30',
               '2017-05-31'],
              dtype='datetime64[ns]', freq=None)
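
To tie this back to the two questions, here is a minimal sketch of how the days-since-epoch encoding could be used with sklearn's LinearRegression. The helper names dates_to_days and days_to_dates, the six sample dates, and the target values y are all made up for illustration (the real groupedCrimes data is not shown), so treat it as an outline rather than a definitive implementation.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

EPOCH = pd.Timestamp("1970-01-01")

def dates_to_days(dates):
    """Hypothetical helper: datetimes -> days since 1970-01-01, as a 2-D column for sklearn."""
    index = pd.DatetimeIndex(dates)
    days = (index - EPOCH).days
    return np.asarray(days, dtype=float).reshape(-1, 1)

def days_to_dates(days):
    """Hypothetical helper: days since 1970-01-01 -> DatetimeIndex."""
    whole_days = np.asarray(days).ravel().astype("int64")
    return pd.to_datetime(whole_days, unit="D")

# A few month-end dates taken from the question's index.
dates = pd.to_datetime(["2014-06-30", "2014-07-31", "2014-08-31",
                        "2014-09-30", "2014-10-31", "2014-11-30"])
X = dates_to_days(dates)

# Made-up, exactly linear target values; substitute the real counts here.
y = 2.0 * X.ravel() + 1.0

model = LinearRegression().fit(X, y)

# (2) Encode a future date the same way and predict for it.
future = pd.to_datetime(["2017-06-30"])
X_future = dates_to_days(future)
prediction = model.predict(X_future)

# (1) Map the numeric feature back to the date it stands for.
predicted_for = days_to_dates(X_future)
print(predicted_for[0].date(), prediction[0])
```

Note that predict is given a 2-D array (one row per date), which is why the helper reshapes to a single column. Because the same epoch is used for training and prediction, any future date can be encoded without knowing the minimum date of the training index.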
