Pandas: generating sequential timestamps with jumps



I have a df with the following index:

df.index
>>> [2010-01-04 10:00:00, ..., 2010-12-31 16:00:00]

The main column is volume.

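The timestamp sequence contains no weekends, and some other weekdays are missing too. I want to resample my time index to get the total volume per minute, so I did the following: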

df = df.resample('60s').sum()  # older pandas: df.resample('60S', how=sum)

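Some minutes are missing. In other words, there are minutes in which no trades occurred. I want to include these missing minutes, with 0 in the volume column. To solve this, I would normally do something like: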

new_range = pd.date_range('20110104 09:30:00', '20111231 16:00:00',
                          freq='60s').union(df.index)  # older pandas allowed "+" for union
df = df.reindex(new_range)
df = df.between_time(start_time='10:00', end_time='16:00')  # time interval per day that I want
df = df.fillna(0)

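But now I am stuck with unwanted dates, such as weekends and some other days. How can I remove the dates that were not in my original timestamp index?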

Just construct the range of datetimes that you want and reindex to it.

The entire range

In [9]: rng = pd.date_range('20130101 09:00','20130110 16:00',freq='30T')
In [10]: rng
Out[10]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-01-01 09:00:00, ..., 2013-01-10 16:00:00]
Length: 447, Freq: 30T, Timezone: None

Eliminate times outside the daily window

In [11]: rng = rng.take(rng.indexer_between_time('09:30','16:00'))
In [12]: rng
Out[12]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-01-01 09:30:00, ..., 2013-01-10 16:00:00]
Length: 140, Freq: None, Timezone: None

Eliminate non-weekdays

In [13]: rng = rng[rng.weekday<5]
In [14]: rng
Out[14]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-01-01 09:30:00, ..., 2013-01-10 16:00:00]
Length: 112, Freq: None, Timezone: None

This is just to look at the values; in practice you probably want df.reindex(index=rng)

In [15]: rng.to_series()
Out[15]: 
2013-01-01 09:30:00   2013-01-01 09:30:00
2013-01-01 10:00:00   2013-01-01 10:00:00
2013-01-01 10:30:00   2013-01-01 10:30:00
2013-01-01 11:00:00   2013-01-01 11:00:00
2013-01-01 11:30:00   2013-01-01 11:30:00
2013-01-01 12:00:00   2013-01-01 12:00:00
2013-01-01 12:30:00   2013-01-01 12:30:00
2013-01-01 13:00:00   2013-01-01 13:00:00
2013-01-01 13:30:00   2013-01-01 13:30:00
2013-01-01 14:00:00   2013-01-01 14:00:00
2013-01-01 14:30:00   2013-01-01 14:30:00
2013-01-01 15:00:00   2013-01-01 15:00:00
2013-01-01 15:30:00   2013-01-01 15:30:00
2013-01-01 16:00:00   2013-01-01 16:00:00
2013-01-02 09:30:00   2013-01-02 09:30:00
...
2013-01-09 16:00:00   2013-01-09 16:00:00
2013-01-10 09:30:00   2013-01-10 09:30:00
2013-01-10 10:00:00   2013-01-10 10:00:00
2013-01-10 10:30:00   2013-01-10 10:30:00
2013-01-10 11:00:00   2013-01-10 11:00:00
2013-01-10 11:30:00   2013-01-10 11:30:00
2013-01-10 12:00:00   2013-01-10 12:00:00
2013-01-10 12:30:00   2013-01-10 12:30:00
2013-01-10 13:00:00   2013-01-10 13:00:00
2013-01-10 13:30:00   2013-01-10 13:30:00
2013-01-10 14:00:00   2013-01-10 14:00:00
2013-01-10 14:30:00   2013-01-10 14:30:00
2013-01-10 15:00:00   2013-01-10 15:00:00
2013-01-10 15:30:00   2013-01-10 15:30:00
2013-01-10 16:00:00   2013-01-10 16:00:00
Length: 112
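
Putting the steps together, here is a minimal sketch of the whole pipeline on a one-minute grid. The sample data, the 10:00 to 16:00 window, and the variable names (`grid`, `by_weekday`, `by_original_days`, `filled`) are assumptions for illustration, not part of the question. It also shows one possible variant that keeps only the calendar days already present in the original index, which drops holidays without needing a holiday list.

```python
import pandas as pd

# Hypothetical sample data: sparse per-minute volume on Fri 2013-01-04 and
# Tue 2013-01-08. Mon 2013-01-07 is absent, as if it were a holiday.
idx = pd.to_datetime([
    "2013-01-04 10:00", "2013-01-04 10:02", "2013-01-04 15:59",
    "2013-01-08 10:01", "2013-01-08 16:00",
])
df = pd.DataFrame({"volume": [100, 250, 75, 300, 50]}, index=idx)

# Full one-minute grid spanning the data.
grid = pd.date_range("2013-01-04 10:00", "2013-01-08 16:00", freq="min")

# Keep only the intraday window (both endpoints are included by default).
grid = grid[grid.indexer_between_time("10:00", "16:00")]

# Option A: keep weekdays only (Monday=0 ... Friday=4).
by_weekday = grid[grid.weekday < 5]

# Option B: keep only calendar days that occur in the original index.
original_days = df.index.normalize().unique()
by_original_days = grid[grid.normalize().isin(original_days)]

# Reindex onto the chosen grid; minutes without trades get volume 0.
filled = df.reindex(by_original_days, fill_value=0)
```

Option A keeps the missing Monday as a day full of zeros, while option B leaves it out entirely, which seems closer to what the question asks for. Passing `fill_value=0` to `reindex` avoids the separate `fillna(0)` step and keeps the integer dtype.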

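You could also start from a constructed business-day frequency series (and add a custom business day if you want to account for holidays; this is new in 0.14.0, see the pandas documentation on custom business days).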
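
As a rough sketch of that alternative, the example below builds the day list with `CustomBusinessDay` and then expands each day into an intraday grid. The holiday list, the 30-minute step, and the names `bday`, `days`, `intraday`, and `rng` are assumptions chosen to mirror the session above; a real exchange calendar would need to be supplied.

```python
import pandas as pd
from pandas.tseries.offsets import CustomBusinessDay

# Hypothetical holiday list; replace it with the real trading calendar.
holidays = ["2013-01-01"]
bday = CustomBusinessDay(holidays=holidays)

# Business days in the window, skipping weekends and the listed holidays.
days = pd.date_range("2013-01-01", "2013-01-10", freq=bday)

# Intraday offsets from 09:30 to 16:00 in 30-minute steps.
intraday = pd.timedelta_range("09:30:00", "16:00:00", freq="30min")

# Combine every business day with every intraday offset.
rng = pd.DatetimeIndex([day + offset for day in days for offset in intraday])
```

The resulting `rng` can be passed to `df.reindex(index=rng)` in the same way as the index built by filtering.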
