I have a df with the following index:
df.index
>>> [2010-01-04 10:00:00, ..., 2010-12-31 16:00:00]
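The main column is volume.
Weekends and some other weekdays are missing from the timestamp series. I want to resample my time index to get the total volume per minute, so I did the following: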
df = df.resample('60S', how=sum)
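Some minutes are missing from the result. In other words, there are minutes in which no trades took place. I want to include those missing minutes and put a 0 in the volume column. To do that, I would normally write something like this: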
new_range = pd.date_range('20110104 09:30:00','20111231 16:00:00',
freq='60s')+df.index
df = df.reindex(new_range)
df = df.between_time(start_time='10:00', end_time='16:00') # time interval per day that I want
df = df.fillna(0)
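But now I am stuck with unwanted dates, such as weekends and some other days. How can I drop the dates that were not in the original timestamp index?
Just construct the range of datetimes you want and reindex to it.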
The entire range
In [9]: rng = pd.date_range('20130101 09:00','20130110 16:00',freq='30T')
In [10]: rng
Out[10]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-01-01 09:00:00, ..., 2013-01-10 16:00:00]
Length: 447, Freq: 30T, Timezone: None
Eliminate times outside the daily range
In [11]: rng = rng.take(rng.indexer_between_time('09:30','16:00'))
In [12]: rng
Out[12]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-01-01 09:30:00, ..., 2013-01-10 16:00:00]
Length: 140, Freq: None, Timezone: None
Eliminate non-weekdays
In [13]: rng = rng[rng.weekday<5]
In [14]: rng
Out[14]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-01-01 09:30:00, ..., 2013-01-10 16:00:00]
Length: 112, Freq: None, Timezone: None
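Note that the weekday filter keeps holidays that fall on Monday to Friday. Since the question asks to drop every date that was absent from the original index, one possible alternative is to filter on the calendar days actually observed in the data. The following is a minimal sketch, not the answer's own method; `original_index` is a hypothetical stand-in for the index of the data before reindexing, and the dates are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the original (pre-reindex) timestamps:
# trades on Wed 2013-01-02 and Thu 2013-01-03 only, so Tue 2013-01-01
# and Fri 2013-01-04 have no data.
original_index = pd.DatetimeIndex(
    ["2013-01-02 10:00", "2013-01-02 15:30", "2013-01-03 11:00"]
)

# Candidate grid: every 30 minutes, restricted to 09:30-16:00.
rng = pd.date_range("2013-01-01 09:00", "2013-01-04 16:00", freq="30min")
rng = rng[rng.indexer_between_time("09:30", "16:00")]

# Keep only grid points whose calendar day occurs in the original data.
observed_days = original_index.normalize().unique()
rng_observed = rng[rng.normalize().isin(observed_days)]
```

With this approach no holiday calendar is needed, but a day on which the market was open and nothing traded at all would also be dropped.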
This is just to look at the values; you probably want df.reindex(index=rng)
In [15]: rng.to_series()
Out[15]:
2013-01-01 09:30:00 2013-01-01 09:30:00
2013-01-01 10:00:00 2013-01-01 10:00:00
2013-01-01 10:30:00 2013-01-01 10:30:00
2013-01-01 11:00:00 2013-01-01 11:00:00
2013-01-01 11:30:00 2013-01-01 11:30:00
2013-01-01 12:00:00 2013-01-01 12:00:00
2013-01-01 12:30:00 2013-01-01 12:30:00
2013-01-01 13:00:00 2013-01-01 13:00:00
2013-01-01 13:30:00 2013-01-01 13:30:00
2013-01-01 14:00:00 2013-01-01 14:00:00
2013-01-01 14:30:00 2013-01-01 14:30:00
2013-01-01 15:00:00 2013-01-01 15:00:00
2013-01-01 15:30:00 2013-01-01 15:30:00
2013-01-01 16:00:00 2013-01-01 16:00:00
2013-01-02 09:30:00 2013-01-02 09:30:00
...
2013-01-09 16:00:00 2013-01-09 16:00:00
2013-01-10 09:30:00 2013-01-10 09:30:00
2013-01-10 10:00:00 2013-01-10 10:00:00
2013-01-10 10:30:00 2013-01-10 10:30:00
2013-01-10 11:00:00 2013-01-10 11:00:00
2013-01-10 11:30:00 2013-01-10 11:30:00
2013-01-10 12:00:00 2013-01-10 12:00:00
2013-01-10 12:30:00 2013-01-10 12:30:00
2013-01-10 13:00:00 2013-01-10 13:00:00
2013-01-10 13:30:00 2013-01-10 13:30:00
2013-01-10 14:00:00 2013-01-10 14:00:00
2013-01-10 14:30:00 2013-01-10 14:30:00
2013-01-10 15:00:00 2013-01-10 15:00:00
2013-01-10 15:30:00 2013-01-10 15:30:00
2013-01-10 16:00:00 2013-01-10 16:00:00
Length: 112
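Putting the steps together for the per-minute case in the question might look like the sketch below. The `trades` frame and its values are hypothetical sample data, and the sketch uses the `.resample(...).sum()` spelling from more recent pandas releases rather than the `how=` argument shown in the question.

```python
import pandas as pd

# Hypothetical per-trade data; the real df would come from the question.
trades = pd.DataFrame(
    {"volume": [100, 50, 25]},
    index=pd.to_datetime(
        ["2013-01-02 09:30:10", "2013-01-02 09:30:40", "2013-01-02 09:32:05"]
    ),
)

# Total volume per minute over the span covered by the trades.
per_minute = trades.resample("1min").sum()

# Target grid: every minute from 09:30 to 16:00, weekdays only.
rng = pd.date_range("2013-01-02 09:30", "2013-01-04 16:00", freq="1min")
rng = rng[rng.indexer_between_time("09:30", "16:00")]
rng = rng[rng.weekday < 5]

# Reindex onto the grid; minutes without trades get volume 0.
filled = per_minute.reindex(rng, fill_value=0)
```

Passing `fill_value=0` to `reindex` does the same job as a separate `fillna(0)` afterwards, and because the grid never contains weekends or out-of-hours minutes, nothing has to be removed later.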
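You could also start from a constructed business-day frequency series (and add a custom business day if you want holidays, which is new in 0.14.0; see here).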
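A minimal sketch of that business-day variant follows. The holiday list is a hypothetical example, and combining the days with intraday offsets through `pd.timedelta_range` is only one way to build the grid, not necessarily what the answer had in mind.

```python
import pandas as pd
from pandas.tseries.offsets import CustomBusinessDay

# Hypothetical holiday list; replace with the days the market was closed.
holidays = ["2013-01-01"]
bday = CustomBusinessDay(holidays=holidays)

# One timestamp (midnight) per business day that is not a holiday.
days = pd.date_range("2013-01-01", "2013-01-10", freq=bday)

# Intraday offsets from 09:30 to 16:00 in 30-minute steps.
intraday = pd.timedelta_range("09:30:00", "16:00:00", freq="30min")

# Combine every day with every intraday offset, in chronological order.
rng = pd.DatetimeIndex([day + offset for day in days for offset in intraday])
```

The resulting `rng` can then be passed to `df.reindex(index=rng)` in the same way as the range built above.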