Correct time interval indexing



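I'm new to pandas and am trying to do an aggregation. I converted the DataFrame's date column to datetime and built an index that changes every day.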


import datetime
import pandas as pd

# Parse the timestamps first so that .time() and .date() are available.
model['date'] = pd.to_datetime(model['date'])
model['time_only'] = [ts.time() for ts in model['date']]
model['date_only'] = [ts.date() for ts in model['date']]
# Counter that increases by 1 each time the date advances by exactly one day.
model['cumsum'] = ((model['date_only'].diff() == datetime.timedelta(days=1))*1).cumsum()

def get_out_of_market_data(data):
    df = data.copy()

    start_market_time = datetime.time(hour=13, minute=30)
    end_market_time = datetime.time(hour=20, minute=0)
    df['time_only'] = [ts.time() for ts in df['date']]
    df['date_only'] = [ts.date() for ts in df['date']]
    # Keep rows before 13:30 or at/after 20:00, i.e. outside market hours.
    cond = (start_market_time > df['time_only']) | (df['time_only'] >= end_market_time)
    return data[cond]

new = model.drop(columns=['time_only', 'date_only'])
get_out_of_market_data(data=new).head(20)

What I get:

0    0   65.5000 65.50   65.5000 65.500  DD  1   125 65.500000   2016-01-04 13:15:00 0
26   26  62.7438 62.96   62.6600 62.956  DD  1639    174595  62.781548   2016-01-04 20:00:00 0
27   27  62.5900 62.79   62.5300 62.747  DD  2113    268680  62.650260   2016-01-04 20:15:00 0
28   28  62.7950 62.80   62.5400 62.590  DD  2652    340801  62.652640   2016-01-04 20:30:00 0
29   29  63.1000 63.12   62.7800 62.800  DD  6284    725952  62.963512   2016-01-04 20:45:00 0
30   30  63.2200 63.22   63.0700 63.080  DD  21  699881  63.070114   2016-01-04 21:00:00 0
31   31  63.2200 63.22   63.2200 63.220  DD  7   1973    63.220000   2016-01-04 22:00:00 0
32   32  63.4000 63.40   63.4000 63.400  DD  2   150 63.400000   2016-01-05 00:30:00 1
33   33  62.3700 62.37   62.3700 62.370  DD  3   350 62.370000   2016-01-05 11:00:00 1
34   34  62.1000 62.37   62.1000 62.370  DD  2   300 62.280000   2016-01-05 11:15:00 1
35   35  62.0800 62.08   62.0800 62.080  DD  1   100 62.080000   2016-01-05 11:45:00 1

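The last two columns are the timestamp, which falls in the 20:00 - 13:30 interval, and the index that changes every day.

I am trying to group by the last column so that each interval from 20:00 on one day to 13:00 on the next day gets its own group index. I don't fully understand the method, but for example: new.groupby(pd.Grouper(freq='17hours'))

How can I shift the index to this interval?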

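You could try creating a new column that says which market day each row belongs to. If the time is earlier than 13:30:00, the row belongs to the previous day's market day; otherwise it belongs to the current day's. Then you can group by that column. The code would be: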

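import datetime

def get_market_day(dt):
    # Timestamps before 13:30 belong to the previous day's market day.
    if dt.time() < datetime.time(13, 30, 0):
        return dt.date() - datetime.timedelta(days=1)
    else:
        return dt.date()

# "dt" is the datetime column (named 'date' in the question's data).
df["market_day"] = df["dt"].map(get_market_day)
df.groupby("market_day").agg(...)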
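
As a rough illustration of the same idea, here is a minimal vectorized sketch. It assumes the datetime column is called "date" as in the question. The "volume" column and its values are made up for the example, and the count/sum aggregation is only a stand-in for whatever aggregation is actually needed. Subtracting 13 hours 30 minutes from each timestamp moves the 13:30 boundary to midnight, so the calendar date of the shifted value should match what get_market_day returns.

```python
import pandas as pd

# Timestamps are borrowed from the question's output; "volume" is hypothetical.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2016-01-04 13:15:00",
        "2016-01-04 20:00:00",
        "2016-01-04 22:00:00",
        "2016-01-05 00:30:00",
        "2016-01-05 11:45:00",
        "2016-01-05 20:15:00",
    ]),
    "volume": [10, 20, 30, 40, 50, 60],
})

# Shift each timestamp back by 13h30m so the market-day boundary lands on
# midnight, then take the calendar date of the shifted value.
shifted = df["date"] - pd.Timedelta(hours=13, minutes=30)
df["market_day"] = shifted.dt.date

# Example aggregation per market day (count and sum are placeholders).
summary = df.groupby("market_day")["volume"].agg(["count", "sum"])
print(summary)
```

With this sample, the four rows from 2016-01-04 20:00 through 2016-01-05 11:45 end up in the same group (market day 2016-01-04), which is the overnight interval the question asks about.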
