Group a pandas Series by a specific time of day



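I'm looking for an efficient way to group a pandas Series by gas day (this is related to natural gas trading). A gas day covers all hours/timestamps from 06:00 to 06:00 the next day in the CET time zone. Because of daylight saving time, once a year a gas day has 23 hours and once a year it has 25 hours. My current solution works (see the to_gas_day function below), but it is very slow. Any suggestions are welcome.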

import pandas as pd

def to_gas_day(stamp):
    """Take a time stamp and return date according to gas day (from 6 to 6 CET)."""
    if stamp.hour < 6:
        # Hours before 06:00 local time belong to the previous gas day
        day = stamp.date() - pd.Timedelta(days=1)
    else:
        day = stamp.date()
    return pd.to_datetime(day)

se = pd.Series(
    data=1.,
    index=pd.date_range('2020-10-23', '2020-10-27', freq='H', tz='CET')[:-1]
)

# This is the expected count of hours per gas day around the DST change
se.groupby(to_gas_day).count()
Out[107]: 
2020-10-22     6
2020-10-23    24
2020-10-24    25
2020-10-25    24
2020-10-26    18
dtype: int64

Is this equivalent to your code?

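import numpy as np
import pandas as pd

obj = pd.Series(pd.date_range('2020-10-23', '2020-10-27', freq='H', tz='CET')[:-1])
# True for timestamps before 06:00 local time (they belong to the previous gas day)
cond = obj.dt.hour < 6
obj2 = np.where(cond,
                obj.dt.date - pd.Timedelta(days=1),
                obj.dt.date)
obj2 = pd.Series(obj2)
# Count the hours per gas day
obj3 = obj2.value_counts().sort_index()
print(obj3)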
2020-10-22     6
2020-10-23    24
2020-10-24    25
2020-10-25    24
2020-10-26    18
dtype: int64
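
Another option that may be worth timing, offered only as a sketch and not as something from the original posts: drop the time zone to get local wall-clock times, shift them back by 6 hours, and truncate to midnight. The helper name `gas_day_labels` is made up for illustration. Dropping the time zone first matters here, because subtracting 6 hours from the tz-aware index would subtract elapsed time and mislabel the hours just before 06:00 on the DST change days. The range is built with `freq=pd.Timedelta(hours=1)` to avoid depending on the `'H'` alias.

```python
import pandas as pd


def gas_day_labels(index: pd.DatetimeIndex) -> pd.DatetimeIndex:
    """Return the gas day (06:00 to 06:00 local time) for each timestamp.

    Hypothetical helper: assumes `index` is tz-aware and already in the
    local (CET) time zone.
    """
    # Keep the local wall-clock time but drop the time zone information
    wall = index.tz_localize(None)
    # Shift back 6 wall-clock hours, then truncate to midnight
    return (wall - pd.Timedelta(hours=6)).normalize()


idx = pd.date_range(
    "2020-10-23", "2020-10-27", freq=pd.Timedelta(hours=1), tz="CET"
)[:-1]
se = pd.Series(1.0, index=idx)

# Group by the precomputed labels instead of calling a function per row
counts = se.groupby(gas_day_labels(se.index)).count()
print(counts)
```

The labels are computed once for the whole index with vectorized operations, so this should avoid the per-timestamp Python call that makes `groupby(to_gas_day)` slow. I have not benchmarked it on real data.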
