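Thanks for looking at my question. I am trying to remove weekends and holidays from time-series data for the FX market. I have used pd.bdate_range, but I am not sure how to apply it to PrimaryBook. Let me know if you need more information.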
Thanks for the help
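from datetime import datetime, timedelta
import pandas as pd

today = datetime.now()  # assumed; `today` is not defined in the posted snippet
Now = today.replace(second=0, microsecond=0)
st = today - timedelta(days=14)
et = today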
#Remove weekend
br= pd.bdate_range(st, et)
#Remove holidays
#Only times between 7am and 5pm
PrimaryBook = _get_tsdb_primary_prices("audusd", st, et).ffill()
PrimaryBook = PrimaryBook.dropna(axis=1, how='all')
PrimaryBook = PrimaryBook.dropna()
PrimaryBookB = PrimaryBook.filter(regex=r'(BID|BSIZ)')
PrimaryBookA = PrimaryBook.filter(regex=r'(ASK|ASIZ)')
PrimaryBookZ= PrimaryBook.filter(regex=r'(ASK|ASIZ)')
PrimaryBookZ
This is the result, but I want to remove weekends and holidays and keep only the 7am to 5pm time window.
BEST_ASK1 BEST_ASIZ1 BEST_ASK2 BEST_ASIZ2 BEST_ASK3 BEST_ASIZ3 BEST_ASK4 BEST_ASIZ4 BEST_ASK5 BEST_ASIZ5
Time
2021-07-22 08:41:36.625573856+00:00 0.73725 2000000.0 0.73730 6000000.0 0.73735 4000000.0 0.73740 5000000.0 0.73745 4000000.0
2021-07-22 08:41:36.630647614+00:00 0.73725 2000000.0 0.73730 6000000.0 0.73735 4000000.0 0.73740 5000000.0 0.73745 4000000.0
2021-07-22 08:41:36.635475238+00:00 0.73725 1000000.0 0.73730 6000000.0 0.73735 4000000.0 0.73740 5000000.0 0.73745 4000000.0
2021-07-22 08:41:36.640455282+00:00 0.73725 2000000.0 0.73730 6000000.0 0.73735 4000000.0 0.73740 5000000.0 0.73745 4000000.0
2021-07-22 08:41:36.660516225+00:00 0.73725 2000000.0 0.73730 6000000.0 0.73735 4000000.0 0.73740 5000000.0 0.73745 5000000.0
... ... ... ... ... ... ... ... ... ... ...
2021-08-05 08:41:29.025629378+00:00 0.73990 6000000.0 0.73995 4000000.0 0.74000 5000000.0 0.74005 5000000.0 0.74010 9000000.0
2021-08-05 08:41:29.450549198+00:00 0.73990 6000000.0 0.73995 4000000.0 0.74000 5000000.0 0.74005 5000000.0 0.74010 7000000.0
2021-08-05 08:41:30.346124376+00:00 0.73990 6000000.0 0.73995 4000000.0 0.74000 5000000.0 0.74005 5000000.0 0.74010 7000000.0
2021-08-05 08:41:31.586653810+00:00 0.73990 6000000.0 0.73995 4000000.0 0.74000 5000000.0 0.74005 5000000.0 0.74010 7000000.0
2021-08-05 08:41:31.840526198+00:00 0.73990 6000000.0 0.73995 4000000.0 0.74000 5000000.0 0.74005 5000000.0 0.74010 7000000.0
I would really appreciate your help.
Since holidays depend on the country and the year, you will need a package for this. I suggest holidays:
import holidays
for day in holidays.UnitedStates(years=2021).items():
    print(day)
This gives you a (date, name) pair for every holiday in that year:
(datetime.date(2021, 1, 1), "New Year's Day")
(datetime.date(2021, 12, 31), "New Year's Day (Observed)")
(datetime.date(2021, 1, 18), 'Martin Luther King Jr. Day')
(datetime.date(2021, 2, 15), "Washington's Birthday")
...
The next step is to convert your dates to the same format:
import pandas as pd
df = pd.DataFrame([{"id":1, "day":"2021-07-22 08:41:36.625573856+00:00"}, {"id":1, "day":"2021-12-31 08:41:36.625573856+00:00"}])
df.day = pd.to_datetime(df.day)
After that, it is easy to check whether a given day is in the holiday list:
df.loc[:,"isholiday"] = df.apply(lambda x: x.day.date() in [d[0] for d in holidays.UnitedStates(years=2021).items()], axis=1)
df
id day isholiday
0 1 2021-07-22 08:41:36.625573856+00:00 False
1 1 2021-12-31 08:41:36.625573856+00:00 True
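The per-row lambda above rebuilds the holiday list for every row, which gets slow on tick data. A minimal sketch of a vectorized alternative, assuming the holiday dates are collected once into a set. The two hard-coded dates are copied from the holiday listing above and only stand in for the full holidays result; holiday_dates and non_holiday_rows are names made up for this illustration:

```python
import datetime
import pandas as pd

# Stand-in for set(holidays.UnitedStates(years=2021).keys());
# the two dates are copied from the holiday listing above.
holiday_dates = {datetime.date(2021, 1, 1), datetime.date(2021, 12, 31)}

df = pd.DataFrame([
    {"id": 1, "day": "2021-07-22 08:41:36.625573856+00:00"},
    {"id": 1, "day": "2021-12-31 08:41:36.625573856+00:00"},
])
df["day"] = pd.to_datetime(df["day"])

# Build the lookup once, then test every row in a single vectorized call.
df["isholiday"] = df["day"].dt.date.isin(holiday_dates)

# Keep only the rows that do not fall on a holiday.
non_holiday_rows = df[~df["isholiday"]]
```

The result matches the isholiday column shown above, but the set is built once instead of once per row.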
The same goes for weekends: check whether the dt.dayofweek attribute is in [5, 6] (days are zero-indexed, so Monday is 0).
I reset the index and then used dt.dayofweek < 5 to remove the weekends.
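One possible sketch of the weekend filter, extended with the 7am to 5pm window from the question. It assumes the timestamps are the index of the frame (as in the printed output in the question) and that the window is meant in UTC. The names book, weekdays and filtered are illustrative, and the four rows are made-up sample data:

```python
import pandas as pd

# Hypothetical stand-in for PrimaryBook: a UTC-indexed frame with one price column.
idx = pd.to_datetime([
    "2021-07-22 08:41:36+00:00",  # Thursday, inside 07:00-17:00
    "2021-07-22 18:05:00+00:00",  # Thursday, outside the window
    "2021-07-24 09:00:00+00:00",  # Saturday
    "2021-07-26 16:59:59+00:00",  # Monday, inside the window
])
book = pd.DataFrame({"BEST_ASK1": [0.73725, 0.73730, 0.73735, 0.73740]}, index=idx)

# Keep Monday to Friday (dayofweek is 0 for Monday; 5 and 6 are the weekend).
weekdays = book[book.index.dayofweek < 5]

# Keep rows whose time of day lies between 07:00 and 17:00 (both ends inclusive by default).
filtered = weekdays.between_time("07:00", "17:00")
```

With the timestamps kept as the index there is no need to reset it. If the window is meant in local market time rather than UTC, the index would have to be converted first with tz_convert before calling between_time; which timezone applies is not stated in the question.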