我想做的事情:
我需要根据日期将第二个数据帧中的值附加到第一个数据帧,并使用panda上的ffill
函数向前填充。然而,我找不到基于DateTime索引合并数据的方法,因为索引的格式不同。
我有一个包含5分钟时间序列数据(df(的原始数据帧,我想在其上添加一个名为pivot
的列,该列的值是根据每日时间帧(day_df(计算的
原始df:
Open High Low Close Volume Symbol
Date
2022-08-23 07:00:00 21255.3 21278.8 20911.0 20955.1 44548.837 BTCUSDT
2022-08-23 08:00:00 20955.2 21078.4 20874.8 21068.9 34545.958 BTCUSDT
2022-08-23 09:00:00 21069.0 21300.0 21057.0 21297.3 34938.504 BTCUSDT
2022-08-23 10:00:00 21297.2 21511.5 21297.2 21444.2 55520.111 BTCUSDT
2022-08-23 11:00:00 21444.3 21546.5 21279.0 21430.2 47976.229 BTCUSDT
包含名为pivot的列的每日df,我需要将其加入较小的时间框架df:
Open High Low Close pivot
Datetime
2022-09-26 18803.900391 19274.873047 18721.285156 19221.037109 18843.519531
2022-09-27 19221.839844 20338.455078 18915.667969 19105.250000 19271.420573
2022-09-28 19101.373047 19688.343750 18553.296875 19427.175781 19164.024740
2022-09-29 19431.128906 19587.783203 18924.353516 19566.023438 19468.287760
2022-09-30 19573.224609 20109.314453 19265.662109 19431.542969 19535.820964
我的方法:我想过做pd.merge
,但对ffill
来说,它似乎没有相应的枢轴值:
df.merge(day_df['pivot], how = 'cross')
任何帮助或更好的解决方案都将不胜感激。
我的完整代码:
import yfinance as yf
import numpy as np
import pandas as pd
import datetime
df = yf.download(tickers='BTC-USD', period = '30d', interval = '30m')
df = df.reset_index()
#resetting df to start at midnight of next day
min_date = df.Datetime.min()
NextDay_Date = (min_date + datetime.timedelta(days=1)).replace(hour=0, minute=0, second=0, microsecond=0)
df = df[df.Datetime >= NextDay_Date].copy()
df = df.set_index('Datetime')
day_df = (df.resample('D')
.agg({'Open': 'first', 'High': 'max', 'Low': 'min', 'Close': 'last'}))
day_df['pivot'] = (df['High']+ df['Low'] + df['Close'])/3
如果将来有人遇到与我相同的问题,下面是我如何解决的:
我通过将00:00:00附加到每日时间帧day_df
df中,并尝试通过转换为日期时间格式来使用merge,暂时解决了这个问题。即通过稍微修改CCD_ 6并将HH:MM:SS附加到它。然后,使用fillna
正向填充连接的df的nan值。
day_df = day_df.reset_index()
day_df['Datetime'] = day_df['Datetime'].dt.strftime('%Y-%m-%d %H:%M:%S')
day_df['Datetime'] = pd.to_datetime(day_df['Datetime'])
day_df = day_df.set_index('Datetime')
day_df.drop(['Open', 'High', 'Low', 'Close'], axis=1, inplace=True)
#merging both dataframes
df = df.join(day_df)
df['pivot'].fillna(method='ffill', inplace=True)
以下是完整的代码:
import yfinance as yf
import numpy as np
import pandas as pd
import datetime
df = yf.download(tickers='BTC-USD', period = '30d', interval = '30m')
df = df.reset_index()
#resetting df to start at midnight of next day
min_date = df.Datetime.min()
NextDay_Date = (min_date + datetime.timedelta(days=1)).replace(hour=0, minute=0, second=0, microsecond=0)
df = df[df.Datetime >= NextDay_Date].copy()
df = df.set_index('Datetime')
# resampled daily pivots merged to smaller timeframe
day_df = (df.resample('D')
.agg({'Open': 'first', 'High': 'max', 'Low': 'min', 'Close': 'last'}))
day_df['pivot'] = (df['High']+ df['Low'] + df['Close'])/3
day_df = day_df.reset_index()
day_df['Datetime'] = day_df['Datetime'].dt.strftime('%Y-%m-%d %H:%M:%S')
day_df['Datetime'] = pd.to_datetime(day_df['Datetime'])
day_df = day_df.set_index('Datetime')
day_df.drop(['Open', 'High', 'Low', 'Close'], axis=1, inplace=True)
#merging both dataframes
df = df.join(day_df)
df['pivot'].fillna(method='ffill', inplace=True)