将列从一个数据帧复制到另一个数据框架,并基于日期时间索引对其进行联接



我想做的事情:

我需要根据日期将第二个数据帧中的值附加到第一个数据帧,并使用panda上的ffill函数向前填充。然而,我找不到基于DateTime索引合并数据的方法,因为索引的格式不同。

我有一个包含5分钟时间序列数据(df(的原始数据帧,我想在其上添加一个名为pivot的列,该列的值是根据每日时间帧(day_df(计算的

原始df:

Open    High    Low Close   Volume  Symbol
Date                        
2022-08-23 07:00:00 21255.3 21278.8 20911.0 20955.1 44548.837   BTCUSDT
2022-08-23 08:00:00 20955.2 21078.4 20874.8 21068.9 34545.958   BTCUSDT
2022-08-23 09:00:00 21069.0 21300.0 21057.0 21297.3 34938.504   BTCUSDT
2022-08-23 10:00:00 21297.2 21511.5 21297.2 21444.2 55520.111   BTCUSDT
2022-08-23 11:00:00 21444.3 21546.5 21279.0 21430.2 47976.229   BTCUSDT

包含名为pivot的列的每日df,我需要将其加入较小的时间框架df:

Open    High    Low Close   pivot
Datetime                    
2022-09-26  18803.900391    19274.873047    18721.285156    19221.037109    18843.519531
2022-09-27  19221.839844    20338.455078    18915.667969    19105.250000    19271.420573
2022-09-28  19101.373047    19688.343750    18553.296875    19427.175781    19164.024740
2022-09-29  19431.128906    19587.783203    18924.353516    19566.023438    19468.287760
2022-09-30  19573.224609    20109.314453    19265.662109    19431.542969    19535.820964

我的方法:我想过做pd.merge,但对ffill来说,它似乎没有相应的枢轴值:

df.merge(day_df['pivot], how = 'cross')

任何帮助或更好的解决方案都将不胜感激。

我的完整代码:

import yfinance as yf
import numpy as np
import pandas as pd
import datetime
df = yf.download(tickers='BTC-USD', period = '30d', interval = '30m')
df = df.reset_index()
#resetting df to start at midnight of next day
min_date = df.Datetime.min()
NextDay_Date = (min_date + datetime.timedelta(days=1)).replace(hour=0, minute=0, second=0, microsecond=0)
df = df[df.Datetime >= NextDay_Date].copy()
df = df.set_index('Datetime')
day_df = (df.resample('D')
.agg({'Open': 'first', 'High': 'max', 'Low': 'min', 'Close': 'last'}))
day_df['pivot'] = (df['High']+ df['Low'] + df['Close'])/3

如果将来有人遇到与我相同的问题,下面是我如何解决的:

我通过将00:00:00附加到每日时间帧day_dfdf中,并尝试通过转换为日期时间格式来使用merge,暂时解决了这个问题。即通过稍微修改CCD_ 6并将HH:MM:SS附加到它。然后,使用fillna正向填充连接的df的nan值。

day_df = day_df.reset_index()
day_df['Datetime'] = day_df['Datetime'].dt.strftime('%Y-%m-%d %H:%M:%S')
day_df['Datetime'] = pd.to_datetime(day_df['Datetime'])
day_df = day_df.set_index('Datetime')
day_df.drop(['Open', 'High', 'Low', 'Close'], axis=1, inplace=True)
#merging both dataframes
df = df.join(day_df)
df['pivot'].fillna(method='ffill', inplace=True)

以下是完整的代码:

import yfinance as yf
import numpy as np
import pandas as pd
import datetime
df = yf.download(tickers='BTC-USD', period = '30d', interval = '30m')
df = df.reset_index()
#resetting df to start at midnight of next day
min_date = df.Datetime.min()
NextDay_Date = (min_date + datetime.timedelta(days=1)).replace(hour=0, minute=0, second=0, microsecond=0)
df = df[df.Datetime >= NextDay_Date].copy()
df = df.set_index('Datetime')
# resampled daily pivots merged to smaller timeframe
day_df = (df.resample('D')
.agg({'Open': 'first', 'High': 'max', 'Low': 'min', 'Close': 'last'}))
day_df['pivot'] = (df['High']+ df['Low'] + df['Close'])/3
day_df = day_df.reset_index()
day_df['Datetime'] = day_df['Datetime'].dt.strftime('%Y-%m-%d %H:%M:%S')
day_df['Datetime'] = pd.to_datetime(day_df['Datetime'])
day_df = day_df.set_index('Datetime')
day_df.drop(['Open', 'High', 'Low', 'Close'], axis=1, inplace=True)
#merging both dataframes
df = df.join(day_df)
df['pivot'].fillna(method='ffill', inplace=True)

最新更新