How to resample a DataFrame whose index is half time series and half integers


                              close         0
2020-01-02 09:00:00+00:00  0.467291       NaN
2020-01-02 09:30:00+00:00  0.467267       NaN
2020-01-02 10:00:00+00:00  0.467729       NaN
2020-01-02 10:30:00+00:00  0.467923       NaN
2020-01-02 11:00:00+00:00  0.466707       NaN
...                             ...       ...
1500                            NaN  0.140868
1501                            NaN  0.136557
1502                            NaN  0.131828
1503                            NaN  0.128827
1504                            NaN  0.128978

Consider this DataFrame. Is there a way to "ffill" the time series index, so that the integer-labelled rows continue the time series?

(Note that column 0 should fill the close column "sideways".) The desired result:

                              close
2020-01-02 09:00:00+00:00  0.467291       
2020-01-02 09:30:00+00:00  0.467267       
2020-01-02 10:00:00+00:00  0.467729       
2020-01-02 10:30:00+00:00  0.467923       
2020-01-02 11:00:00+00:00  0.466707       
...                             ...       
2020-17-02 09:30:00+00:00  0.161267       
2020-17-02 10:00:00+00:00  0.165729       
2020-17-02 10:30:00+00:00  0.164923       
2020-17-02 11:00:00+00:00  0.163707       

You can split df into two DataFrames.

If you still have access to the two original DataFrames from before they were merged, you can use those directly.

Then you can re-index the second DataFrame with the dates you want and concatenate the two DataFrames properly:

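import pandas as pd

# Label of the last row that has a value in 'close', i.e. the last timestamp
last_ts = df['close'].last_valid_index()
# Timestamp-indexed part: every row up to and including last_ts
df1 = df.loc[:last_ts, ['close']]
# Integer-indexed part; 1 is the position of column 0
df2 = df.iloc[len(df1):, [1]].copy()
# Replace the integer labels with timestamps that continue the series
df2.index = pd.date_range(start=last_ts + pd.Timedelta('30 min'),
                          periods=len(df2),
                          freq='30 min')
df2.columns = ['close']
result = pd.concat([df1, df2])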

Example:

import numpy as np
import pandas as pd

df = pd.DataFrame([[1, np.nan],
                   [2, np.nan],
                   [np.nan, 4]],
                  index=list(pd.date_range(start='2022', periods=2, freq='30 min')) + [1],
                  columns=['close', 0])
                     close    0
2022-01-01 00:00:00    1.0  NaN
2022-01-01 00:30:00    2.0  NaN
1                      NaN  4.0

Result:

                     close
2022-01-01 00:00:00    1.0
2022-01-01 00:30:00    2.0
2022-01-01 01:00:00    4.0
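
For reuse, the same split, re-index and concat steps can be wrapped in a small helper. This is only a sketch: the function name `continue_time_index` and its parameters `value_col`, `tail_col` and `freq` are made up for illustration and are not part of pandas. It assumes the timestamp rows come first, the index labels are unique, and the spacing is regular and known (30 minutes here).

```python
import numpy as np
import pandas as pd


def continue_time_index(df, value_col="close", tail_col=0, freq="30min"):
    """Hypothetical helper: move the integer-labelled tail into value_col
    on a datetime index that continues from the last timestamp."""
    # Last label that still has a value in value_col (the last timestamp).
    last_ts = df[value_col].last_valid_index()
    n_head = df.index.get_loc(last_ts) + 1  # assumes unique index labels

    head = df.iloc[:n_head][[value_col]]
    tail = df.iloc[n_head:][[tail_col]].rename(columns={tail_col: value_col})

    # Build len(tail) + 1 timestamps starting at last_ts, then drop the first
    # one so the new labels begin one step after the last existing timestamp.
    tail.index = pd.date_range(start=last_ts, periods=len(tail) + 1, freq=freq)[1:]

    result = pd.concat([head, tail])
    # The original index has object dtype (mixed labels); make it a DatetimeIndex.
    result.index = pd.DatetimeIndex(result.index)
    return result


# Same toy frame as in the example above.
df = pd.DataFrame(
    [[1, np.nan], [2, np.nan], [np.nan, 4]],
    index=list(pd.date_range(start="2022", periods=2, freq="30min")) + [1],
    columns=["close", 0],
)
result = continue_time_index(df)
print(result)
```

Because the result has a proper DatetimeIndex, it can then be passed to `resample` as usual, for example `result.resample("1h").last()`.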
