Pandas: shifting unevenly spaced time series data

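I have some irregularly stamped time series data in pandas, with a timestamp and an observation at each timestamp. Irregular basically means that the timestamps are uneven, i.e. the gap between two successive timestamps is not constant.

For example, the data may look like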

    Timestamp     Property
    0                100
    1                200
    4                300
    6                400
    6                401
    7                500
    14               506
    24               550
           .....
    59               700
    61               750
    64               800

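Here the timestamp is the number of seconds elapsed since a chosen origin time. As you can see, we can have more than one observation at the same timestamp, 6 seconds in this case. Basically, the underlying timestamps are strictly different; one-second resolution simply cannot capture the difference.

Now I need to shift the time series data ahead, say by 60 seconds, or one minute. So the target output is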
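
To make the example concrete, the rows shown above can be loaded into a DataFrame. This is only a sketch: the rows hidden behind the "....." are left out rather than guessed, and the variable names `gaps` and `repeated` are just illustrative.

```python
import pandas as pd

# Only the rows visible in the question; the elided middle part is omitted.
df = pd.DataFrame({
    "Timestamp": [0, 1, 4, 6, 6, 7, 14, 24, 59, 61, 64],
    "Property": [100, 200, 300, 400, 401, 500, 506, 550, 700, 750, 800],
})

# Gaps between successive timestamps: uneven, and 0 where a timestamp repeats.
gaps = df["Timestamp"].diff()

# Rows that share their timestamp with at least one other row.
repeated = df[df["Timestamp"].duplicated(keep=False)]
print(repeated)
```

Any shifting approach has to decide what to do with the repeated rows, since a lookup by timestamp alone cannot tell them apart.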

    Timestamp     Property
    0                750
    1                800

So timestamp 0 gets matched with 61, and timestamp 1 gets matched with 64.

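Now I can do this by writing something dirty, but I would like to use built-in pandas functionality as much as possible. If the time series were regular, i.e. evenly spaced, I could just use the shift() function. But the fact that the series is uneven makes it a bit tricky. Any ideas from pandas experts would be welcome. I feel this must be a commonly encountered problem. Many thanks!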

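Edit: added a second, more elegant way to do it. I don't know what will happen if you have a timestamp of 1 and two timestamps of 61. I think it will pick the first of the two 61 rows, but I'm not sure.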

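import pandas as pd

# Build a gap-free grid of timestamps from 0 up to the maximum.
new_stamps = pd.Series(range(df['Timestamp'].max() + 1))
shifted = pd.DataFrame(new_stamps)
shifted.columns = ['Timestamp']
# The outer merge adds a NaN row for every second that has no observation.
merged = pd.merge(df, shifted, on='Timestamp', how='outer')
# Move every row 60 seconds earlier.
merged['Timestamp'] = merged['Timestamp'] - 60
# Sort by the shifted timestamps so bfill pulls in the next later observation.
merged = merged.sort_values('Timestamp').bfill()
results = pd.merge(df, merged, on='Timestamp')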
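
For what it's worth, recent pandas versions also provide `pd.merge_asof`, which joins on the nearest key of sorted data and may be closer to the built-in the question asks for. Below is a minimal sketch, not a definitive answer. It assumes the sample rows from the question (the elided rows are omitted) and that "shift by 60 seconds" means "take the first observation strictly later than Timestamp + 60", which is what the target output suggests. The names `target`, `source_ts` and `Property_Shifted` are made up for the illustration.

```python
import pandas as pd

# Only the rows visible in the question; the elided middle part is omitted.
df = pd.DataFrame({
    "Timestamp": [0, 1, 4, 6, 6, 7, 14, 24, 59, 61, 64],
    "Property": [100, 200, 300, 400, 401, 500, 506, 550, 700, 750, 800],
})

shift_amt = 60

# Left side: each original timestamp plus the time we want to look up.
left = df[["Timestamp"]].assign(target=df["Timestamp"] + shift_amt)

# Right side: the observations, renamed so the columns do not clash.
right = df.rename(columns={"Timestamp": "source_ts", "Property": "Property_Shifted"})

# Both key columns must be sorted. direction="forward" searches for the next
# key at or after the target; allow_exact_matches=False makes it strictly after.
result = pd.merge_asof(
    left.sort_values("target"),
    right.sort_values("source_ts"),
    left_on="target",
    right_on="source_ts",
    direction="forward",
    allow_exact_matches=False,
)

# Rows with no later observation get NaN; dropping them leaves the target table.
matched = result.dropna(subset=["Property_Shifted"])
print(matched[["Timestamp", "Property_Shifted"]])
```

With `allow_exact_matches=True` instead, timestamp 1 would be matched to the observation at 61 rather than the one at 64, so the flag is where the "at or after" versus "strictly after" choice gets made.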

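[Original post] I can't think of a built-in or elegant way to do this. I'm posting this in case it's more elegant than your "something dirty", which I guess is unlikely. How about: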

# Map each timestamp to its property (a later duplicate overwrites an earlier one).
lookup_dict = {}
def assigner(row):
    lookup_dict[row['Timestamp']] = row['Property']
df.apply(assigner, axis=1)
sorted_keys = sorted(lookup_dict.keys())
df['Property_Shifted'] = None
def get_shifted_property(row, shift_amt):
    # Take the first timestamp at or after the shifted target, then stop.
    for i in sorted_keys:
        if i >= row['Timestamp'] + shift_amt:
            row['Property_Shifted'] = lookup_dict[i]
            break
    return row
df = df.apply(get_shifted_property, shift_amt=60, axis=1)
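
The same lookup can probably be done without the Python-level loop. Here is a rough sketch using `numpy.searchsorted` on the sorted timestamps, again on the sample rows from the question (the elided rows are omitted). It follows the `>=` rule of the loop above, so timestamp 1 lands on the observation at 61; passing `side="right"` would give the strictly-later rule instead. The helper names `ts`, `props`, `idx` and `found` are my own.

```python
import numpy as np
import pandas as pd

# Only the rows visible in the question; the elided middle part is omitted.
df = pd.DataFrame({
    "Timestamp": [0, 1, 4, 6, 6, 7, 14, 24, 59, 61, 64],
    "Property": [100, 200, 300, 400, 401, 500, 506, 550, 700, 750, 800],
})

shift_amt = 60

# searchsorted needs sorted keys; a stable sort keeps repeated rows in order.
df = df.sort_values("Timestamp", kind="mergesort").reset_index(drop=True)
ts = df["Timestamp"].to_numpy()
props = df["Property"].to_numpy()

# For each row, index of the first timestamp >= Timestamp + shift_amt.
idx = np.searchsorted(ts, ts + shift_amt, side="left")

# An index equal to len(ts) means there is no observation that late.
found = idx < len(ts)
shifted = np.full(len(ts), np.nan)
shifted[found] = props[idx[found]]
df["Property_Shifted"] = shifted
print(df)
```

One difference from the dictionary version: when a timestamp is repeated, this picks the first of the repeated rows, whereas the dictionary keeps whichever row was written last.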
