我有一个pandas数据帧df_phys
,其结构如下例所示。我想将所有信号(在这种情况下为EngineSpeed
和WheelBasedVehicleSpeed
(重新采样到1秒频率。我可以通过以下逐个信号的环路来做到这一点:
for signal, group in df_phys.groupby("Signal")["Physical Value"]:
df_signal = group.to_frame()
df_signal = df_signal.resample("1S").pad().dropna()
然而,我更喜欢对整个数据帧重新采样(Physical Value
是应该重新采样的值(,而不是在逐个信号的循环中这样做。这有可能实现吗?
df_phys数据结构(预重采样(
TimeStamp,CAN ID,PGN,Source Address,Signal,Raw Value,Physical Value
2020-01-13 14:47:09.816750+00:00,217056256,61444,0,EngineSpeed,13596,1699.5
2020-01-13 14:47:09.826750+00:00,217056256,61444,0,EngineSpeed,13554,1694.25
2020-01-13 14:47:09.836800+00:00,217056256,61444,0,EngineSpeed,13495,1686.875
2020-01-13 14:47:09.846750+00:00,217056256,61444,0,EngineSpeed,13418,1677.25
2020-01-13 14:47:09.856850+00:00,217056256,61444,0,EngineSpeed,13331,1666.375
2020-01-13 14:47:09.867200+00:00,217056256,61444,0,EngineSpeed,13228,1653.5
2020-01-13 14:47:09.868950+00:00,419361024,65265,0,WheelBasedVehicleSpeed,4864,19.0
2020-01-13 14:47:09.876850+00:00,217056256,61444,0,EngineSpeed,13117,1639.625
2020-01-13 14:47:09.886800+00:00,217056256,61444,0,EngineSpeed,13004,1625.5
2020-01-13 14:47:09.896800+00:00,217056256,61444,0,EngineSpeed,12893,1611.625
2020-01-13 14:47:09.907300+00:00,217056256,61444,0,EngineSpeed,12814,1601.75
2020-01-13 14:47:09.916750+00:00,217056256,61444,0,EngineSpeed,12730,1591.25
2020-01-13 14:47:09.926750+00:00,217056256,61444,0,EngineSpeed,12663,1582.875
2020-01-13 14:47:09.936850+00:00,217056256,61444,0,EngineSpeed,12605,1575.625
2020-01-13 14:47:09.946800+00:00,217056256,61444,0,EngineSpeed,12552,1569.0
2020-01-13 14:47:09.956900+00:00,217056256,61444,0,EngineSpeed,12506,1563.25
2020-01-13 14:47:09.967300+00:00,217056256,61444,0,EngineSpeed,12452,1556.5
2020-01-13 14:47:09.969000+00:00,419361024,65265,0,WheelBasedVehicleSpeed,4946,19.3203125
2020-01-13 14:47:09.976850+00:00,217056256,61444,0,EngineSpeed,12393,1549.125
2020-01-13 14:47:09.986850+00:00,217056256,61444,0,EngineSpeed,12356,1544.5
2020-01-13 14:47:09.996800+00:00,217056256,61444,0,EngineSpeed,12303,1537.875
2020-01-13 14:47:10.007900+00:00,217056256,61444,0,EngineSpeed,12241,1530.125
2020-01-13 14:47:10.017200+00:00,217056256,61444,0,EngineSpeed,12187,1523.375
2020-01-13 14:47:10.026900+00:00,217056256,61444,0,EngineSpeed,12147,1518.375
2020-01-13 14:47:10.036900+00:00,217056256,61444,0,EngineSpeed,12095,1511.875
2020-01-13 14:47:10.046800+00:00,217056256,61444,0,EngineSpeed,12040,1505.0
2020-01-13 14:47:10.056950+00:00,217056256,61444,0,EngineSpeed,11983,1497.875
2020-01-13 14:47:10.067400+00:00,217056256,61444,0,EngineSpeed,11937,1492.125
2020-01-13 14:47:10.069150+00:00,419361024,65265,0,WheelBasedVehicleSpeed,5222,20.3984375
您似乎正在尝试使用groupby
对数据进行重新采样。如果是这样的话,当将TimeStamp
设置为索引时,您可以简单地对groupby
对象重新采样:
import pandas as pd
df_phys = pd.read_csv("test.txt", sep = ",", parse_dates=["TimeStamp"], index_col="TimeStamp")
df_res = df_phys.groupby("Signal").resample("1S").mean()
print(df_res)
样本输出:
CAN ID ... Physical Value
Signal TimeStamp ...
EngineSpeed 2020-01-13 14:47:09+00:00 217056256.0 ... 1611.907895
2020-01-13 14:47:10+00:00 217056256.0 ... 1511.250000
WheelBasedVehicleSpeed 2020-01-13 14:47:09+00:00 419361024.0 ... 19.160156
2020-01-13 14:47:10+00:00 419361024.0 ... 20.398438
[4 rows x 5 columns]