How to combine three columns into a single timestamp column in pandas



I have a DataFrame that looks like this:

   Hour  Minute  Second  Value
0  14.0    57.0    17.0    0.0
1  14.0    57.0    18.0    0.0
2  14.0    57.0    19.0  138.6
3  14.0    57.0    20.0  138.6
4  14.0    57.0    21.0  138.6
5  14.0    57.0    22.0  138.6

I want to combine the Hour/Minute/Second columns into a timestamp index. I have a date object. I managed to do this with df.apply and datetime.datetime.combine(mydate, datetime.time(hour, min, sec)), but it is too slow.

Is there an efficient way to do this with built-in pandas functionality?
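For reference, the row-wise approach described in the question might look roughly like the sketch below. The sample data, the date value and the helper name combine_row are assumptions made for illustration; the original code was not shown. It calls a Python function once per row, which is the likely reason it scales poorly.

```python
import datetime

import pandas as pd

# Small sample shaped like the data in the question (values are illustrative).
df = pd.DataFrame({
    "Hour": [14.0, 14.0],
    "Minute": [57.0, 57.0],
    "Second": [17.0, 18.0],
    "Value": [0.0, 0.0],
})
mydate = datetime.date(2020, 2, 5)  # assumed date, not given in the question


def combine_row(row):
    # Hypothetical helper: datetime.time needs ints, but the columns hold floats.
    t = datetime.time(int(row["Hour"]), int(row["Minute"]), int(row["Second"]))
    return datetime.datetime.combine(mydate, t)


# One Python-level call per row, so the cost grows with the number of rows.
df.index = pd.DatetimeIndex(df.apply(combine_row, axis=1))
```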

The idea is to multiply Hour by 10000 and Minute by 100, add Second, then prepend a date string and parse the result with to_datetime:

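# Encode each time as the number HHMMSS, e.g. 145717.0
s = df['Hour'].mul(10000) + df['Minute'].mul(100) + df['Second']
# Prepend the date and parse; %f absorbs the ".0" left over from the float
df['date'] = pd.to_datetime('2015-01-01 ' + s.astype(str), format='%Y-%m-%d %H%M%S.%f')
print(df)
   Hour  Minute  Second  Value                date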
0  14.0    57.0    17.0    0.0 2015-01-01 14:57:17
1  14.0    57.0    18.0    0.0 2015-01-01 14:57:18
2  14.0    57.0    19.0  138.6 2015-01-01 14:57:19
3  14.0    57.0    20.0  138.6 2015-01-01 14:57:20
4  14.0    57.0    21.0  138.6 2015-01-01 14:57:21
5  14.0    57.0    22.0  138.6 2015-01-01 14:57:22
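Because the HHMMSS value is turned into a string without leading zeros, times before 10:00 may be parsed ambiguously (01:05:12 becomes "10512.0"). A minimal sketch of a zero-padded variant follows; it assumes whole seconds and no missing values, and the sample rows are made up for illustration.

```python
import pandas as pd

# Illustrative rows, including times before 10:00.
df = pd.DataFrame({
    "Hour": [14.0, 1.0, 0.0],
    "Minute": [57.0, 5.0, 5.0],
    "Second": [17.0, 12.0, 12.0],
})

# Encode each time as the integer HHMMSS (assumes whole seconds, no NaN).
s = (df["Hour"] * 10000 + df["Minute"] * 100 + df["Second"]).astype(int)

# Zero-pad to six digits so 01:05:12 becomes "010512" rather than "10512".
padded = s.astype(str).str.zfill(6)

df["date"] = pd.to_datetime("2015-01-01 " + padded, format="%Y-%m-%d %H%M%S")
```

With fixed-width fields, each of %H, %M and %S always consumes exactly two digits.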

Another option is to multiply Hour and Minute by the corresponding number of seconds, convert the sum to a timedelta, and add it to the date:

mydate = pd.to_datetime('2020-02-05')
# Seconds since midnight, converted to a timedelta and added to the date
df['timestamp'] = pd.to_timedelta(df.Hour*3600 + df.Minute*60 + df.Second,
                                  unit='s').add(mydate)

Output:

   Hour  Minute  Second  Value           timestamp
0  14.0    57.0    17.0    0.0 2020-02-05 14:57:17
1  14.0    57.0    18.0    0.0 2020-02-05 14:57:18
2  14.0    57.0    19.0  138.6 2020-02-05 14:57:19
3  14.0    57.0    20.0  138.6 2020-02-05 14:57:20
4  14.0    57.0    21.0  138.6 2020-02-05 14:57:21
5  14.0    57.0    22.0  138.6 2020-02-05 14:57:22

0   2020-02-05 14:57:17
1   2020-02-05 14:57:18
2   2020-02-05 14:57:19
3   2020-02-05 14:57:20
4   2020-02-05 14:57:21
5   2020-02-05 14:57:22
dtype: datetime64[ns]
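Since the question asks for a timestamp index rather than a column, the same timedelta arithmetic can be assigned to the index directly. This is a sketch under assumptions: the sample rows, the index name and the decision to drop the three source columns are illustrative choices, not part of the original answer.

```python
import pandas as pd

# Illustrative rows shaped like the data in the question.
df = pd.DataFrame({
    "Hour": [14.0, 14.0],
    "Minute": [57.0, 57.0],
    "Second": [17.0, 19.0],
    "Value": [0.0, 138.6],
})
mydate = pd.Timestamp("2020-02-05")

# Seconds since midnight for every row, computed in one vectorized step.
seconds = df["Hour"] * 3600 + df["Minute"] * 60 + df["Second"]

# Date plus timedelta gives the full timestamps; use them as the index.
df.index = pd.DatetimeIndex(mydate + pd.to_timedelta(seconds, unit="s"))
df.index.name = "timestamp"

# Optional: the component columns are redundant once the index exists.
df = df.drop(columns=["Hour", "Minute", "Second"])
```

This keeps everything numeric until the final addition, so no string formatting or parsing is involved.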
