I have a DataFrame that looks like this:
Hour Minute Second Value
0 14.0 57.0 17.0 0.0
1 14.0 57.0 18.0 0.0
2 14.0 57.0 19.0 138.6
3 14.0 57.0 20.0 138.6
4 14.0 57.0 21.0 138.6
5 14.0 57.0 22.0 138.6
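I would like to combine the Hour/Minute/Second columns into a timestamp index. I already have a date object. I managed to do this with df.apply and datetime.datetime.combine(mydate, datetime.time(hour, min, sec)), but it is far too slow.
Is there an efficient way to do this with built-in pandas functionality?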
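For reference, here is a minimal sketch of what the slow row-wise approach described above might look like. The question does not show its code, so the helper name combine_row and the date 2020-02-05 are assumptions made for illustration only.

```python
import datetime

import pandas as pd

# Sample frame rebuilt from the table in the question.
df = pd.DataFrame({
    "Hour": [14.0] * 6,
    "Minute": [57.0] * 6,
    "Second": [17.0, 18.0, 19.0, 20.0, 21.0, 22.0],
    "Value": [0.0, 0.0, 138.6, 138.6, 138.6, 138.6],
})

# Assumed date; the question only says that a date object exists.
mydate = datetime.date(2020, 2, 5)


def combine_row(row):
    """Hypothetical helper: build one datetime from one row (slow, pure Python)."""
    return datetime.datetime.combine(
        mydate,
        datetime.time(int(row["Hour"]), int(row["Minute"]), int(row["Second"])),
    )


# axis=1 calls combine_row once per row, which is why this scales poorly.
slow_timestamps = df.apply(combine_row, axis=1)
```

Each row goes through a Python-level function call here, which is the cost the vectorized answers avoid.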
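The idea is to multiply Hour and Minute by 10000 and 100 respectively, add Second, and pass the result, prefixed with the date string, to to_datetime: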
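# Encode each time as the number HHMMSS, e.g. 14:57:17 -> 145717.0
s = df['Hour'].mul(10000) + df['Minute'].mul(100) + df['Second']
# Prepend the date and parse; the trailing '.%f' matches the '.0' left by the float columns
df['date'] = pd.to_datetime('2015-01-01 ' + s.astype(str), format='%Y-%m-%d %H%M%S.%f')
print(df)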
Hour Minute Second Value date
0 14.0 57.0 17.0 0.0 2015-01-01 14:57:17
1 14.0 57.0 18.0 0.0 2015-01-01 14:57:18
2 14.0 57.0 19.0 138.6 2015-01-01 14:57:19
3 14.0 57.0 20.0 138.6 2015-01-01 14:57:20
4 14.0 57.0 21.0 138.6 2015-01-01 14:57:21
5 14.0 57.0 22.0 138.6 2015-01-01 14:57:22
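The string approach depends on how the float sum renders as text, and the sample data only covers a two-digit hour. As a hedged variant (not part of the answer above), the sketch below casts the HHMMSS number to an integer and zero-pads it to six digits so that early-morning times keep a fixed width. It assumes whole seconds and no missing values; the sample rows are made up for illustration.

```python
import pandas as pd

# Illustrative rows, including a single-digit hour, minute and second.
df = pd.DataFrame({
    "Hour": [9.0, 14.0],
    "Minute": [5.0, 57.0],
    "Second": [7.0, 17.0],
})

# 9:05:07 -> 90507.0 -> 90507 -> "090507"; 14:57:17 -> "145717".
hhmmss = (
    (df["Hour"] * 10000 + df["Minute"] * 100 + df["Second"])
    .astype(int)      # assumes whole seconds and no NaN
    .astype(str)
    .str.zfill(6)     # fixed width so %H, %M and %S each see two digits
)

df["date"] = pd.to_datetime("2015-01-01 " + hhmmss, format="%Y-%m-%d %H%M%S")
```

The fixed-width string removes any ambiguity about where the hour ends and the minute begins.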
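Another option is to multiply Hour and Minute by their respective number of seconds (3600 and 60), convert the sum to a timedelta, and add it to the date: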
mydate = pd.to_datetime('2020-02-05')
# Seconds since midnight, converted to a timedelta and offset by the date
df['timestamp'] = pd.to_timedelta(df.Hour*3600 + df.Minute*60 + df.Second,
                                  unit='sec').add(mydate)
Output:
Hour Minute Second Value timestamp
0 14.0 57.0 17.0 0.0 2020-02-05 14:57:17
1 14.0 57.0 18.0 0.0 2020-02-05 14:57:18
2 14.0 57.0 19.0 138.6 2020-02-05 14:57:19
3 14.0 57.0 20.0 138.6 2020-02-05 14:57:20
4 14.0 57.0 21.0 138.6 2020-02-05 14:57:21
5 14.0 57.0 22.0 138.6 2020-02-05 14:57:22
The new column on its own is a datetime Series:
0 2020-02-05 14:57:17
1 2020-02-05 14:57:18
2 2020-02-05 14:57:19
3 2020-02-05 14:57:20
4 2020-02-05 14:57:21
5 2020-02-05 14:57:22
dtype: datetime64[ns]
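The question asks for a timestamp index, while the answers above add a column. A minimal sketch of the remaining step, using the timedelta approach and assuming the component columns are no longer needed afterwards (the name indexed and the choice to drop the columns are illustrative, not from the answers):

```python
import pandas as pd

# Sample frame rebuilt from the table in the question.
df = pd.DataFrame({
    "Hour": [14.0] * 6,
    "Minute": [57.0] * 6,
    "Second": [17.0, 18.0, 19.0, 20.0, 21.0, 22.0],
    "Value": [0.0, 0.0, 138.6, 138.6, 138.6, 138.6],
})
mydate = pd.Timestamp("2020-02-05")

# Vectorized: seconds since midnight -> timedelta -> absolute timestamps.
seconds = df["Hour"] * 3600 + df["Minute"] * 60 + df["Second"]
timestamps = mydate + pd.to_timedelta(seconds, unit="s")

# Replace the default integer index with a DatetimeIndex and
# drop the now-redundant component columns.
indexed = (
    df.drop(columns=["Hour", "Minute", "Second"])
      .set_index(pd.DatetimeIndex(timestamps, name="timestamp"))
)
```

With a DatetimeIndex in place, time-based selection such as indexed.loc["2020-02-05 14:57:19":] should work directly.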