Removing decimal values from minutes and seconds in pandas



I have a DataFrame as follows:

import pandas as pd

dict1 = {'time': ['2 min 19 sec', '2 min 43 sec', '1 min 33 sec', '32 sec', '40 sec',
                  '22 sec', '2.3 sec', '3.2 min 13 sec', '4.9 min 7.6 sec']}
df = pd.DataFrame(dict1)
df

time
0   2 min 19 sec
1   2 min 43 sec
2   1 min 33 sec
3   32 sec
4   40 sec
5   22 sec
6   2.3 sec
7   3.2 min 13 sec
8   4.9 min 7.6 sec

I need to produce the output below: the decimal part of min should be added to sec, and the decimal part of sec should be dropped.

time
0   2 min 19 sec
1   2 min 43 sec
2   1 min 33 sec
3   0 min 32 sec
4   0 min 40 sec
5   0 min 22 sec
6   0 min 2 sec
7   3 min 15 sec
8   4 min 16 sec

Try using Series.str.extract with a regular-expression pattern to pull out the numeric values.

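Add the fractional part of the minutes to the seconds (scaled by 10, so 3.2 min contributes 2 sec), then use a list comprehension to format the desired result: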

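# Capture the optional minutes and the seconds as floats; missing minutes become 0.
vals = df['time'].str.extract(r'^(?:(\S+?) min )?(\S+?) sec').astype(float).fillna(0)
# Add the fractional part of the minutes, times 10, to the seconds.
vals[1] += vals[0].mod(1).mul(10)
# int() truncates, which drops whatever fraction is left on either value.
df['time_corrected'] = [f'{int(m)} min {int(s)} sec' for m, s in vals.apply(tuple, axis=1)]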

[out]

               time time_corrected
0     2 min 19 sec   2 min 19 sec
1     2 min 43 sec   2 min 43 sec
2     1 min 33 sec   1 min 33 sec
3           32 sec   0 min 32 sec
4           40 sec   0 min 40 sec
5           22 sec   0 min 22 sec
6          2.3 sec    0 min 2 sec
7   3.2 min 13 sec   3 min 15 sec
8  4.9 min 7.6 sec   4 min 16 sec
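
The same rule can also be written as a plain function, which may be easier to test in isolation. The following is only a sketch: normalize_duration is a hypothetical helper name, not part of pandas, and it swaps float for decimal.Decimal on the assumption that exact decimal arithmetic is wanted (with binary floats, a fraction such as 0.3 can end up slightly below its nominal value and truncate one second short).

```python
import re
from decimal import Decimal

# Optional "<number> min " prefix followed by "<number> sec".
_PATTERN = re.compile(r'^(?:(\S+?) min )?(\S+?) sec$')


def normalize_duration(text):
    """Return 'M min S sec' with whole numbers only (hypothetical helper)."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f'unrecognised duration: {text!r}')
    # A missing minutes group is treated as 0 minutes.
    minutes = Decimal(match.group(1) or '0')
    seconds = Decimal(match.group(2))
    # Same rule as the answer above: the fractional part of the minutes,
    # multiplied by 10, is added to the seconds (3.2 min -> +2 sec).
    seconds += (minutes % 1) * 10
    # int() truncates toward zero, dropping any remaining fraction.
    return f'{int(minutes)} min {int(seconds)} sec'


samples = ['2 min 19 sec', '32 sec', '2.3 sec', '3.2 min 13 sec', '4.9 min 7.6 sec']
results = [normalize_duration(s) for s in samples]
```

Assuming the df from the question, it could be applied with df['time'].map(normalize_duration).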
