How to find the time difference based on another column



I have a dataset like this:

user-id      date-time                   msg
  1          2016-12-09 10:25:00          1
  2          2016-12-09 10:26:00          0
  3          2016-12-09 10:26:00          1
  2          2016-12-09 10:27:00          1
  1          2016-12-09 10:28:00          2
  2          2016-12-09 10:28:00          1
  3          2016-12-09 10:29:00          2
  2          2016-12-09 10:29:00          1
  1          2016-12-09 10:30:00          3
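
To reproduce the table above, here is a minimal sketch of building it in pandas. It assumes the date-time column should be parsed with pd.to_datetime, so that subtracting two values gives a timedelta rather than failing on strings. The variable name df is only illustrative.

```python
import pandas as pd

# Sample data copied from the table above.
df = pd.DataFrame({
    "user-id": [1, 2, 3, 2, 1, 2, 3, 2, 1],
    "date-time": pd.to_datetime([
        "2016-12-09 10:25:00", "2016-12-09 10:26:00", "2016-12-09 10:26:00",
        "2016-12-09 10:27:00", "2016-12-09 10:28:00", "2016-12-09 10:28:00",
        "2016-12-09 10:29:00", "2016-12-09 10:29:00", "2016-12-09 10:30:00",
    ]),
    "msg": [1, 0, 1, 1, 2, 1, 2, 1, 3],
})

# The date-time column should now have a datetime dtype, not object.
print(df.dtypes)
```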

I want a new column that gives, for each record, the time elapsed since the first record with the same msg value. Something like this:

 user-id      date-time                  msg        time-difference
  1          2016-12-09 10:25:00          1            00:00
  2          2016-12-09 10:26:00          0            00:00
  3          2016-12-09 10:26:00          1            01:00
  2          2016-12-09 10:27:00          1            02:00
  1          2016-12-09 10:28:00          2            00:00
  2          2016-12-09 10:28:00          1            03:00
  3          2016-12-09 10:29:00          2            01:00
  2          2016-12-09 10:29:00          1            04:00
  1          2016-12-09 10:30:00          3            00:00

I found solutions that only consider the date-time column, or that use loc or iloc, but they don't work for this case.

Option #1

Use groupby with iloc

df['time-difference'] = df.groupby('msg', group_keys=False)['date-time'].apply(lambda x: x - x.iloc[0])

Output:

   user-id           date-time  msg time-difference
0        1 2016-12-09 10:25:00    1        00:00:00
1        2 2016-12-09 10:26:00    0        00:00:00
2        3 2016-12-09 10:26:00    1        00:01:00
3        2 2016-12-09 10:27:00    1        00:02:00
4        1 2016-12-09 10:28:00    2        00:00:00
5        2 2016-12-09 10:28:00    1        00:03:00
6        3 2016-12-09 10:29:00    2        00:01:00
7        2 2016-12-09 10:29:00    1        00:04:00
8        1 2016-12-09 10:30:00    3        00:00:00
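
As a self-contained sketch of Option #1, the snippet below rebuilds the sample data and applies the per-group subtraction. One assumption worth flagging: on recent pandas versions (2.0 and later, as far as I know) apply adds the group key to the result index by default, so group_keys=False is passed here to keep the original row index and let the assignment align.

```python
import pandas as pd

df = pd.DataFrame({
    "user-id": [1, 2, 3, 2, 1, 2, 3, 2, 1],
    "date-time": pd.to_datetime([
        "2016-12-09 10:25:00", "2016-12-09 10:26:00", "2016-12-09 10:26:00",
        "2016-12-09 10:27:00", "2016-12-09 10:28:00", "2016-12-09 10:28:00",
        "2016-12-09 10:29:00", "2016-12-09 10:29:00", "2016-12-09 10:30:00",
    ]),
    "msg": [1, 0, 1, 1, 2, 1, 2, 1, 3],
})

# For each msg group, subtract the group's first timestamp (in row order)
# from every timestamp in that group. group_keys=False keeps the original
# index so the result lines up with df on assignment.
df["time-difference"] = (
    df.groupby("msg", group_keys=False)["date-time"]
      .apply(lambda x: x - x.iloc[0])
)

print(df)
```

Note that x.iloc[0] is the first row of each group in its current order, so this relies on the frame being sorted by date-time.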

Option #2

Use groupby with transform and 'first' (or 'min')

df['time-difference'] = df['date-time'] - df.groupby('msg')['date-time'].transform('first')

Output:

   user-id           date-time  msg time-difference
0        1 2016-12-09 10:25:00    1        00:00:00
1        2 2016-12-09 10:26:00    0        00:00:00
2        3 2016-12-09 10:26:00    1        00:01:00
3        2 2016-12-09 10:27:00    1        00:02:00
4        1 2016-12-09 10:28:00    2        00:00:00
5        2 2016-12-09 10:28:00    1        00:03:00
6        3 2016-12-09 10:29:00    2        00:01:00
7        2 2016-12-09 10:29:00    1        00:04:00
8        1 2016-12-09 10:30:00    3        00:00:00
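
Here is a sketch of the 'min' variant of Option #2, with an extra formatting step to get the MM:SS strings shown in the question. Using 'min' instead of 'first' does not depend on row order, which may be safer if the data is not sorted by date-time. The MM:SS formatting and the helper variable names (first_seen, diff, secs) are my own assumptions and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "user-id": [1, 2, 3, 2, 1, 2, 3, 2, 1],
    "date-time": pd.to_datetime([
        "2016-12-09 10:25:00", "2016-12-09 10:26:00", "2016-12-09 10:26:00",
        "2016-12-09 10:27:00", "2016-12-09 10:28:00", "2016-12-09 10:28:00",
        "2016-12-09 10:29:00", "2016-12-09 10:29:00", "2016-12-09 10:30:00",
    ]),
    "msg": [1, 0, 1, 1, 2, 1, 2, 1, 3],
})

# Earliest timestamp per msg value, broadcast back to every row.
first_seen = df.groupby("msg")["date-time"].transform("min")
diff = df["date-time"] - first_seen

# Format the timedelta as MM:SS (minutes are not wrapped at 60,
# so a 75-minute gap would print as "75:00").
secs = diff.dt.total_seconds().astype(int)
df["time-difference"] = (
    (secs // 60).astype(str).str.zfill(2)
    + ":"
    + (secs % 60).astype(str).str.zfill(2)
)

print(df)
```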
