Pandas: comparing a timezone-aware datetime field by time of day



I have a datetime object like this:

df.iloc[:10]
0   2019-03-05 00:45:36.503277422+08:00
1   2019-03-05 00:46:36.404034571+08:00
2   2019-03-05 00:47:36.434888822+08:00
3   2019-03-05 00:48:36.535496247+08:00
4   2019-03-05 00:49:36.512082457+08:00
5   2019-03-05 00:50:36.515718466+08:00
6   2019-03-05 00:51:36.520325894+08:00
7   2019-03-05 00:52:36.523945647+08:00
8   2019-03-05 00:53:36.548567617+08:00
9   2019-03-05 00:54:36.740268213+08:00
Name: Date-Time, dtype: datetime64[ns, Asia/Shanghai]

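I want to retrieve all rows whose time is later than 08:00:00 Asia/Shanghai time, which is the same as later than 00:00:00 UTC. I have two questions:

  1. How do I write the condition in local time (Shanghai) rather than in UTC? Only df[df>'2019-03-05 00:00:00'] returns True. If I use df[df>'2019-03-05 08:00:00'], everything is False.

  2. How can I use only the time, without having to prepend a date? I don't want to write df[df>'2019-03-05 00:00:00'], but just df[df>'00:00:00'].

Thanks a lot!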

You can add timezone information to the scalar datetime and then compare:

# Treat the naive timestamp as UTC, then convert it to Shanghai time
date = pd.to_datetime('2015-02-24').tz_localize('UTC').tz_convert('Asia/Shanghai')
print (date)
2015-02-24 08:00:00+08:00

Or:

date = pd.Timestamp('2015-02-24 08:00:00+08:00')

print (df[df > date])
0   2019-03-05 00:45:36.503277422+08:00
1   2019-03-05 00:46:36.404034571+08:00
2   2019-03-05 00:47:36.434888822+08:00
3   2019-03-05 00:48:36.535496247+08:00
4   2019-03-05 00:49:36.512082457+08:00
5   2019-03-05 00:50:36.515718466+08:00
6   2019-03-05 00:51:36.520325894+08:00
7   2019-03-05 00:52:36.523945647+08:00
8   2019-03-05 00:53:36.548567617+08:00
9   2019-03-05 00:54:36.740268213+08:00
Name: Date-Time, dtype: datetime64[ns, Asia/Shanghai]
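
As a minimal, self-contained sketch of the comparison above: the sample values and the variable names (`s`, `threshold`, `threshold_utc`) are made up for illustration and are not taken from the question. It builds the threshold as an explicitly timezone-aware `pd.Timestamp`, which avoids relying on how a timezone-naive string is interpreted.

```python
import pandas as pd

# Hypothetical sample data standing in for the Series in the question.
s = pd.Series(
    pd.to_datetime(
        ["2019-03-05 07:59:00", "2019-03-05 08:00:30", "2019-03-05 09:15:00"]
    ).tz_localize("Asia/Shanghai"),
    name="Date-Time",
)

# Threshold written in Shanghai wall-clock time.
threshold = pd.Timestamp("2019-03-05 08:00:00", tz="Asia/Shanghai")

# The same instant written in UTC; aware timestamps compare by instant.
threshold_utc = pd.Timestamp("2019-03-05 00:00:00", tz="UTC")

later = s[s > threshold]
later_utc = s[s > threshold_utc]
print(later)
```

Both thresholds select the same rows here, because timezone-aware values are compared as points in time rather than by their printed wall-clock text.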

And for the second question, compare times:

from datetime import time
# .dt.time extracts the time-of-day part of each timestamp
print (df[df.dt.time > time(0,0,0)])
0   2019-03-05 00:45:36.503277422+08:00
1   2019-03-05 00:46:36.404034571+08:00
2   2019-03-05 00:47:36.434888822+08:00
3   2019-03-05 00:48:36.535496247+08:00
4   2019-03-05 00:49:36.512082457+08:00
5   2019-03-05 00:50:36.515718466+08:00
6   2019-03-05 00:51:36.520325894+08:00
7   2019-03-05 00:52:36.523945647+08:00
8   2019-03-05 00:53:36.548567617+08:00
9   2019-03-05 00:54:36.740268213+08:00
Name: Date-Time, dtype: datetime64[ns, Asia/Shanghai]
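
A small sketch of the same idea with the 08:00 cutoff from the question. The sample data is hypothetical. For a timezone-aware Series, `.dt.time` returns naive `datetime.time` objects in the Series' own timezone, so the cutoff is written in Shanghai wall-clock time and no date is needed.

```python
from datetime import time

import pandas as pd

# Hypothetical sample data spanning two days.
s = pd.Series(
    pd.to_datetime(
        ["2019-03-05 07:59:00", "2019-03-05 08:00:30", "2019-03-06 09:15:00"]
    ).tz_localize("Asia/Shanghai"),
    name="Date-Time",
)

# Local (Shanghai) time of day for each row, as datetime.time objects.
local_times = s.dt.time

# Keep rows whose local time of day is after 08:00:00, on any date.
after_eight = s[local_times > time(8, 0, 0)]
print(after_eight)
```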

Or with timedeltas:

print (df[pd.to_timedelta(df.dt.strftime('%H:%M:%S')) > '00:00:00'])
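
An alternative sketch, not part of the original answer, that skips the string round-trip by subtracting each row's local midnight. The sample data and the names `since_midnight` and `after_eight` are assumptions for illustration. Note that this measures elapsed time since local midnight, which can differ from the wall-clock time on days with a daylight-saving transition, so it may not suit every timezone.

```python
import pandas as pd

# Hypothetical sample data spanning two days.
s = pd.Series(
    pd.to_datetime(
        ["2019-03-05 07:59:00", "2019-03-05 08:00:30", "2019-03-06 09:15:00"]
    ).tz_localize("Asia/Shanghai"),
    name="Date-Time",
)

# normalize() sets each timestamp to midnight while keeping the timezone,
# so the difference is the time elapsed since local midnight.
since_midnight = s - s.dt.normalize()

# Keep rows more than 8 hours past local midnight.
after_eight = s[since_midnight > pd.Timedelta(hours=8)]
print(after_eight)
```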
