Date offset from years to months and weeks



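I'm pulling chunks of data over a period of time. The dates and times come from the column recvd_dttm. The code below takes all the data starting from one year back. I want to modify it so that it can pull a month or a day instead, but pd.DateOffset(months=1) raises KeyError: 1. If I change it to days=7, I get the same error. It works fine with years=1, though. What's going on?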

import datetime
import pandas as pd

df = pd.read_csv('MYDATA.csv')
# parse the received date/time column
df['recvd_dttm'] = pd.to_datetime(df['recvd_dttm'])
# Only retrieve data before now (ignore typos that are future dates)
mask = df['recvd_dttm'] <= datetime.datetime.now()
df = df.loc[mask]
# get first and last datetime for the final year of data
range_max = df['recvd_dttm'].max()
range_min = range_max - pd.DateOffset(years=1)
# take slice with the final year of data
df = df[(df['recvd_dttm'] >= range_min) &
        (df['recvd_dttm'] <= range_max)]

Edit: the problem came from somewhere else in the code!
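For reference, when subtracted from a single Timestamp, pd.DateOffset accepts months and days the same way it accepts years, which fits with the error originating elsewhere. A minimal sketch, assuming a made-up timestamp in place of df['recvd_dttm'].max():

```python
import pandas as pd

# Hypothetical stand-in for df['recvd_dttm'].max()
range_max = pd.Timestamp("2015-07-14 16:52:58")

# The same subtraction with three different calendar units
one_year_back = range_max - pd.DateOffset(years=1)
one_month_back = range_max - pd.DateOffset(months=1)
one_week_back = range_max - pd.DateOffset(days=7)

print(one_year_back)
print(one_month_back)
print(one_week_back)
```

If a snippet like this runs cleanly but the full script still raises KeyError, the failing line is probably a lookup somewhere else, not the offset arithmetic.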

Have you tried being more explicit about what pd.DateOffset is acting on?

For example:

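range_max = df['recvd_dttm'].max()
# note: this subtracts a whole Series from a Timestamp, so range_min
# becomes a Series of Timedelta values rather than a single Timestamp
range_min = range_max - (df['recvd_dttm']+pd.DateOffset(years=1))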

You can use the family of offsets in pd.tseries.offsets. Here is some sample code:

import pandas as pd
import datetime
# your data
# ================================
df = pd.read_csv('/home/Jian/Downloads/MOCK_DATA.csv', usecols=[1, 4])
df['recvd_dttm'] = pd.to_datetime(df['recvd_dttm'])
mask = df['recvd_dttm'] <= datetime.datetime.now()
df = df.loc[mask]

# flexible offsets
# =======================================
range_max = df['recvd_dttm'].max()
print(range_max)
# 2015-07-14 16:52:58
# for 1 month: currently there is a bug
# range_min_month = range_max - pd.tseries.offsets.MonthOffset(1)
# for 1 week
range_min_week = range_max - pd.tseries.offsets.Week(1)
print(range_min_week)
# 2015-07-07 16:52:58
# for 5 days
range_min_day = range_max - pd.tseries.offsets.Day(5)
print(range_min_day)
# 2015-07-09 16:52:58
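Since only the offset changes between the year, month, week, and day cases, the slice can be wrapped in a small helper that takes the offset as an argument. This is a sketch only: last_window is a hypothetical name, and the sample rows are made up.

```python
import pandas as pd
from pandas.tseries.offsets import Day, Week


def last_window(df, column, offset):
    """Return the rows whose `column` lies within `offset` of the latest value."""
    range_max = df[column].max()
    range_min = range_max - offset
    return df[(df[column] >= range_min) & (df[column] <= range_max)]


# Made-up sample data; the column name follows the question
df = pd.DataFrame({
    "recvd_dttm": pd.to_datetime([
        "2015-06-01 09:00:00",
        "2015-07-08 10:00:00",
        "2015-07-12 11:00:00",
        "2015-07-14 16:52:58",
    ])
})

last_week = last_window(df, "recvd_dttm", Week(1))
last_5_days = last_window(df, "recvd_dttm", Day(5))
last_month = last_window(df, "recvd_dttm", pd.DateOffset(months=1))

print(len(last_week), len(last_5_days), len(last_month))
```

Passing the offset in keeps the filtering logic in one place, so switching from a year to a month or a week is a change at the call site only.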

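Have you considered using Unix epoch time instead of formatted dates? There is a well-documented answer on converting to Unix time, and the offsets in the question seem much easier to handle that way, because a sliding range is simpler to implement over a more or less continuous sequence of numeric values.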
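A rough sketch of that idea, assuming the timestamps are naive and can be treated as UTC. The recvd_epoch column, the SECONDS_PER_DAY constant, and the sample rows are made up for illustration:

```python
import pandas as pd

# Made-up sample data; the column name follows the question
df = pd.DataFrame({
    "recvd_dttm": pd.to_datetime([
        "2015-07-01 08:00:00",
        "2015-07-10 12:30:00",
        "2015-07-14 16:52:58",
    ])
})

# Whole seconds elapsed since 1970-01-01 00:00:00
epoch = pd.Timestamp("1970-01-01")
df["recvd_epoch"] = (df["recvd_dttm"] - epoch) // pd.Timedelta(seconds=1)

# With plain integers, a sliding window is ordinary arithmetic
SECONDS_PER_DAY = 24 * 60 * 60
range_max = df["recvd_epoch"].max()
range_min = range_max - 7 * SECONDS_PER_DAY

last_week = df[(df["recvd_epoch"] >= range_min) & (df["recvd_epoch"] <= range_max)]
print(last_week)
```

This suits fixed-length windows such as days or weeks. Calendar months and years vary in length, so a window of "one month" still needs calendar-aware arithmetic.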

Try using timedelta instead of DateOffset.
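A minimal sketch of that suggestion, again assuming a made-up timestamp in place of df['recvd_dttm'].max(). Both the standard-library datetime.timedelta and pandas' own pd.Timedelta can be subtracted from a Timestamp:

```python
import datetime

import pandas as pd

# Hypothetical stand-in for df['recvd_dttm'].max()
range_max = pd.Timestamp("2015-07-14 16:52:58")

# Fixed-length durations: one week back, one day back
range_min_week = range_max - datetime.timedelta(weeks=1)
range_min_day = range_max - pd.Timedelta(days=1)

print(range_min_week)
print(range_min_day)
```

Note that datetime.timedelta has no months or years argument, so this covers the week and day cases only. A one-month window still needs a calendar-aware offset such as pd.DateOffset(months=1).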
