Pandas DateTime comparison not producing accurate results



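I have a DataFrame in which I am trying to keep only the rows whose date column falls between StartDate and FinishDate. To do this, I created datetime columns from the string values of these dates with pandas.to_datetime and then filtered on them.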

    result['date'] = pd.to_datetime(result.DateCreated)
    result['StartDate'] = pd.to_datetime(result.StartDate)
    result['FinishDate'] = pd.to_datetime(result.FinishDate)
    result = result[(result.date >= result.StartDate) &
                    (result.date <= result.FinishDate)]

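A portion of the data used is shown below. The StartDate and FinishDate columns on the left hold the values after the code above has run; those on the right are the initial values, included in case to_datetime is where the problem lies.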

,date,StartDate,FinishDate,startboundry,finishboundry,DateCreated,StartDate,FinishDate
0,2009-06-08,2009-05-01,2009-06-30,False,True,2009-06-08 00:00:00,2009-05-01,2009-06-30
1,2009-10-08,2009-08-01,2009-12-31,False,True,2009-10-08 00:00:00,2009-08-01,2009-12-31
2,2010-01-28,2010-01-01,2010-04-30,False,True,2010-01-28 00:00:00,2010-01-01,2010-04-30
3,2010-05-27,2010-05-01,2010-06-30,False,True,2010-05-27 00:00:00,2010-05-01,2010-06-30
4,2010-09-22,2010-08-01,2010-12-31,False,True,2010-09-22 00:00:00,2010-08-01,2010-12-31
5,2011-01-13,2011-01-01,2011-04-30,False,True,2011-01-13 00:00:00,2011-01-01,2011-04-30
6,2011-05-26,2011-05-01,2011-06-30,False,True,2011-05-26 00:00:00,2011-05-01,2011-06-30
7,2009-01-20,2009-01-01,2009-04-30,False,True,2009-01-20 00:00:00,2009-01-01,2009-04-30
8,2009-05-11,2009-05-01,2009-06-30,False,True,2009-05-11 00:00:00,2009-05-01,2009-06-30
9,2009-10-05,2009-08-01,2009-12-31,False,True,2009-10-05 00:00:00,2009-08-01,2009-12-31

For some of these rows the first condition, (result.date >= result.StartDate), evaluates to False even though it is clearly True.

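For example, 2009-06-08 comes after 2009-05-01 both chronologically and lexically, so the result should be True even if this were only a string comparison.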
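One way to narrow this down is to rebuild a few of the rows in isolation and check both the dtypes and the comparison results. The following is only a minimal sketch: the name `sample` is hypothetical, the three rows are copied from the data above, and it assumes the original columns are plain strings in the format shown.

```python
import pandas as pd

# Three rows copied from the sample data above; `sample` is a hypothetical name.
sample = pd.DataFrame({
    "DateCreated": ["2009-06-08 00:00:00", "2009-10-08 00:00:00", "2010-01-28 00:00:00"],
    "StartDate": ["2009-05-01", "2009-08-01", "2010-01-01"],
    "FinishDate": ["2009-06-30", "2009-12-31", "2010-04-30"],
})

# Convert every column from strings to datetime values.
for col in ["DateCreated", "StartDate", "FinishDate"]:
    sample[col] = pd.to_datetime(sample[col])

# All three columns should now report a datetime64 dtype rather than object.
print(sample.dtypes)

# Element-wise, row-by-row comparisons between aligned columns.
startboundry = sample["DateCreated"] >= sample["StartDate"]
finishboundry = sample["DateCreated"] <= sample["FinishDate"]
print(startboundry)
print(finishboundry)
```

If both boundary columns come out True here but not on the real DataFrame, the difference is more likely in the real data (for instance a column that is still object dtype, or a misaligned index) than in the comparison itself.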

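Edited to add some version information: while making sure the versions of Python, pandas and so on were the same, I collected the version details to share in case they help here: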

pandas version 0.16.2, python version 2.7.9, ipython version 3.2.0

To filter the DataFrame you can use between:

    df[df.date1.between(date2, date3)]
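As a minimal sketch of that approach with per-row bounds, the example below assumes `date1`, `date2` and `date3` are all columns of the same DataFrame. The column names follow the snippet above, and the second row is invented for illustration so that one row falls outside its range. Series.between includes both endpoints by default, which matches the >= and <= conditions in the question.

```python
import pandas as pd

# Hypothetical frame: date1 is the value to test, date2/date3 are per-row bounds.
df = pd.DataFrame({
    "date1": pd.to_datetime(["2009-06-08", "2009-07-15", "2010-01-28"]),
    "date2": pd.to_datetime(["2009-05-01", "2009-05-01", "2010-01-01"]),
    "date3": pd.to_datetime(["2009-06-30", "2009-06-30", "2010-04-30"]),
})

# between() accepts Series bounds and compares row by row on the shared index.
mask = df["date1"].between(df["date2"], df["date3"])

# Keep only the rows whose date1 lies within [date2, date3].
filtered = df[mask]
print(filtered)
```

Here the middle row is dropped because 2009-07-15 is later than its upper bound of 2009-06-30.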
