Forcing the hours and minutes of a datetime to zero in pandas



I want to strip the hours/minutes from the following data and keep only 'YYYY-MM-DD 00:00:00'.

Is there a shorter way than the one below (I want a datetime64[ns] result)? And why does np.array() force a time zone…?

In[229]: index = pd.date_range('2015-01-01', freq = 'H', periods=10)
In[230]: df = pd.DataFrame(index = range(len(index)), data=index)
In[231]: df
Out[230]: 
                    0
0 2015-01-01 00:00:00
1 2015-01-01 01:00:00
2 2015-01-01 02:00:00
3 2015-01-01 03:00:00
4 2015-01-01 04:00:00
5 2015-01-01 05:00:00
6 2015-01-01 06:00:00
7 2015-01-01 07:00:00
8 2015-01-01 08:00:00
9 2015-01-01 09:00:00
In[236]: np.array(pd.to_datetime(pd.Index(index).date))
Out[236]: 
array(['2015-01-01T01:00:00.000000000+0100',
       '2015-01-01T01:00:00.000000000+0100',
       '2015-01-01T01:00:00.000000000+0100',
       '2015-01-01T01:00:00.000000000+0100',
       '2015-01-01T01:00:00.000000000+0100',
       '2015-01-01T01:00:00.000000000+0100',
       '2015-01-01T01:00:00.000000000+0100',
       '2015-01-01T01:00:00.000000000+0100',
       '2015-01-01T01:00:00.000000000+0100',
       '2015-01-01T01:00:00.000000000+0100'], dtype='datetime64[ns]')

Access the .date attribute directly:

In [88]:
index = pd.date_range('2015-01-01', freq = 'H', periods=10).date
df = pd.DataFrame(index = range(len(index)), data=index)
df
Out[88]:
            0
0  2015-01-01
1  2015-01-01
2  2015-01-01
3  2015-01-01
4  2015-01-01
5  2015-01-01
6  2015-01-01
7  2015-01-01
8  2015-01-01
9  2015-01-01

Edit

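If the dtype is already datetime64, you can convert the column through the .dt.date accessor (the resulting column holds Python date objects):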

In [97]:
df[0] = df[0].dt.date
df
Out[97]:
            0
0  2015-01-01
1  2015-01-01
2  2015-01-01
3  2015-01-01
4  2015-01-01
5  2015-01-01
6  2015-01-01
7  2015-01-01
8  2015-01-01
9  2015-01-01
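
Since the question asks for a datetime64 result, here is a minimal sketch contrasting .dt.date with Series.dt.normalize(), which sets the time to midnight without leaving the datetime64 dtype. The column name 'Date' and the variable names are chosen for illustration only, and the hourly frequency is passed as a Timedelta as an assumption to keep the snippet independent of frequency-alias spelling.

```python
import pandas as pd

# Hourly timestamps, as in the question.
index = pd.date_range('2015-01-01', periods=10, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({'Date': index})

# .dt.date yields Python date objects, so the result is an object-dtype Series.
as_dates = df['Date'].dt.date

# .dt.normalize() resets every time to 00:00:00 and keeps a datetime64 dtype.
normalized = df['Date'].dt.normalize()

print(as_dates.dtype)
print(normalized.dtype)
```

Which one fits depends on what comes next: date objects are convenient for display, while the normalized column still supports the .dt accessor and datetime arithmetic.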

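If the data is already loaded, you can use the pd.datetools.normalize_date function, which exists for exactly this purpose. (pd.datetools is only available in older pandas releases.)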

In [1]:
index = pd.date_range('2015-01-01', freq = 'H', periods=10)
df = pd.DataFrame(index = range(len(index)), data=index, columns=['Date'])
df
Out[1]:
     Date
0   2015-01-01 00:00:00
1   2015-01-01 01:00:00
2   2015-01-01 02:00:00
3   2015-01-01 03:00:00
4   2015-01-01 04:00:00
5   2015-01-01 05:00:00
6   2015-01-01 06:00:00
7   2015-01-01 07:00:00
8   2015-01-01 08:00:00
9   2015-01-01 09:00:00
In [142]:
df['Date'] = df['Date'].apply(pd.datetools.normalize_date)
df
Out[142]:
    Date
0   2015-01-01
1   2015-01-01
2   2015-01-01
3   2015-01-01
4   2015-01-01
5   2015-01-01
6   2015-01-01
7   2015-01-01
8   2015-01-01
9   2015-01-01
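
On pandas releases where pd.datetools is not available, the same element-wise idea can be sketched with Timestamp.normalize, or done in one vectorized step with Series.dt.floor('D'). This is an illustrative alternative under that assumption, not the method used in the answer above; the variable names are hypothetical.

```python
import pandas as pd

index = pd.date_range('2015-01-01', periods=10, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({'Date': index})

# Element-wise: each value is passed to apply as a Timestamp,
# and Timestamp.normalize() resets its time to midnight.
via_apply = df['Date'].apply(pd.Timestamp.normalize)

# Vectorized: round every timestamp down to the start of its day.
floored = df['Date'].dt.floor('D')

print(floored.head())
```

The vectorized form avoids a Python-level call per row, which tends to matter on large columns.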

Note that you can also call normalize directly on the index: index.normalize()
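
A minimal sketch of that call, followed by a conversion to a NumPy array. The +0100 offset in the question's output looks like a display effect of the NumPy version used there (older releases printed datetime64 values in the local time zone), so the check below compares the stored values instead of relying on the printed form. Variable names are illustrative.

```python
import numpy as np
import pandas as pd

index = pd.date_range('2015-01-01', periods=10, freq=pd.Timedelta(hours=1))

# normalize() returns a DatetimeIndex with every time set to 00:00:00.
midnight = index.normalize()

# Converting to a NumPy array keeps a datetime64 dtype.
values = midnight.to_numpy()

# Compare the stored values against midnight on 2015-01-01,
# independent of how the array happens to be printed.
all_midnight = (values == np.datetime64('2015-01-01')).all()
print(all_midnight)
```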
