I have to parse a date column that contains mixed formats:
0 1972-12-31
1 1980-03-31
2 1980-03-31
3 1973-08-31
4 1985-06-28
...
44215 2017 Nov 17
44216 2009-02-13
44217 2018 Jul 3
44218 2011-03-15
44219 2017 Nov 8
Name: publish_time, Length: 44220, dtype: object
I tried to parse it with pandas:
pd.datetime.strptime(metadata['publish_time'], '%Y-%m-%d')
But it gave me an error:
/usr/local/lib/python3.6/dist-packages/ipykernel_launcher.py:1: FutureWarning: The pandas.datetime class is deprecated and will be removed from pandas in a future version. Import from datetime instead.
"""Entry point for launching an IPython kernel.
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-83-fa9f0e16e2d9> in <module>()
----> 1 pd.datetime.strptime(metadata['publish_time'], '%Y-%m-%d')
TypeError: strptime() argument 1 must be str, not Series
Any idea how to fix this?
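pd.to_datetime accepts a whole Series (unlike strptime, which takes a single string) and is quite good at recognizing different date formats.
Something like this should work: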
In [153]: df = pd.DataFrame({'date': ['1973-08-31','2017 Nov 17', '2009-02-13','2018 Jul 3']})
In [154]: df
Out[154]:
          date
0   1973-08-31
1  2017 Nov 17
2   2009-02-13
3   2018 Jul 3
In [155]: df['date'] = pd.to_datetime(df['date'])
In [156]: df
Out[156]:
        date
0 1973-08-31
1 2017-11-17
2 2009-02-13
3 2018-07-03
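If automatic inference does not behave the same way on your pandas version (as far as I know, pandas 2.0 and later infer one format from the first value and expect the rest of the column to match it), a more explicit alternative is to parse each known format separately and merge the results. The following is only a minimal sketch: the helper name parse_mixed_dates is made up for illustration, and it assumes the column contains just the two formats visible in the question ('1973-08-31' and '2017 Nov 17') with English month abbreviations.

```python
import pandas as pd


def parse_mixed_dates(values: pd.Series) -> pd.Series:
    """Parse strings that look like 'YYYY-MM-DD' or 'YYYY Mon D'.

    Hypothetical helper: it assumes only these two formats occur.
    Values matching neither format come back as NaT.
    """
    # First pass: ISO-style dates; everything else is coerced to NaT.
    iso = pd.to_datetime(values, format="%Y-%m-%d", errors="coerce")
    # Second pass: dates such as "2017 Nov 17"; everything else is NaT.
    # Note: %b depends on the locale's month abbreviations.
    named = pd.to_datetime(values, format="%Y %b %d", errors="coerce")
    # Keep the first-pass results and fill their gaps from the second pass.
    return iso.fillna(named)


df = pd.DataFrame(
    {"date": ["1973-08-31", "2017 Nov 17", "2009-02-13", "2018 Jul 3"]}
)
df["date"] = parse_mixed_dates(df["date"])
print(df)
```

Spelling out the formats avoids guessing, and any row that matches neither format ends up as NaT, which you can check with df['date'].isna(). On pandas 2.0 or later, pd.to_datetime(df['date'], format='mixed') should be another option, at the cost of inferring the format for each element individually.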