I'm using Jupyter and pandas to understand some patterns in my database, and my table has two date columns, 'create_time' and 'active_time'.
If I run
pf['create_time'] = pd.to_datetime(pf['create_time'],format="%d/%m/%Y")
I get the following error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\tools\datetimes.py in _convert_listlike_datetimes(arg, format, name, tz, unit, errors, infer_datetime_format, dayfirst, yearfirst, exact)
455 try:
--> 456 values, tz = conversion.datetime_to_datetime64(arg)
457 dta = DatetimeArray(values, dtype=tz_to_dtype(tz))
pandas\_libs\tslibs\conversion.pyx in pandas._libs.tslibs.conversion.datetime_to_datetime64()
TypeError: Unrecognized value type: <class 'str'>
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
<ipython-input-66-3881c9561812> in <module>
----> 1 pf['create_time'] = pd.to_datetime(pf['create_time'],format="%d/%m/%Y")
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\tools\datetimes.py in to_datetime(arg, errors, dayfirst, yearfirst, utc, format, exact, unit, infer_datetime_format, origin, cache)
799 result = result.tz_localize(tz)
800 elif isinstance(arg, ABCSeries):
--> 801 cache_array = _maybe_cache(arg, format, cache, convert_listlike)
802 if not cache_array.empty:
803 result = arg.map(cache_array)
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\tools\datetimes.py in _maybe_cache(arg, format, cache, convert_listlike)
176 unique_dates = unique(arg)
177 if len(unique_dates) < len(arg):
--> 178 cache_dates = convert_listlike(unique_dates, format)
179 cache_array = Series(cache_dates, index=unique_dates)
180 return cache_array
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\tools\datetimes.py in _convert_listlike_datetimes(arg, format, name, tz, unit, errors, infer_datetime_format, dayfirst, yearfirst, exact)
458 return DatetimeIndex._simple_new(dta, name=name)
459 except (ValueError, TypeError):
--> 460 raise e
461
462 if result is None:
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\tools\datetimes.py in _convert_listlike_datetimes(arg, format, name, tz, unit, errors, infer_datetime_format, dayfirst, yearfirst, exact)
421 if result is None:
422 try:
--> 423 result, timezones = array_strptime(
424 arg, format, exact=exact, errors=errors
425 )
pandas\_libs\tslibs\strptime.pyx in pandas._libs.tslibs.strptime.array_strptime()
ValueError: time data '#VALOR!' does not match format '%d/%m/%Y' (match)
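But if I do the same thing on active_time, there is no error.
My question is: how can I find the offending value in my data (the create_time column) using pandas? I tried to find it in Excel but found nothing. The CSV file has more than 500,000 rows.
Here is a sample of my CSV file: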
owner_id,create_time,active_time
123,05/10/2021,05/10/2021
123,04/10/2021,04/10/2021
234,25/08/2021,25/08/2021
345,17/08/2021,02/10/2021
456,16/10/2020,24/09/2021
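The answer is to select the rows where create_time holds the literal string reported in the error message: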
pf.loc[pf['create_time']=='#VALOR!']
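The filter above only works once the bad string is known from the error message. If there may be other malformed values as well, one possible approach is to parse with errors="coerce" and look at whatever became NaT. This is only a sketch: the small DataFrame below is made-up data standing in for the real CSV, and the names parsed and bad_rows are hypothetical.

```python
import pandas as pd

# Made-up sample standing in for the real CSV; the '#VALOR!' row is
# placed here by hand to reproduce the problem.
pf = pd.DataFrame({
    "owner_id": [123, 234, 345, 456],
    "create_time": ["05/10/2021", "#VALOR!", "17/08/2021", "16/10/2020"],
    "active_time": ["05/10/2021", "25/08/2021", "02/10/2021", "24/09/2021"],
})

# errors="coerce" returns NaT for values that do not match the format
# instead of raising ValueError.
parsed = pd.to_datetime(pf["create_time"], format="%d/%m/%Y", errors="coerce")

# Keep rows that had a value in the original column but failed to parse.
bad_rows = pf.loc[parsed.isna() & pf["create_time"].notna()]
print(bad_rows)

# Count each distinct offending string, useful on a 500k-row file.
print(bad_rows["create_time"].value_counts())
```

After inspecting bad_rows you can decide whether to fix those cells at the source or keep the coerced column and treat the NaT entries as missing.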