How to convert a column of dates with dtype object to datetime and compute the difference from another datetime



I have the following data:

   fip_code  npi                             start_date
0  1         gathering_size_10_0             3/28/2020
1  1         gathering_size_25_to_11         3/19/2020
2  1         non-essential_services_closure  3/28/2020
.  .         .                               .
.  .         .                               .
.  .         .                               .

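I want to convert every value in the start_date column to a datetime object, call it x, and then, given a datetime object y = 2020-03-12 00:00:00, replace each value in the start_date column with x - y.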

Here is the code used to generate the DataFrame:

import pandas as pd

url_npi = 'https://raw.githubusercontent.com/Keystone-Strategy/covid19-interventiondata/master/complete_npis_raw_policies.csv'
# on_bad_lines='skip' (pandas >= 1.3) replaces error_bad_lines=False, which was removed in pandas 2.0
df = pd.read_csv(url_npi, on_bad_lines='skip')
# Keep only the three columns of interest
df = df[['fip_code', 'npi', 'start_date']]

OK, I figured it out:

# Parse the strings as datetimes; values that cannot be parsed become NaT
df['start_date'] = pd.to_datetime(df['start_date'], errors="coerce")
base_str = "3/1/2020"; print("\n\n base date: ", base_str)
end_str = "4/29/2020"; print("\n\n end date: ", end_str)
base = pd.to_datetime(base_str)
end = pd.to_datetime(end_str)
# Whole days between each start_date and the two reference dates (NaN where start_date is NaT)
df['days_in_effect'] = (end - df['start_date']).dt.days
df['days_from_base'] = (df['start_date'] - base).dt.days
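
For the x - y replacement described in the question, here is a minimal self-contained sketch. It uses only the three sample rows from the table above, not the full CSV, and assumes every date is written as month/day/year. The column name days_from_y is made up for illustration and is not part of the original data.

```python
import pandas as pd

# Sample rows copied from the table in the question (not the full dataset)
df = pd.DataFrame({
    "fip_code": [1, 1, 1],
    "npi": [
        "gathering_size_10_0",
        "gathering_size_25_to_11",
        "non-essential_services_closure",
    ],
    "start_date": ["3/28/2020", "3/19/2020", "3/28/2020"],
})

# The reference datetime y from the question
y = pd.Timestamp("2020-03-12 00:00:00")

# x: parse the object (string) column; assumes month/day/year,
# and turns anything that does not match into NaT
x = pd.to_datetime(df["start_date"], format="%m/%d/%Y", errors="coerce")

# Replace start_date with x - y; the column now holds timedeltas
df["start_date"] = x - y

# Hypothetical extra column: the same difference as whole days
df["days_from_y"] = df["start_date"].dt.days

print(df)
```

Subtracting a single Timestamp from a datetime column is applied element by element, so no apply or loop is needed. If the real data mixes several date formats, dropping the format argument and letting pandas infer it may work better, at the cost of some speed.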
