Format dates in a pandas DataFrame column and replace them with the month name



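If a value in a pandas DataFrame column contains a date, I want to replace that date with the month name as a string, but my attempt raises an error.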

Code

import re
from datetime import datetime
import pandas as pd

def date_format(url_link):
    match = re.compile(r'\d{2,4}[/|-]\d{1,2}[/|-]\d{2,4}')
    mo = match.search(url_link)
    mo = mo.group().replace('/','-')
    try:
        mo = datetime.strptime(mo, "%d-%m-%Y").strftime('%Y-%m-%d')
    except:
        pass
    datetime_in = datetime.strptime(mo, "%Y-%m-%d")
    datetime_out = datetime_in.strftime("%B")
    return datetime_out

texts = [["The date is 11/12/1998"],["The date is 11-12-1998"],["/events/performances"],["/events/2019/02/22/promedica-masterworks/brah"],["/events/performances/641/2019-10-13/dudamel"]]
df = pd.DataFrame(texts, columns = ['event'])
df["date_format"] = df["event"].apply(lambda x: x.replace(r'\d{2,4}[/|-]\d{1,2}[/|-]\d{2,4}', date_format(x)))

The expected output is a new pandas DataFrame column with the following values:

The date is December
The date is December
/events/performances
/events/February/promedica-masterworks/brah
/events/performances/641/October/dudamel

Use str.replace with a compiled pattern and a function as the replacement:

# Because my locale is French
# import locale
# locale.setlocale(locale.LC_TIME, 'en_US.UTF-8')
# Add capture group -v-------------------------------v
match = re.compile(r'(\d{2,4}[/|-]\d{1,2}[/|-]\d{2,4})')
# Replace each matched date with its month name
date_to_month = lambda x: pd.to_datetime(x.group(0)).strftime('%B')
df['date_format'] = df['event'].str.replace(match, date_to_month, regex=True)

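Output (pd.to_datetime reads the ambiguous 11/12/1998 month-first by default, so the first two rows show November rather than the December expected in the question):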

>>> df
                                           event                                  date_format
0                         The date is 11/12/1998                         The date is November
1                         The date is 11-12-1998                         The date is November
2                           /events/performances                         /events/performances
3  /events/2019/02/22/promedica-masterworks/brah  /events/February/promedica-masterworks/brah
4    /events/performances/641/2019-10-13/dudamel     /events/performances/641/October/dudamel
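
If the day-first reading from the question is wanted (11/12/1998 as 11 December), one option is to try explicit formats with datetime.strptime instead of letting pd.to_datetime guess. The following is a minimal sketch under that assumption, not the answer's original method; the names DATE_RE and month_name are hypothetical and introduced only for illustration, and the month name still depends on the active LC_TIME locale.

```python
import re
from datetime import datetime

import pandas as pd

# Same date pattern as above (\d for digits, '/' or '-' as separator).
DATE_RE = re.compile(r'\d{2,4}[/-]\d{1,2}[/-]\d{2,4}')


def month_name(m):
    """Hypothetical helper: return the month name for a matched date."""
    text = m.group(0).replace('/', '-')
    # Assumption: dates are either day-first (dd-mm-YYYY) or ISO-like (YYYY-mm-dd).
    for fmt in ('%d-%m-%Y', '%Y-%m-%d'):
        try:
            return datetime.strptime(text, fmt).strftime('%B')
        except ValueError:
            continue
    # Leave the text unchanged if neither format fits.
    return m.group(0)


texts = [
    ["The date is 11/12/1998"],
    ["The date is 11-12-1998"],
    ["/events/performances"],
    ["/events/2019/02/22/promedica-masterworks/brah"],
    ["/events/performances/641/2019-10-13/dudamel"],
]
df = pd.DataFrame(texts, columns=['event'])

# The function is called only for rows where the pattern matches,
# so rows without a date pass through untouched.
df['date_format'] = df['event'].str.replace(DATE_RE, month_name, regex=True)
result = df['date_format'].tolist()
print(result)
```

Listing the formats explicitly makes the day/month order a deliberate choice, at the cost of having to enumerate every layout that can occur in the data.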
