How to clean text data containing dates like "C1995", ~2001, 2014, etc., and get only %yyyy


import pandas as pd
text = [ "y1983 Clinic Hospital", ".2010 - wife; nightmares", " shx of TBI (1975)",
"TSH okay in 2015", "Esophageal cancer, dx: 2013", "8mo in 2009n", "2008 partial thyroidecto"]

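df = pd.Series(text)  # assumed: the post never shows how df is built; the real df is evidently larger

d1 = df.str.extract(r'(?P<day>)(?P<month>)^[\D\s.. - ~: ; ]?(?P<year>\d{4})')
d1
Result:
     month day  year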
466         1981
470         1983
497         2008

I can't get all of the dates out in %yyyy format; only some of the rows match.

I think you are trying to get the year out of each string. If so, you can use a regular expression like the one below.

import re
text = [ "y1983 Clinic Hospital", ".2010 - wife; nightmares", " shx of TBI (1975)",
"TSH okay in 2015", "Esophageal cancer, dx: 2013" ,"8mo in 2009n", "2008 partial thyroidecto"]

regex = "[0-9][0-9][0-9][0-9]"
for t in text:
    # re.search returns the first run of four digits found anywhere in the string
    result = re.search(regex, t)
    print(result.group())
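
The loop above assumes every string contains four consecutive digits; `re.search` returns `None` otherwise, and `.group()` would then raise. A slightly more defensive sketch is shown here. It assumes (this is not stated in the question) that a valid year is a standalone four-digit number between 1900 and 2099, and the helper name `extract_year` is made up for illustration.

```python
import re

# Assumption: a "year" is a standalone four-digit number from 1900 to 2099.
# The lookarounds keep the pattern from matching inside a longer digit run.
YEAR_RE = re.compile(r"(?<!\d)(?:19|20)\d{2}(?!\d)")

def extract_year(s):
    """Return the first plausible year in s as a string, or None if there is none."""
    m = YEAR_RE.search(s)
    return m.group() if m else None

samples = ["y1983 Clinic Hospital", "C1995", "~2001", "8mo in 2009n", "no date here"]
years = [extract_year(s) for s in samples]
print(years)
```

The pattern is not anchored with `^`, so it can find a year anywhere in the string, not just at the start. In pandas, the same pattern wrapped in one capture group can be passed to `Series.str.extract`, which leaves NaN for rows without a match.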
