I am using Python to analyze a dataset that has a column containing a year range (see the example below):
Name | Years Range
---|---
Andy | 1985 - 1987
Bruce | 2011 - 2018
You can use expand=True in the split function:
df[['Year Start','Year End']] = df['Years Range'].str.split('-',expand=True)
Output:
#     Nmae  Years_Range  Year Start  Year End
0     NAdy    1995-1987        1995      1987
1    bruce    1890-8775        1890      8775
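The sample in the question has spaces around the hyphen ("1985 - 1987"), so splitting on a bare '-' would leave whitespace in both parts, and the results stay strings. Here is a minimal sketch, assuming every row contains exactly one hyphen, that splits on the hyphen together with any surrounding whitespace and converts both parts to integers. The sample frame below is rebuilt from the question's table for illustration.

```python
import pandas as pd

# Sample data mirroring the table in the question
df = pd.DataFrame({
    "Name": ["Andy", "Bruce"],
    "Years Range": ["1985 - 1987", "2011 - 2018"],
})

# A multi-character pattern is treated as a regular expression by default,
# so this splits on a hyphen plus any whitespace around it
parts = df["Years Range"].str.split(r"\s*-\s*", expand=True)

# Convert both parts to integers so they can be used in arithmetic
df[["Year Start", "Year End"]] = parts.astype(int)

print(df)
```

With integer columns, something like `df["Year End"] - df["Year Start"]` gives the length of each range directly.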
I think str.extract can do the job.
Here is an example:
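import pandas as pd

df = pd.DataFrame(["1985 - 1987"], columns=["Years Range"])
# \d{4} matches four digits; expand=False returns a Series for a single group
df['Year Start'] = df['Years Range'].str.extract(r'(\d{4})', expand=False)
df['Year End'] = df['Years Range'].str.extract(r'- (\d{4})', expand=False)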
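Both years can also be captured in a single call by using two named groups. This is a sketch, assuming every value looks like "YYYY - YYYY"; the group names `start` and `end` are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame(["1985 - 1987", "2011 - 2018"], columns=["Years Range"])

# Two named groups; the group names become the column names of the result
pattern = r"(?P<start>\d{4})\s*-\s*(?P<end>\d{4})"
years = df["Years Range"].str.extract(pattern)

# Rows that do not match the pattern would yield NaN in both columns
df["Year Start"] = years["start"]
df["Year End"] = years["end"]

print(df)
```

The extracted values are strings; append `.astype(int)` if numeric years are needed.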
df['start'] = ''  # create a blank column named 'start'
df['end'] = ''  # create a blank column named 'end'
# loop over the data frame
for i in range(len(df)):
    # split each value, strip spaces, and store the first element
    df.loc[i, 'start'] = df.loc[i, 'Years Range'].split('-')[0].strip()
    # split each value, strip spaces, and store the second element
    df.loc[i, 'end'] = df.loc[i, 'Years Range'].split('-')[1].strip()
https://colab.research.google.com/drive/1Kemzk-aSUKRfE_eSrsQ7jS6e0NwhbXWp#scrollTo=esXNvRpnSN9I&line=1&uniqifier=1
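The same result as the row-by-row loop can be obtained without a loop by indexing into the split lists with the `.str` accessor. This is a sketch on sample data rebuilt from the question, assuming every value contains a hyphen.

```python
import pandas as pd

df = pd.DataFrame({"Years Range": ["1985 - 1987", "2011 - 2018"]})

# Split each string into a list, then pick elements with the .str indexer
pieces = df["Years Range"].str.split("-")
df["start"] = pieces.str[0].str.strip()  # first element, whitespace removed
df["end"] = pieces.str[1].str.strip()    # second element, whitespace removed

print(df)
```

This avoids assigning cell by cell, which tends to be slow on large frames.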