I have a pandas DataFrame that records publications and their authors.
The DataFrame looks like this:
Title Author
A A Ala, D Pamucar, EB Tirkolaee
B A Heydari, S Niroomand
C F Marisa, SS Syed Ahmad, N Kausar, S Kousar
...
I want to swap the order of each author's first name and last name, so that the last name comes first:
Title Author
A Ala A, Pamucar D, Tirkolaee EB
B Heydari A, Niroomand S
C Marisa F, Syed Ahmad SS, Kausar N, Kousar S
...
I was thinking of using str.split to split the authors, and then join and reversed. But that changes the order of the authors as well. Is there a better solution?
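You can use a regular expression. This assumes the first-name initials are at most two letters long, but you can adjust that as needed (use \w+ instead of \w{,2}):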
df['Author'] = df['Author'].str.replace(r'\b(\w{,2})\b\s+\b([^,]+)\b',
                                        r'\2 \1', regex=True)
Output (shown as a new column "Author2" for clarity):
Title Author Author2
0 A A Ala, D Pamucar, EB Tirkolaee Ala A, Pamucar D, Tirkolaee EB
1 B A Heydari, S Niroomand Heydari A, Niroomand S
2 C F Marisa, SS Syed Ahmad, N Kausar, S Kousar Marisa F, Syed Ahmad SS, Kausar N, Kousar S
The regex:
\b(\w{,2})\b  # match first name (up to 2 letters)
\s+           # one or more spaces
\b([^,]+)\b   # one or more non "," characters
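For anyone who wants to try the regex approach end to end, here is a minimal self-contained sketch built from the sample data in the question. Two things in it are my own assumptions rather than part of the answer above: the result is written to a separate "Author2" column, and the quantifier is tightened to {1,2} so that the initials group cannot match an empty string.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Title": ["A", "B", "C"],
    "Author": [
        "A Ala, D Pamucar, EB Tirkolaee",
        "A Heydari, S Niroomand",
        "F Marisa, SS Syed Ahmad, N Kausar, S Kousar",
    ],
})

# Group 1: the initials (1 or 2 word characters; {1,2} is an assumption,
#          the answer above uses {,2}).
# Group 2: the surname, i.e. everything up to the next comma.
pattern = r"\b(\w{1,2})\b\s+\b([^,]+)\b"

# Swap the two groups; the commas are never matched, so author order is kept.
df["Author2"] = df["Author"].str.replace(pattern, r"\2 \1", regex=True)

print(df["Author2"].tolist())
```

Because [^,]+ runs up to the next comma, a multi-word surname such as "Syed Ahmad" is moved as a single unit.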
df.Author.apply(lambda x: ', '.join([' '.join(i.split()[::-1]) for i in x.split(',')]))
Output:
0 Ala A, Pamucar D, Tirkolaee EB
1 Heydari A, Niroomand S
2 Marisa F, Ahmad Syed SS, Kausar N, Kousar S
Name: Author, dtype: object
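Note that reversing every word turns "SS Syed Ahmad" into "Ahmad Syed SS", while the question asks for "Syed Ahmad SS". A possible variant, sketched below, moves only the first token of each name to the end. The helper name surname_first is hypothetical, and the code assumes the initials are always the first whitespace-separated token of each author.

```python
import pandas as pd


def surname_first(authors: str) -> str:
    """Move each author's leading initials to the end, keeping author order."""
    swapped = []
    for name in authors.split(","):
        parts = name.split()
        if not parts:
            continue  # skip empty entries, e.g. from a trailing comma
        # Assumption: parts[0] holds the initials, the rest is the surname.
        swapped.append(" ".join(parts[1:] + parts[:1]))
    return ", ".join(swapped)


authors = pd.Series(
    [
        "A Ala, D Pamucar, EB Tirkolaee",
        "A Heydari, S Niroomand",
        "F Marisa, SS Syed Ahmad, N Kausar, S Kousar",
    ],
    name="Author",
)

result = authors.apply(surname_first)
print(result.tolist())
```

This keeps the split/join style of the one-liner above and does not depend on how long the initials are, but it will give wrong results for any name whose first token is not the initials.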