Reversing the order of substrings in a pandas column



I have a pandas DataFrame that records publications and their authors.

The DataFrame looks like this:

Title Author
A     A Ala, D Pamucar, EB Tirkolaee
B     A Heydari, S Niroomand
C     F Marisa, SS Syed Ahmad, N Kausar, S Kousar
...

I want to swap the order of each author's initials and surname so that the surname comes first:

Title Author
A     Ala A, Pamucar D, Tirkolaee EB
B     Heydari A, Niroomand S
C     Marisa F, Syed Ahmad SS, Kausar N, Kousar S
...

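I was thinking of using str.split to separate the authors and then join with reversed, but that also changes the order of the authors. Is there a better solution?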

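You can use a regular expression. This assumes the initials are at most two characters long, but you can adjust it as needed (use \w+ instead of \w{,2}):

df['Author'] = df['Author'].str.replace(r'\b(\w{,2})\b\s+\b([^,]+)\b',
                                        r'\2 \1', regex=True)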

Output (written to a new column "Author2" for clarity):

Title                                       Author                                      Author2
0     A               A Ala, D Pamucar, EB Tirkolaee               Ala A, Pamucar D, Tirkolaee EB
1     B                       A Heydari, S Niroomand                       Heydari A, Niroomand S
2     C  F Marisa, SS Syed Ahmad, N Kausar, S Kousar  Marisa F, Syed Ahmad SS, Kausar N, Kousar S

Regex breakdown:

\b(\w{,2})\b   # group 1: the initials (up to 2 word characters)
\s+            # one or more whitespace characters
\b([^,]+)\b    # group 2: the surname (one or more non-"," characters)
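To try the pattern without building a DataFrame, the same substitution can be run on plain strings with the standard-library re module. This is only a sketch: the helper name swap_initials is made up for illustration, and the sample strings are the three rows from the question.

```python
import re

# Same pattern as above: group 1 = initials, group 2 = surname.
PATTERN = re.compile(r'\b(\w{,2})\b\s+\b([^,]+)\b')


def swap_initials(authors: str) -> str:
    """Move each author's initials behind the surname (hypothetical helper)."""
    return PATTERN.sub(r'\2 \1', authors)


samples = [
    "A Ala, D Pamucar, EB Tirkolaee",
    "A Heydari, S Niroomand",
    "F Marisa, SS Syed Ahmad, N Kausar, S Kousar",
]

for s in samples:
    print(swap_initials(s))
```

Because the surname group is "everything up to the next comma", a multi-word surname such as "Syed Ahmad" stays in its original order.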


Alternatively, split on commas and reverse the words within each name:

df['Author'].apply(lambda x: ', '.join(' '.join(i.split()[::-1]) for i in x.split(',')))

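Output (every word in a name is reversed, so the two-word surname "Syed Ahmad" comes out as "Ahmad Syed"):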

0                Ala A, Pamucar D, Tirkolaee EB
1                       Heydari A, Niroomand S
2    Marisa F, Ahmad Syed SS, Kausar N, Kousar S
Name: Author, dtype: object
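If multi-word surnames need to keep their order, one possible variation of the split-based approach is to move only the first token (the initials) to the end instead of reversing every word. The sketch below assumes the initials are always the first whitespace-separated token of each name; the function name initials_last is hypothetical.

```python
def initials_last(authors: str) -> str:
    """Move the first token of each comma-separated name to the end."""
    swapped = []
    for name in authors.split(','):
        parts = name.split()
        if not parts:
            continue  # ignore empty entries, e.g. from a trailing comma
        # Rotate instead of reversing: "SS Syed Ahmad" -> "Syed Ahmad SS"
        swapped.append(' '.join(parts[1:] + parts[:1]))
    return ', '.join(swapped)


print(initials_last("F Marisa, SS Syed Ahmad, N Kausar, S Kousar"))
```

On a DataFrame this could be applied with df['Author'].apply(initials_last).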
