Extract the last occurrence of a string from each row of a pandas Series



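I want to extract the word "User" plus the number that follows it from each row of a pandas Series. Everything else can be discarded. How would you do this? Thanks.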

Here is an example of the Series:

0                         1 - Unassigned, 2 - User 397335
1         1 - Unassigned, 2 - User 525767, 3 - Unassigned
2                                          1 - Unassigned
3                                          1 - Unassigned
4                                          1 - Unassigned
...                       
163678                                     1 - Unassigned
163679    1 - Unassigned, 2 - User 347991, 3 - Unassigned
163680                                     1 - Unassigned
163681                                     1 - Unassigned
163682    1 - Unassigned, 2 - User 663455, 3 - Unassigned

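Use str.findall to collect every match in each row, then take the last one with .str[-1]. Rows without a match yield NaN: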

>>> df['A'].str.findall(r'User \d+').str[-1]
0         User 397335
1         User 525767
2                 NaN
3                 NaN
4                 NaN
163678            NaN
163679    User 347991
163680            NaN
163681            NaN
163682    User 663455
Name: A, dtype: object
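
For anyone who wants to try this without the original data, here is a minimal self-contained sketch. The sample Series `s` and its fourth row (which contains two "User" entries) are made up for illustration and are not from the question. The second approach, a greedy `.*` with str.extract, is an alternative to the findall answer above, not part of it.

```python
import pandas as pd

# Hypothetical sample data modelled on the question; the last row is
# invented to show that the *last* occurrence is the one returned.
s = pd.Series(
    [
        "1 - Unassigned, 2 - User 397335",
        "1 - Unassigned, 2 - User 525767, 3 - Unassigned",
        "1 - Unassigned",
        "1 - User 111, 2 - User 222",
    ],
    name="A",
)

# Approach from the answer: list every match per row, then pick the last.
# Rows with no match produce an empty list, and .str[-1] turns that into NaN.
last_user = s.str.findall(r"User \d+").str[-1]

# Alternative sketch: the greedy ".*" consumes as much as possible, so the
# capture group lands on the last "User <digits>" in the row.
last_user_extract = s.str.extract(r".*(User \d+)", expand=False)

print(last_user)
print(last_user_extract)
```

Both approaches return NaN for rows with no "User" entry. If only the number is needed, moving the parentheses so the group is `(\d+)` should capture the digits alone.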
