Pandas: split a column into several columns using string replacement or regex

I have a column in my DataFrame that, in the best case, looks like this:

Client: Stack Overflow   Order Num: 123456  Account From: 3656645654   Account to: 546546578

I would like to split this column into several columns, such as:

'Client','Order Num', 'Account From','Account to'

But in some cases the column does not contain the client, the order number, or the accounts.

This is how I am doing it:

    for x in df.index:
        if 'Client' in str(df.loc[x, 'Column']):
            # Take everything after 'Client: ', then cut at the next label found
            df.loc[x, 'Client'] = str(df.loc[x, 'Column']).split('Client: ')[1]
            if 'Order Num' in df.loc[x, 'Client']:
                df.loc[x, 'Client'] = str(df.loc[x, 'Client']).split('Order Num: ')[0]
            if 'Account From' in df.loc[x, 'Client']:
                df.loc[x, 'Client'] = str(df.loc[x, 'Client']).split('Account From: ')[0]
            if 'Account to' in df.loc[x, 'Client']:
                df.loc[x, 'Client'] = str(df.loc[x, 'Client']).split('Account to: ')[0]
        else:
            df.loc[x, 'Client'] = ''

And so on for every column I want to create.

This part of the script is almost 40 lines long, and it is slow.

Do you have a more "pandas" solution?

Try this, using the .str string accessor with extract, a regular expression, and named groups:

df['col1'].str.extract('Client: (?P<Client>.*) Order Num: (?P<OrderNum>.*) Account From: (?P<AccountFrom>.*) Account to: (?P<AccountTo>.*)')

Output:

           Client OrderNum AccountFrom  AccountTo
0  Stack Overflow   123456  3656645654  546546578
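
Note that the single pattern above only matches rows where all four fields are present, in that order; any row missing a field comes back as NaN in every column. Since the question says some rows lack a client, order number, or account, here is a minimal sketch that extracts each field independently instead. It assumes a value always ends where the next known label begins (or at the end of the string). The sample rows and the `labels` mapping are made up for illustration, and `col1` is the column name used in the answer above.

```python
import re

import pandas as pd

# Hypothetical sample data: the second and third rows are missing some fields.
df = pd.DataFrame({
    "col1": [
        "Client: Stack Overflow   Order Num: 123456  Account From: 3656645654   Account to: 546546578",
        "Order Num: 987654  Account to: 111222333",
        "Client: Example Corp",
    ]
})

# Labels as they appear in the text, mapped to the output column names.
labels = {
    "Client": "Client",
    "Order Num": "OrderNum",
    "Account From": "AccountFrom",
    "Account to": "AccountTo",
}

# Lookahead: a value stops before the next known label or at end of string,
# ignoring any whitespace in between.
any_label = "|".join(re.escape(label) + ":" for label in labels)
stop = rf"(?=\s*(?:{any_label}|$))"

result = pd.DataFrame(index=df.index)
for label, column in labels.items():
    # One capture group per pattern, so expand=False returns a Series.
    pattern = rf"{re.escape(label)}:\s*(.*?){stop}"
    result[column] = df["col1"].str.extract(pattern, expand=False)

# Missing fields come back as NaN; use empty strings, as in the question.
result = result.fillna("")
print(result)
```

This runs one vectorized extract per field rather than a Python loop over rows, so it should scale better than the row-by-row version in the question. If you want the new columns on the original frame, `df.join(result)` attaches them.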
