How to apply a function to every other row of a Series in Pandas



I have a table with a single column. I want to apply a function I wrote to every other row of that Series. However, when I do this, I get an error!

The table looks like this:        And I want to get this:
names                             names
bank account                      bank account|bank|account
1256864                           1256864
bank share                        bank share|bank|share
42,566                            42,566          
bank currency                     bank currency|bank|currency
Dollar                            Dollar
batch number                      batch number|batch|number
001444                            001444
...                                ...

Here is the code I wrote:

import pandas as pd
import re

df = pd.read_table('list_a.tsv')

def sep_rows(text):
    separated = '|'.join(re.split(r'\s+', text))
    return text + '|' + separated

# this applies the function to ALL rows!
print(df['names'].apply(sep_rows))
# I tried to choose every other row
a = df.iloc[::2].apply(sep_rows)
print(a)  # But I get an error!

This is what I get:

TypeError: expected string or bytes-like object

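Your approach (using re and apply) is overly complicated and slow. The expression below uses native Pandas vectorization and is much more efficient (it runs about 4 times faster).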

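evens = df['names'].iloc[::2]
# Write back through .loc so the original DataFrame is updated reliably
df.loc[evens.index, 'names'] = evens + '|' + evens.str.replace(r'\s+', '|', regex=True)
#                       names
#0  bank account|bank|account
#1                    1256864
#2      bank share|bank|share
#3                     42,566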
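A variation on the same vectorized idea, in case assigning back into the frame is not wanted: a minimal sketch that builds the expanded text for the whole column and keeps it only at even positions using Series.where. The sample frame is typed in by hand from the question (the real data would come from list_a.tsv), and the names is_even, expanded and result are made up for this illustration.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; stands in for list_a.tsv.
df = pd.DataFrame({"names": ["bank account", "1256864", "bank share", "42,566"]})

# Boolean mask that is True at positions 0, 2, 4, ...
is_even = np.arange(len(df)) % 2 == 0

# Expanded text for every row: original value, then the value with
# runs of whitespace replaced by "|".
expanded = df["names"] + "|" + df["names"].str.replace(r"\s+", "|", regex=True)

# Series.where keeps the original where the condition is True (odd positions)
# and takes the value from `expanded` elsewhere (even positions).
result = df["names"].where(~is_even, expanded)
print(result.tolist())
```

This leaves df untouched and returns a new Series, which may be easier to reason about than in-place assignment.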

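Treat text as a Series (DataFrame.apply passes each column to the function as a whole Series, not one string at a time), and then your function should work: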

def sep_rows(text):
    # text is a whole column (a Series) here, so use the .str accessor
    separated = text.str.replace(r"\s+", "|", regex=True)
    return text + "|" + separated

df.iloc[::2].apply(sep_rows)

    names
0   bank account|bank|account
2   bank share|bank|share
4   bank currency|bank|currency
6   batch number|batch|number
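To see where the TypeError in the question comes from, it may help to check what each kind of apply actually hands to the function. This is only a small sketch with hand-typed sample data; frame_types and series_types are illustrative names, not part of the original code.

```python
import pandas as pd

# Sample data copied from the question; stands in for list_a.tsv.
df = pd.DataFrame({"names": ["bank account", "1256864", "bank share", "42,566"]})

# DataFrame.apply (default axis=0) calls the function once per column
# and passes that column as a Series.
frame_types = df.iloc[::2].apply(lambda col: type(col).__name__)

# Series.apply calls the function once per element, here a plain str.
series_types = df["names"].iloc[::2].apply(lambda value: type(value).__name__)

print(frame_types["names"])
print(series_types.tolist())
```

Since re.split expects a string, it fails when it receives the Series that DataFrame.apply passes in, which matches the "expected string or bytes-like object" message.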

Another way to get the result is a list comprehension:

import re
df['new_column'] = ["|".join((text, re.sub(r"\s+", "|", text)))
                    if num % 2 == 0 else text
                    for num, text in enumerate(df.names)]
df

    names           new_column
0   bank account    bank account|bank|account
1   1256864         1256864
2   bank share      bank share|bank|share
3   42,566          42,566
4   bank currency   bank currency|bank|currency
5   Dollar          Dollar
6   batch number    batch number|batch|number
7   001444          001444
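For completeness, the element-wise function from the question can also be kept as it is, provided it is applied to the 'names' column (a Series) rather than to the sliced DataFrame. A rough sketch, again with hand-typed sample data in place of list_a.tsv; even_labels is a name introduced here for illustration.

```python
import re
import pandas as pd

# Sample data copied from the question; stands in for list_a.tsv.
df = pd.DataFrame({"names": ["bank account", "1256864", "bank share", "42,566"]})

def sep_rows(text):
    # text is a single string because Series.apply works element by element.
    separated = "|".join(re.split(r"\s+", text))
    return text + "|" + separated

# Index labels of every other row, starting with the first.
even_labels = df.index[::2]

# Apply the scalar function to just those rows and write the result back.
df.loc[even_labels, "names"] = df.loc[even_labels, "names"].apply(sep_rows)
print(df)
```

This is likely slower than the vectorized .str.replace version on large data, but it stays closest to the original code.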
