如果有另一列中的项目匹配,则如何迭代PANDAS数据框架并替换字符串



我的 df数据有两个列

thePerson  theText
"the abc" "this is about the abc"
"xyz" "this is about tyu"
"wxy" "this is about abc"
"wxy" "this is about WXY"

我希望结果df

thePerson  theText
"the abc" "this is about <b>the abc</b>"
"xyz" "this is about tyu"
"wxy" "this is about abc"
"wxy" "this is about <b>WXY</b>"

请注意,如果同一行中的theText包含theperson,则在thetext中变得大胆。

我未能成功尝试的解决方案之一是:

df['theText']=df['theText'].replace(df.thePerson,'<b>'+df.thePerson+'</b>', regex=True)

我想知道我是否可以使用lapplymap

进行此操作

我的Python环境设置为2.7

使用re.subzip

tt = df.theText.values.tolist()
tp = df.thePerson.str.strip('"').values.tolist()
df.assign(
    theText=[re.sub(r'({})'.format(p), r'<b>1</b>', t, flags=re.I)
             for t, p in zip(tt, tp)]
)
  thePerson                       theText
0   the abc  this is about <b>the abc</b>
1       xyz             this is about tyu
2       wxy             this is about abc
3       wxy      this is about <b>WXY</b>

复制/粘贴
您应该能够运行此确切的代码并获得所需的结果

from io import StringIO
import pandas as pd
txt = '''thePerson  theText
"the abc"  "this is about the abc"
"xyz"  "this is about tyu"
"wxy"  "this is about abc"
"wxy"  "this is about WXY"'''
df = pd.read_csv(StringIO(txt), sep='s{2,}', engine='python')
tt = df.theText.values.tolist()
tp = df.thePerson.str.strip('"').values.tolist()
df.assign(
    theText=[re.sub(r'({})'.format(p), r'<b>1</b>', t, flags=re.I)
             for t, p in zip(tt, tp)]
)

您应该看到此

   thePerson                         theText
0  "the abc"  "this is about <b>the abc</b>"
1      "xyz"             "this is about tyu"
2      "wxy"             "this is about abc"
3      "wxy"      "this is about <b>WXY</b>"

您可以使用apply

df['theText'] = df.apply(lambda x: re.sub(r'('+x.thePerson+')',
                                          r'<b>1</b>', 
                                          x.theText, 
                                          flags=re.IGNORECASE), axis=1)
print (df)
  thePerson                       theText
0   the abc  this is about <b>the abc</b>
1       xyz             this is about tyu
2       wxy             this is about abc
3       wxy      this is about <b>WXY</b>

最新更新