       Model                               Name
0  ABC-12(s)     Some text ABC-12(s) wrong text
1     ABC-45     Other text ABC-45 garbage text
2     XYZ-LL  Another text XYZ-LL unneeded text
I am trying to remove the wrong text that follows the "Model" value in the "Name" column.
import pandas as pd
df = pd.DataFrame([['ABC-12(s)', 'Some text ABC-12(s) wrong text'], ['ABC-45', 'Other text ABC-45 garbage text'], ['XYZ-LL', 'Another text XYZ-LL unneeded text']], columns = ['Model', 'Name'])
Another solution, using re:
import re
# Split right after the (escaped) Model value and keep the part before the split.
df["Name"] = df.apply(
    lambda x: re.split(r"(?<=" + re.escape(x["Model"]) + r")\s*", x["Name"])[0],
    axis=1,
)
print(df)
Prints:
       Model                 Name
0  ABC-12(s)  Some text ABC-12(s)
1     ABC-45    Other text ABC-45
2     XYZ-LL  Another text XYZ-LL
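To see what the pattern does on its own, here is a minimal sketch on plain strings, without pandas. The helper name `trim_after_model` is made up for illustration, and the sketch assumes the whitespace token in the pattern is `\s*`. The lookbehind `(?<=...)` only matches at a position directly preceded by the model string, and `re.escape` makes characters such as `(` and `)` match literally.

```python
import re


def trim_after_model(name: str, model: str) -> str:
    """Return `name` cut off right after the first occurrence of `model`.

    Hypothetical helper, not part of pandas or re.
    """
    # Lookbehind: match only where the escaped model text ends,
    # then consume any whitespace that follows it.
    pattern = r"(?<=" + re.escape(model) + r")\s*"
    # maxsplit=1 stops after the first occurrence; [0] is the part before the split.
    return re.split(pattern, name, maxsplit=1)[0]


print(trim_after_model("Some text ABC-12(s) wrong text", "ABC-12(s)"))
print(trim_after_model("No model here", "ABC-45"))
```

If the model string does not occur in the name, `re.split` finds nothing to split on and the name comes back unchanged.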
You can use partition in a list comprehension, then join the first two parts.
df['name_mod'] = [''.join(name.partition(model)[:-1])
                  for name, model in zip(df['Name'], df['Model'])]
       Model                               Name             name_mod
0  ABC-12(s)     Some text ABC-12(s) wrong text  Some text ABC-12(s)
1     ABC-45     Other text ABC-45 garbage text    Other text ABC-45
2     XYZ-LL  Another text XYZ-LL unneeded text  Another text XYZ-LL
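For reference, a small sketch of what `str.partition` returns for one row, again without pandas. The function name `keep_through_model` is hypothetical. `partition` splits at the first occurrence of the separator and always returns a 3-tuple `(head, separator, tail)`, so dropping the last element keeps everything up to and including the model.

```python
def keep_through_model(name: str, model: str) -> str:
    """Keep the text before `model` plus `model` itself (illustrative helper)."""
    # partition returns (text before, the separator, text after).
    head, sep, _tail = name.partition(model)
    # Same result as ''.join(name.partition(model)[:-1]).
    return head + sep


print("Some text ABC-12(s) wrong text".partition("ABC-12(s)"))
print(keep_through_model("Some text ABC-12(s) wrong text", "ABC-12(s)"))
print(keep_through_model("No model here", "ABC-45"))
```

When the model is missing, `partition` returns the whole string followed by two empty strings, so the name is left as it is. No escaping is needed here because `partition` does a literal match rather than a regex match.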