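I need a function that removes links from the oldText column (more than 1,000 rows) of my pandas DataFrame. I wrote one using a regular expression, but it doesn't work. Here is my code: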
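import re

def remove_links(text):
    # Remove anything that starts with "http" up to the next whitespace
    text = re.sub(r'http\S+', '', text)
    # Strip the characters [, l, i, n, k, ] from both ends of the string
    text = text.strip('[link]')
    return text

df['newText'] = df['oldText'].apply(remove_links)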
I don't get any error; the code just does nothing.
Your code works for me.
CSV:
oldText
https://abc.xy/oldText asd
https://abc.xy/oldTe asd
https://abc.xy/oldT
https://abc.xy/old
https://abc.xy/ol
Code:
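import pandas as pd
import re

def remove_links(text):
    # Remove anything that starts with "http" up to the next whitespace
    text = re.sub(r'http\S+', '', text)
    # Strip the characters [, l, i, n, k, ] from both ends of the string
    text = text.strip('[link]')
    return text

df = pd.read_csv('test2.csv')
df['newText'] = df['oldText'].apply(remove_links)
print(df)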
Result:
                      oldText  newText
0  https://abc.xy/oldText asd      asd
1    https://abc.xy/oldTe asd      asd
2         https://abc.xy/oldT
3          https://abc.xy/old
4           https://abc.xy/ol
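
If the function still seems to do nothing on your data, two details may be worth checking. This is only a sketch under assumptions: `remove_links_checked` and `URL_PATTERN` are hypothetical names for illustration, and I am assuming the intent of `strip('[link]')` was to remove the literal text `[link]`. First, the pattern needs the backslash in `\S`; without it, `httpS+` only matches a literal `httpS`. Second, `str.strip('[link]')` removes any of the characters `[`, `l`, `i`, `n`, `k`, `]` from both ends, not the substring itself.

```python
import re

# Hypothetical names for illustration; not part of the original code.
# "http" followed by one or more non-whitespace characters.
URL_PATTERN = re.compile(r'http\S+')

def remove_links_checked(text):
    # Remove every URL-like token.
    without_links = URL_PATTERN.sub('', text)
    # Assumption: the goal is to drop the literal marker "[link]".
    # str.replace removes the substring; str.strip('[link]') would
    # instead strip individual characters from both ends.
    without_marker = without_links.replace('[link]', '')
    # Trim leftover whitespace around the remaining text.
    return without_marker.strip()

samples = ['https://abc.xy/oldText asd', 'https://abc.xy/ol', 'no link here']
cleaned = [remove_links_checked(s) for s in samples]
print(cleaned)

# Without the backslash the pattern matches only a literal "httpS",
# so a lowercase "https://..." URL is left untouched.
unchanged = re.sub(r'httpS+', '', 'https://abc.xy/old')
print(unchanged)

# strip() treats its argument as a set of characters.
print('no link here'.strip('[link]'))
```

In a DataFrame, the same function could be applied with `df['oldText'].apply(remove_links_checked)`, or the pattern could be passed to `df['oldText'].str.replace(r'http\S+', '', regex=True)`.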