The function below runs without any errors, but no file appears in the directory where this .py file is saved. Does anyone know why this happens?
import itertools

TAG_RE = re.compile(r'<[^>]+>')

def remove_tags(text):
    return TAG_RE.sub('', text)

def get_rid_of_html():
    print("removing html tags")
    with open('messages.htm', "r", encoding="utf8") as f:
        lines = f.read()
    print(len(lines))
    lines_removed_html = remove_tags(lines)
    with open('messages_txt_remove_list.txt', 'w', encoding="utf8") as f:
        f.write(lines_removed_html)

get_rid_of_html()
Your code is missing the regular expression module. The call re.compile(r'<[^>]+>') needs the re module, so import it first:
import re
TAG_RE = re.compile(r'<[^>]+>')
See also: https://docs.python.org/3/library/re.html
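For reference, here is a minimal sketch of the whole script with the import in place. The `src` and `dst` parameters are an assumption added for illustration (the question hard-codes both file names), and the final print is only a debugging aid: relative paths are resolved against the current working directory, which is not necessarily the directory the .py file lives in, so printing the absolute path shows where the output actually landed.

```python
import re
from pathlib import Path

# Matches anything that looks like an HTML tag: '<', then non-'>' chars, then '>'.
TAG_RE = re.compile(r'<[^>]+>')


def remove_tags(text):
    """Return text with every <...> tag stripped out."""
    return TAG_RE.sub('', text)


def get_rid_of_html(src='messages.htm', dst='messages_txt_remove_list.txt'):
    # src/dst are hypothetical parameters; the defaults are the names from the question.
    src_path = Path(src)
    dst_path = Path(dst)

    text = src_path.read_text(encoding='utf8')
    dst_path.write_text(remove_tags(text), encoding='utf8')

    # Relative paths resolve against the current working directory,
    # so show the absolute location of the file that was written.
    print("wrote", dst_path.resolve())
    return dst_path
```

Calling get_rid_of_html() with no arguments reproduces the behaviour of the original script. If the printed path is not the folder you expected, the script was probably started from a different working directory.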