Extract all strings between delimiters

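I wrote a function that extracts the string between two delimiters. In some files, however, these delimiters occur several times, and I want to extract every occurrence. My current function only extracts the first one it encounters and then returns.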

How can I fix this?

    def extraction_error_CF(file):
        with open(file, 'r') as f:
            text = f.read()
        start = text.find('Error validating')  # 1st delimiter
        end = text.find('</SPAN><BR>', start)  # 2nd delimiter
        if start != -1 and end != -1:          # If these two delimiters are present...
            return text[start:end]
        else:
            return ""

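For HTML/XML you should really use a robust module such as BeautifulSoup. However, if you only want the content between two delimiters, you can keep the same function and append each result to a list (for example), which you can then print.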

    def extraction_error_CF(file):
        with open(file, 'r') as f:
            text = f.read()
        # Patterns
        first = "Error validating"
        second = "</SPAN><BR>"  # same case as in the question; str.find is case-sensitive
        # For all the matches
        results = []
        # Iterate over the whole file
        start = text.find(first)
        while start != -1:
            # Look for the closing delimiter only after the opening one
            end = text.find(second, start + len(first))
            if end == -1:
                break
            # Add everything between the patterns
            # but not including the patterns
            results.append(text[start + len(first):end])
            # Continue searching after the text that already passed
            start = text.find(first, end + len(second))
        # Return the content of the list as a string
        if results:
            return "".join(results)
        return None

    print(extraction_error_CF("test"))
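
The same scan can also be separated from the file handling, which makes it easier to try on a plain string. The following is only a minimal sketch: the helper name `iter_between` and the sample text are made up for illustration and are not part of the original code.

```python
def iter_between(text, first, second):
    """Yield every substring of text found between first and second."""
    pos = 0
    while True:
        start = text.find(first, pos)
        if start == -1:
            return  # no further opening delimiter
        start += len(first)
        end = text.find(second, start)
        if end == -1:
            return  # opening delimiter without a closing one
        yield text[start:end]
        pos = end + len(second)  # resume the search after this match


# Hypothetical sample input, shaped like the delimiters in the question
sample = (
    "<SPAN>Error validating a.xml: bad tag</SPAN><BR>"
    "<SPAN>OK</SPAN><BR>"
    "<SPAN>Error validating b.xml: missing id</SPAN><BR>"
)
matches = list(iter_between(sample, "Error validating", "</SPAN><BR>"))
print(matches)
```

Because it yields the matches one by one, the caller can decide whether to join them, print them line by line, or keep them as a list.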

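    import re

    def extraction_error_CF(file):  # Get error from CF upload
        with open(file, 'r') as f:
            text = f.read()
        # Non-greedy (.*?) stops at the nearest closing delimiter,
        # so several errors on one line are returned separately
        matches = re.findall(r'Error validating(.*?)</SPAN><BR>', text)
        # findall returns a list (empty when nothing matches), never -1
        if matches:
            return matches
        else:
            return ""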

This is what I ended up doing, and it works well. Thanks, everyone!
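
If the delimiters may contain regex metacharacters, differ in case, or sit on different lines, a slightly more defensive variant of the regex approach could look like the sketch below. The helper name `find_between` and the sample string are assumptions made for illustration; whether the real files need `re.DOTALL` or `re.IGNORECASE` depends on their content.

```python
import re


def find_between(text, first, second):
    """Return all substrings between first and second, using a regex."""
    pattern = re.compile(
        # re.escape treats the delimiters as literal text;
        # (.*?) is non-greedy so each match ends at the nearest closing delimiter
        re.escape(first) + r"(.*?)" + re.escape(second),
        # DOTALL lets a match span line breaks; IGNORECASE accepts </span><br> too
        re.DOTALL | re.IGNORECASE,
    )
    return pattern.findall(text)


# Hypothetical sample: the second error spans two lines and uses lowercase tags
sample = "Error validating a.xml</SPAN><BR>Error validating\nb.xml</span><br>"
found = find_between(sample, "Error validating", "</SPAN><BR>")
print(found)
```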
