Saving regex matches in a list



I have an accounts file that looks like this:


<A0001><$241><div class="parent"><img class="img" title="" src="/static/assets/images/thumb/1535.png"width="64" height="64"></div><1231>
<div class="parent"><img class="img" title="" src="/static/assets/images/thumb/1510.png"width="64" height="64"></div>
<div class="parent"><img class="img" title="" src="/static/assets/images/thumb/1403.png"width="64" height="64"></div>
<div class="parent"><img class="img" title="" src="/static/assets/images/thumb/1388.png"width="64" height="64"></div>
<div class="parent"><img class="img" title="" src="/static/assets/images/thumb/1323.png"width="64" height="64"></div>
<div class="parent"><img class="img" title="" src="/static/assets/images/thumb/1322.png"width="64" height="64"></div>
<div class="parent"><img class="img" title="" src="/static/assets/images/thumb/1172.png"width="64" height="64"></div>
<div class="parent"><img class="img" title="" src="/static/assets/images/thumb/1069.png"width="64" height="64"></div>
<div class="parent"><img class="img" title="" src="/static/assets/images/thumb/0966.png"width="64" height="64"></div>
<div class="parent"><img class="img" title="" src="/static/assets/images/thumb/0796.png"width="64" height="64"></div>
<div class="parent"><img class="img" title="" src="/static/assets/images/thumb/1430.png"width="64" height="64"></div>

<A0002><$111><div class="parent"><img class="img" title="" src="/static/assets/images/thumb/1535.png"width="64" height="64"></div><3112>
<div class="parent"><img class="img" title="" src="/static/assets/images/thumb/1510.png"width="64" height="64"></div>
<div class="parent"><img class="img" title="" src="/static/assets/images/thumb/1403.png"width="64" height="64"></div>
<div class="parent"><img class="img" title="" src="/static/assets/images/thumb/1388.png"width="64" height="64"></div>
<div class="parent"><img class="img" title="" src="/static/assets/images/thumb/1323.png"width="64" height="64"></div>
<div class="parent"><img class="img" title="" src="/static/assets/images/thumb/1322.png"width="64" height="64"></div>
<div class="parent"><img class="img" title="" src="/static/assets/images/thumb/1172.png"width="64" height="64"></div>
<div class="parent"><img class="img" title="" src="/static/assets/images/thumb/1069.png"width="64" height="64"></div>
<div class="parent"><img class="img" title="" src="/static/assets/images/thumb/0966.png"width="64" height="64"></div>
<div class="parent"><img class="img" title="" src="/static/assets/images/thumb/0796.png"width="64" height="64"></div>
<div class="parent"><img class="img" title="" src="/static/assets/images/thumb/1430.png"width="64" height="64"></div>
...

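As you can see, the number of images is not fixed; it differs from account to account. I already have a script that uses a regular expression to find the image names, but how can I find all the images in the file and save them in a list, where each index holds all the image names for one particular account? Something like this: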


List = [
'src="resources/images/thumb/1129.png"
src="resources/images/thumb/1129.png"
src="resources/images/thumb/1129.png"
src="resources/images/thumb/1129.png"
src="resources/images/thumb/1129.png"
src="resources/images/thumb/1129.png"'
,
'src="resources/images/thumb/1129.png"
src="resources/images/thumb/1129.png"
src="resources/images/thumb/1129.png"
src="resources/images/thumb/1129.png"'

,
'src="resources/images/thumb/1129.png"
src="resources/images/thumb/1129.png"
src="resources/images/thumb/1129.png"
src="resources/images/thumb/1129.png"
src="resources/images/thumb/1129.png"
src="resources/images/thumb/1129.png"
src="resources/images/thumb/1129.png"'
] # And so on

EDITED: I have also updated the file above to show what it really looks like, so the split() method may not work at all. Sorry for the confusion.

If I understand correctly, you want to "group" the images by section. For example:

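import re

r1 = re.compile(r"A\d+")        # section header such as A0001
r2 = re.compile(r'src="(.*)"')  # captures the path inside src="..."

out, key = {}, None
with open("your_file.txt", "r") as f_in:
    for line in f_in:
        # match() only matches at the start of the line
        if r1.match(line):
            # a new section starts: remember it as the current key
            key = line.strip()
        elif (m := r2.match(line)):  # walrus operator needs Python 3.8+
            # an image line: store its path under the current section
            out.setdefault(key, []).append(m.group(1))

print(out)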

Prints:

{
    "A0001": [
        "resources/images/thumb/1634.png",
        "resources/images/thumb/1234.png",
        "resources/images/thumb/1145.png",
        "resources/images/thumb/1243.png",
    ],
    "A0002": [
        "resources/images/thumb/1129.png",
        "resources/images/thumb/1235.png",
    ],
}
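The snippet above expects each section name and each src="..." to start its own line. In the edited file from the question, the account tag (e.g. <A0001>) and the src attribute sit in the middle of a line, so match() would find nothing there. Below is a minimal sketch of the same grouping idea for that layout. It assumes that an account tag always looks like <A followed by digits and >, and that every src="..." up to the next account tag belongs to that account. The names ACCOUNT_RE, SRC_RE, group_images and sample are made up for illustration.

```python
import re

# Assumption: an account tag looks like <A0001>; tags such as <$241> or <1231>
# are not account tags and are ignored.
ACCOUNT_RE = re.compile(r"<(A\d+)>")
# Non-greedy alternative to .* : stop at the first closing quote.
SRC_RE = re.compile(r'src="([^"]*)"')


def group_images(lines):
    """Map each account id to the image paths that follow its tag."""
    out = {}
    key = None
    for line in lines:
        # search() scans the whole line, unlike match()
        m = ACCOUNT_RE.search(line)
        if m:
            key = m.group(1)
            out.setdefault(key, [])
        if key is not None:
            # collect every src="..." on this line for the current account
            out[key].extend(SRC_RE.findall(line))
    return out


# A shortened sample in the layout shown in the question
sample = [
    '<A0001><$241><div class="parent"><img class="img" title="" src="/static/assets/images/thumb/1535.png"width="64" height="64"></div><1231>',
    '<div class="parent"><img class="img" title="" src="/static/assets/images/thumb/1510.png"width="64" height="64"></div>',
    '',
    '<A0002><$111><div class="parent"><img class="img" title="" src="/static/assets/images/thumb/1535.png"width="64" height="64"></div><3112>',
    '<div class="parent"><img class="img" title="" src="/static/assets/images/thumb/1510.png"width="64" height="64"></div>',
]

groups = group_images(sample)
print(groups)
```

To run it on a real file, the open file object can be passed directly, since it is iterated line by line: group_images(open("your_file.txt")).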

EDIT: To get only the images:

import re

r = re.compile(r'src="(.*)"')  # captures the path inside src="..."

out = []
with open("your_file.txt", "r") as f_in:
    for line in f_in:
        # match() only matches at the start of the line
        if (m := r.match(line)):
            out.append(m.group(1))

print(out)

Prints:

[
    "resources/images/thumb/1634.png",
    "resources/images/thumb/1234.png",
    "resources/images/thumb/1145.png",
    "resources/images/thumb/1243.png",
    "resources/images/thumb/1129.png",
    "resources/images/thumb/1235.png",
]
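If the goal is the exact shape from the question (a list with one string per account, each string holding that account's src="..." entries separated by newlines), one possible sketch is to split the whole text on the account tags and run findall() on each piece. This again assumes account tags of the form <A0001>; ACCOUNT_TAG_RE, SRC_ATTR_RE, images_per_account and text are illustrative names, not part of the original script.

```python
import re

# Assumption: each account block starts with a tag such as <A0001>.
ACCOUNT_TAG_RE = re.compile(r"<A\d+>")
# Matches the whole attribute, quotes included, e.g. src="/a/b.png"
SRC_ATTR_RE = re.compile(r'src="[^"]*"')


def images_per_account(text):
    """Return one newline-joined string of src attributes per account."""
    # split() cuts the text at every account tag; the first piece is whatever
    # precedes the first tag, so it is dropped.
    chunks = ACCOUNT_TAG_RE.split(text)[1:]
    return ["\n".join(SRC_ATTR_RE.findall(chunk)) for chunk in chunks]


# A shortened sample in the layout shown in the question
text = "\n".join([
    '<A0001><$241><div class="parent"><img class="img" title="" src="/static/assets/images/thumb/1535.png"width="64" height="64"></div><1231>',
    '<div class="parent"><img class="img" title="" src="/static/assets/images/thumb/1510.png"width="64" height="64"></div>',
    '',
    '<A0002><$111><div class="parent"><img class="img" title="" src="/static/assets/images/thumb/1535.png"width="64" height="64"></div><3112>',
])

result = images_per_account(text)
for entry in result:
    print(entry)
    print("---")
```

Keeping each account's images as a list (as in the dictionary version) is usually easier to work with afterwards; the joined strings are only there to mirror the format asked for.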
