How do I insert each item of a list into a string?



I read PNG file names from a file, use a regex to pick out the 4-digit PNG file names, strip the punctuation, and save the result to another file.

What confuses me is the next step: I want to put each individual value from the list into a string such as:

<div class="parent"><img class="img" title="" src="images/char/{HERE}.png" ></div>

and then save the result as:

<div class="parent"><img class="img" title="" src="images/char/1432.png" ></div>
<div class="parent"><img class="img" title="" src="images/char/1250.png" ></div>
<div class="parent"><img class="img" title="" src="images/char/1324.png" ></div>

Code:

import re
import pyperclip  # imported in the original; not used in this snippet


def remove_punc(string):
    """Return string with every character listed in punc removed."""
    punc = '''!()-[]{};:'", <>./?@#$%^&*_~'''
    for ele in string:
        if ele in punc:
            string = string.replace(ele, "")
    return string


# read the whole source file
with open(r'C:\My Web Sites\image_data(1).txt', 'r') as text_file:
    s = text_file.read()

# four digits followed by a literal dot, e.g. "1535."
string_pattern = r"\d{4}\."
regex_pattern = re.compile(string_pattern)

# find all the matches in string one
result = regex_pattern.findall(s)
result = [remove_punc(i) for i in result]

with open(r'C:\My Web Sites\1.txt', 'w') as fp:
    for item in result:
        # write each item on a new line
        fp.write("%s\n" % item)

Edit: here is a sample of the text file:
<div class="cell-imgs"><div class="character-thumbnail"><img src="resources/images/bgs/5.png" class="character-thumbnail-background"><img class="character-thumbnail-image" src="resources/images/thumb/1535.png" onerror="this.src='resources/images/thumb/noimage.png';"><img rel="popover" tabindex="0" src="resources/images/frames/5.png" class="character-thumbnail-frame" data-html="true" data-trigger="focus" data-toggle="popover" data-placement="bottom" data-content="Rarity: 5★<br/>Level: 1/60<br/>Level: 0/4<br/>Level: 1/5<br/>: 0%" title="" data-original-title="<font color='red'><br/>(version)</font>"><img src="resources/images/elements/3.png" class="character-thumbnail-element"></div><div class="character-thumbnail"><img src="resources/images/bgs/5.png" class="character-thumbnail-background"><img class="character-thumbnail-image" src="resources/images/thumb/1510.png" onerror="this.src='resources/images/thumb/noimage.png';"><img rel="popover" tabindex="1" src="resources/images/frames/5.png" class="character-thumbnail-frame" data-html="true" data-trigger="focus" data-toggle="popover" data-placement="bottom" data-content="Rarity: 5★<br/>Level: 1/80<br/>Level: 4/4<br/>Level: 1/5<br/>: 0%" title="" data-original-title="<font color='#F96700'><br/>(version)</font>"><img src="resources/images/elements/5.png" class="character-thumbnail-element"></div><div class="character-thumbnail"><img src="resources/images/bgs/5.png" class="character-thumbnail-background"><img class="character-thumbnail-image" src="resources/images/thumb/1403.png" onerror="this.src='resources/images/thumb/noimage.png';"><img rel="popover" tabindex="2" src="resources/images/frames/5.png" class="character-thumbnail-frame" data-html="true" data-trigger="focus" data-toggle="popover" data-placement="bottom" data-content="Rarity: 5★<br/>Level: 1/80<br/>Level: 4/4<br/>Level: 1/5<br/>: 0%" title="" data-original-title="<font color='#071BA0'><br/>(version)</font>"><img src="resources/images/elements/4.png" 
class="character-thumbnail-element"></div><div class="character-thumbnail"><img src="resources/images/bgs/5.png" class="character-thumbnail-background"><img class="character-thumbnail-image" src="resources/images/thumb/1388.png" onerror="this.src='resources/images/thumb/noimage.png';"><img rel="popover" tabindex="3" src="resources/images/frames/5.png" class="character-thumbnail-frame" data-html="true" data-trigger="focus" data-toggle="popover" data-placement="bottom" data-content="Rarity: 5★<br/>Level: 1/80<br/>Level: 4/4<br/>Level: 1/5<br/>: 0%" title="" data-original-title="<font color='#F96700'><br/>(version)</font>"><img src="resources/images/elements/5.png" class="character-thumbnail-element"></div><div class="character-thumbnail"><img src="resources/images/bgs/6.png" class="character-thumbnail-background"><img class="character-thumbnail-image" src="resources/images/thumb/1323.png" onerror="this.src='resources/images/thumb/noimage.png';"><img rel="popover" tabindex="4" src="resources/images/frames/6.png" class="character-thumbnail-frame" data-html="true" data-trigger="focus" data-toggle="popover" data-placement="bottom" data-content="Rarity: 6★<br/>Level: 200/200<br/>Level: 4/4<br/>Level: 1/5<br/>: 150%<br/>1: 0/10<br/>2: 0/10<br/>3: 0/10<br/>" title="<font color='red'><br/>(version)</font>"><img src="resources/images/elements/3.png" class="character-thumbnail-element"></div><div class="character-thumbnail"><img src="resources/images/bgs/5.png" class="character-thumbnail-background"><img class="character-thumbnail-image" src="resources/images/thumb/1322.png"

Output:
1535
1510
1403
1388
1323
1322

To create the file you can use str.format. For example:

s = """<div class="parent"><img class="img" title="" src="images/char/{}.png"></div>"""
result = [1432, 1250, 1324]  # <-- your result with the punctuation removed
with open("data.txt", "w") as fp:
    for item in result:
        # print() appends the newline for each line
        print(s.format(item), file=fp)

This creates data.txt with the following content:

<div class="parent"><img class="img" title="" src="images/char/1432.png"></div>
<div class="parent"><img class="img" title="" src="images/char/1250.png"></div>
<div class="parent"><img class="img" title="" src="images/char/1324.png"></div>
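
The result list in the snippet above is hard-coded integers, while re.findall returns strings. As a minimal sketch, assuming the IDs arrive as strings, the same template can be filled in with a list comprehension before anything is written to disk. The names TEMPLATE and render_lines are hypothetical, introduced only for this illustration.

```python
# Template with a single {} placeholder for the image id
TEMPLATE = '<div class="parent"><img class="img" title="" src="images/char/{}.png"></div>'


def render_lines(ids):
    """Return one filled-in template line per id (hypothetical helper)."""
    return [TEMPLATE.format(i) for i in ids]


# str.format accepts strings as well as integers
lines = render_lines(["1535", "1510", "1403"])
print("\n".join(lines))
```

Building the lines first keeps the formatting step separate from the file writing, so the list can be inspected or tested before it is saved.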


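The pattern (\d{4})\.(?=png) should do the job, where:

  • (\d{4}) captures exactly four digits
  • \. matches the literal dot that follows them
  • (?=png) is a lookahead that requires png to come next without consuming it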

If you want to add support for another extension, for example jpeg, you can change the pattern to (\d{4})\.(?=png|jpeg)
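
To make the difference between the two patterns concrete, here is a minimal sketch. The sample string is made up for illustration and is not taken from the asker's file.

```python
import re

# Hypothetical sample containing one .png and one .jpeg file name
sample = 'src="images/char/1432.png" src="images/char/1324.jpeg"'

png_only = re.compile(r'(\d{4})\.(?=png)')
png_or_jpeg = re.compile(r'(\d{4})\.(?=png|jpeg)')

# findall returns only the captured group, so no punctuation needs stripping
png_matches = png_only.findall(sample)
both_matches = png_or_jpeg.findall(sample)
print(png_matches)   # ['1432']
print(both_matches)  # ['1432', '1324']
```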

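For testing online I wrote the code below with a hard-coded string, but it should work just as well if you load the file and then use findall. The rest is up to you.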
import re
# the last entry is a .jpeg, so (?=png) skips it
string = '''<div class="parent"><img class="img" title="" src="images/char/1432.png" ></div>
<div class="parent"><img class="img" title="" src="images/char/1250.png" ></div>
<div class="parent"><img class="img" title="" src="images/char/1324.png" ></div>
<div class="parent"><img class="img" title="" src="images/char/1324.jpeg" ></div>'''
pattern = re.compile(r'(\d{4})\.(?=png)')
print(pattern.findall(string))

which outputs:

['1432', '1250', '1324']
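
One way the remaining steps might fit together is sketched below: read the source file, collect the IDs with findall, and write one tag per ID. This is only an illustration under stated assumptions. The convert function, the PATTERN and TEMPLATE names, and the temporary file names are all hypothetical, and the demo input is a shortened stand-in for the real data file.

```python
import re
import tempfile
from pathlib import Path

PATTERN = re.compile(r'(\d{4})\.(?=png)')
TEMPLATE = '<div class="parent"><img class="img" title="" src="images/char/{}.png" ></div>'


def convert(src_path, dst_path):
    """Write one tag per 4-digit id found in src_path to dst_path (hypothetical helper)."""
    text = Path(src_path).read_text(encoding="utf-8")
    ids = PATTERN.findall(text)
    with open(dst_path, "w", encoding="utf-8") as fp:
        for item in ids:
            print(TEMPLATE.format(item), file=fp)
    return ids


# Demo in a temporary directory so no real paths are assumed
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "image_data.txt"
    dst = Path(tmp) / "out.txt"
    src.write_text(
        '<img src="resources/images/thumb/1535.png">'
        '<img src="resources/images/bgs/5.png">'
        '<img src="resources/images/thumb/1510.png">',
        encoding="utf-8",
    )
    found = convert(src, dst)
    written = dst.read_text(encoding="utf-8")

print(found)
print(written, end="")
```

Because the pattern requires four digits before the dot, single-digit names such as 5.png in the sample data are skipped.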
