Modify a line in a file once it has been found with a wildcard re.match

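I am dynamically rewriting a super simple HTML page after retrieving a value over a socket. Essentially, this pulls a track name from my Squeezebox and tries to write it into the HTML. The first part of the line is always the same, but the track title needs to change. I'm sure this is very simple, but I have spent hours trawling different sites and looking at different methods, so it's time to ask for help.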

The HTML contains the following line, among others:

<p class="GeneratedText">Someone Like You</p>

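I then try to run the following to find that line. It is always the same line number, but I tried using readline, and from what I read it reads everything in anyway: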

import socket
import urllib
import fileinput
import re
# connect to my Squeezebox - retrieve the track name and clean it up ready for insertion
clientsocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
clientsocket.connect(("192.168.1.10", 9090))
clientsocket.send("00:04:00:00:00:00 title ?\n")
str = clientsocket.recv(100)
title=str.strip( '00%3A00%3A00%3A00%3A00%3A00 title' );
result = urllib.unquote(title)
#try and overwrite the line in we.html so it looks like <p class="GeneratedText">Now playing track</p>
with open('we.html', 'r+') as f:
        for line in f:
           if re.match("(.*)p class(.*)",line):
              data=line
              print data
              f.write( line.replace(data,'<p class="GeneratedText">'title'</p>'))

A quick solution might be to use the fileinput module that you are already importing.

Your code would then look like this:

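for line in fileinput.input('we.html', inplace=True):
    if re.match("(.*)p class(.*)", line):
        # With inplace=True, printed output is written back into we.html
        print '<p class="GeneratedText">' + title + '</p>'
    else:
        # Trailing comma: line already ends with a newline, so don't add another
        print line,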

Put this block in place of your with block.
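For anyone on Python 3, the same idea can be sketched as below. This is only an illustration under a few assumptions: the function name `replace_generated_text` is made up for the example, and the pattern is tightened to match the `GeneratedText` paragraph specifically rather than any `p class` line. Like the loop above, it replaces the whole matching line, so any leading indentation on that line is dropped.

```python
import fileinput
import re


def replace_generated_text(path, title):
    """Rewrite the <p class="GeneratedText"> line of `path` in place.

    Hypothetical helper for illustration; not part of the original code.
    """
    pattern = re.compile(r'.*<p class="GeneratedText">')
    # With inplace=True, anything printed inside the loop is written to the file.
    with fileinput.input(files=(path,), inplace=True) as lines:
        for line in lines:
            if pattern.match(line):
                print('<p class="GeneratedText">' + title + '</p>')
            else:
                # end="" because `line` already ends with its own newline.
                print(line, end="")
```

It would be called as `replace_generated_text('we.html', result)`, where `result` is the unquoted title from the question's code.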

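However, if you want a cleaner solution, you should take a look at Beautiful Soup, a Python library for manipulating structured documents.

You will still need to install the module via pip and import BeautifulSoup, but this code should get you up and running after that: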

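from bs4 import BeautifulSoup

with open('we.html', 'r') as html:
    # Name the parser explicitly so the result does not depend on what is installed
    soup = BeautifulSoup(html, 'html.parser')
for paragraph in soup.find_all('p', class_='GeneratedText'):
    paragraph.string = title
# prettify('utf-8') returns encoded bytes, so open the file in binary mode
with open('we.html', 'wb') as html:
    html.write(soup.prettify('utf-8'))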

If this occurs only once in the whole page, you can simply do the following:

new_html = re.sub('(?<=<p class="GeneratedText">)(.*)(?=</p>)',
                  "WhateverYouWantGoesHere",
                   html_file_as_string)

This replaces everything between the two tags with whatever you want.
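As a small self-contained sketch of that substitution (Python 3; the function name `set_generated_text` and the sample page are assumptions made for the example), a non-greedy `.*?` keeps the match from running past the first `</p>` on the line, and a function replacement inserts the title literally:

```python
import re


def set_generated_text(html_text, title):
    """Return html_text with the GeneratedText paragraph's content replaced.

    Hypothetical helper for illustration only.
    """
    # Non-greedy .*? stops at the first </p>; the lookarounds keep both tags.
    pattern = r'(?<=<p class="GeneratedText">).*?(?=</p>)'
    # A function replacement inserts `title` as-is, so backslashes in it
    # are not interpreted as escapes or group references.
    return re.sub(pattern, lambda m: title, html_text, count=1)


page = '<body>\n<p class="GeneratedText">Someone Like You</p>\n</body>\n'
print(set_generated_text(page, "Rolling in the Deep"))
```

Because `.` does not match newlines by default, this only works when the opening and closing tags sit on the same line, as they do in the question.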

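with open('output.html', 'w') as o:
    with open('we.html', 'r') as f:
        for line in f:
            # Groups 1 and 3 keep the tags; only the text between them changes
            o.write(re.sub(r'(<p\sclass="GeneratedText">)(.*?)(</p>)',
                           lambda m: m.group(1) + newTitle + m.group(3),
                           line))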
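The loop above writes its result to a separate output.html. If the change should end up back in we.html, one possible approach (a sketch, not something the answers above prescribe) is to write to a temporary file and then swap it over the original. The function name `rewrite_title_in_place` is invented for this example, and the code assumes Python 3.

```python
import os
import re
import tempfile


def rewrite_title_in_place(path, new_title):
    """Rewrite `path` through a temporary file, then swap it in.

    Hypothetical helper for illustration only.
    """
    pattern = re.compile(r'(<p\s+class="GeneratedText">).*?(</p>)')
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so os.replace stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as out, open(path, "r") as src:
            for line in src:
                # Keep both tags (groups 1 and 2) and swap only the text between them.
                out.write(pattern.sub(
                    lambda m: m.group(1) + new_title + m.group(2), line))
        # Swap the finished file over the original in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything went wrong, then re-raise.
        os.remove(tmp_path)
        raise
```

Reading and writing through two separate handles also sidesteps the problem in the question, where iterating over and writing to the same `r+` handle interfere with each other.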
