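I wrote a script with python-docx that searches a Word document (by scanning its runs) for reference numbers and technical keywords, then builds a table summarizing the search results and appends it to the end of the document.
Some of the documents are 100+ pages long, so I want to make things easier for the user by putting internal hyperlinks in the results table that jump to the place in the document where each hit was found.
Once I have found the run containing a reference, I do not know how to mark it as a bookmark, or how to create a hyperlink to that bookmark in the results table.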
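I was able to create links to external URLs using the code from this page: Adding a hyperlink in MSWord using python-docx
I also tried to create bookmarks and found this page: https://github.com/python-openxml/python-docx/issues/109
The title is about creating bookmarks, but the code seems to generate numbers in Word.
I feel the two solutions could be combined, but I do not know enough about XML or the Word document format to do it.
Update: I found some code that adds a bookmark to a Word document. What I need now is a way to link to it from elsewhere in the same document: https://github.com/python-openxml/python-docx/issues/403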
import docx
from docx import Document


def add_bookmark(paragraph, bookmark_text, bookmark_name):
    run = paragraph.add_run()
    tag = run._r  # for reference the following also works: tag = document.element.xpath('//w:r')[-1]
    # opening marker of the bookmark
    start = docx.oxml.shared.OxmlElement('w:bookmarkStart')
    start.set(docx.oxml.ns.qn('w:id'), '0')
    start.set(docx.oxml.ns.qn('w:name'), bookmark_name)
    tag.append(start)
    # run holding the bookmarked text
    text = docx.oxml.OxmlElement('w:r')
    text.text = bookmark_text
    tag.append(text)
    # closing marker of the bookmark
    end = docx.oxml.shared.OxmlElement('w:bookmarkEnd')
    end.set(docx.oxml.ns.qn('w:id'), '0')
    end.set(docx.oxml.ns.qn('w:name'), bookmark_name)
    tag.append(end)


doc = Document("test_input_1.docx")
# add a bookmark to every paragraph
for paranum, paragraph in enumerate(doc.paragraphs):
    add_bookmark(paragraph=paragraph, bookmark_text=f"temp{paranum}", bookmark_name=f"temp{paranum+1}")
doc.save("output.docx")
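Solved: I got it from this post, Add hyperlink to bookmark.
This is the key line:
hyperlink.set(docx.oxml.shared.qn('w:anchor'), link_to,)
As a bonus, I added the option to attach a tooltip to the link.
Enjoy.
The answer is below: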
from docx import Document
import docx
from docx.enum.dml import MSO_THEME_COLOR_INDEX


def add_bookmark(paragraph, bookmark_text, bookmark_name):
    run = paragraph.add_run()
    tag = run._r
    # opening marker of the bookmark
    start = docx.oxml.shared.OxmlElement('w:bookmarkStart')
    start.set(docx.oxml.ns.qn('w:id'), '0')
    start.set(docx.oxml.ns.qn('w:name'), bookmark_name)
    tag.append(start)
    # run holding the bookmarked text
    text = docx.oxml.OxmlElement('w:r')
    text.text = bookmark_text
    tag.append(text)
    # closing marker of the bookmark
    end = docx.oxml.shared.OxmlElement('w:bookmarkEnd')
    end.set(docx.oxml.ns.qn('w:id'), '0')
    end.set(docx.oxml.ns.qn('w:name'), bookmark_name)
    tag.append(end)


def add_link(paragraph, link_to, text, tool_tip=None):
    # create hyperlink node
    hyperlink = docx.oxml.shared.OxmlElement('w:hyperlink')
    # set attribute for link to bookmark
    hyperlink.set(docx.oxml.shared.qn('w:anchor'), link_to,)
    if tool_tip is not None:
        # set attribute for the tooltip shown on hover
        hyperlink.set(docx.oxml.shared.qn('w:tooltip'), tool_tip,)
    # run that carries the visible link text
    new_run = docx.oxml.shared.OxmlElement('w:r')
    rPr = docx.oxml.shared.OxmlElement('w:rPr')
    new_run.append(rPr)
    new_run.text = text
    hyperlink.append(new_run)
    # attach the hyperlink and style it like a link
    r = paragraph.add_run()
    r._r.append(hyperlink)
    r.font.name = "Calibri"
    r.font.color.theme_color = MSO_THEME_COLOR_INDEX.HYPERLINK
    r.font.underline = True


# test the functions
if __name__ == "__main__":
    # input test document
    doc = Document(r"test_input_1.docx")
    # add a bookmark to every paragraph (names start at temp1)
    for paranum, paragraph in enumerate(doc.paragraphs):
        add_bookmark(paragraph=paragraph,
                     bookmark_text=f"{paranum}", bookmark_name=f"temp{paranum+1}")
    # add page to the end to put your link
    doc.add_page_break()
    paragraph = doc.add_paragraph("This is where the internal link will live")
    # add a link to the first paragraph, whose bookmark is named temp1
    add_link(paragraph=paragraph, link_to="temp1",
             text="this is a link to ", tool_tip="your message here")
    doc.save(r"output.docx")
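A link only jumps somewhere if its w:anchor value matches the w:name of some bookmark, and the helpers above give every bookmark the same w:id of '0'. The following is a minimal sketch, using only the standard library, that opens a saved .docx and reports anchors with no matching bookmark as well as reused bookmark ids. The function name check_internal_links is made up for this example, and it only looks at the main document part (word/document.xml).

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (the "w:" prefix in the XML)
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def check_internal_links(docx_file):
    """Return (dangling_anchors, duplicate_ids) for word/document.xml.

    docx_file may be a path or a file-like object. Hypothetical helper,
    not part of python-docx.
    """
    with zipfile.ZipFile(docx_file) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    starts = root.findall(f".//{{{W}}}bookmarkStart")
    names = {b.get(f"{{{W}}}name") for b in starts}
    ids = [b.get(f"{{{W}}}id") for b in starts]

    # collect the w:anchor of every hyperlink; external links have none
    anchors = [h.get(f"{{{W}}}anchor") for h in root.iter(f"{{{W}}}hyperlink")]

    dangling = sorted({a for a in anchors if a is not None and a not in names})
    duplicates = sorted({i for i in ids if ids.count(i) > 1})
    return dangling, duplicates
```

Calling check_internal_links("output.docx") after saving should return an empty first list if every link has a target. A non-empty second list means several bookmarks share one id, which Word tends to tolerate but which is worth fixing by numbering the bookmarks.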
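The previous solution did not work for me in LibreOffice (6.4).
After inspecting the XML of two documents, one with a bookmark and one without, and after reading http://officeopenxml.com/WPbookmark.php, we can see the following:
For the bookmark, the fix is to add the bookmark to the paragraph rather than to the run. So in this line:
tag = run._r # for reference the following also works: tag = document.element.xpath('//w:r')[-1]
you should change the "r" in ('//w:r') to "p":
tag = doc.element.xpath('//w:p')[-1]
Then it works. Note that this XPath picks the last paragraph in the document; inside add_bookmark, tag = paragraph._p targets the paragraph that was actually passed in.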
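To make the target structure concrete, here is a small sketch built with the standard library's ElementTree rather than python-docx, so it runs on its own. It wraps an existing run in a bookmark by placing w:bookmarkStart and w:bookmarkEnd as siblings of the run inside the paragraph. The helper names w and wrap_run_in_bookmark and the sample text REF-001 are invented for illustration.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W)  # serialize with the usual "w:" prefix


def w(tag):
    """Expand a local name such as 'p' into its namespaced form."""
    return f"{{{W}}}{tag}"


def wrap_run_in_bookmark(p, r, name, bookmark_id):
    """Insert bookmark markers around run r, as children of paragraph p."""
    start = ET.Element(w("bookmarkStart"),
                       {w("id"): str(bookmark_id), w("name"): name})
    end = ET.Element(w("bookmarkEnd"), {w("id"): str(bookmark_id)})
    pos = list(p).index(r)
    p.insert(pos, start)    # marker goes right before the run
    p.insert(pos + 2, end)  # run moved to pos + 1, so the end goes after it


# a paragraph with a single run, standing in for a search hit
p = ET.Element(w("p"))
r = ET.SubElement(p, w("r"))
ET.SubElement(r, w("t")).text = "REF-001"

wrap_run_in_bookmark(p, r, "hit_1", 1)
print(ET.tostring(p, encoding="unicode"))
```

The printed paragraph contains the start marker, the run, and the end marker in that order. python-docx elements are lxml objects, so the equivalent there would be r.addprevious(start) and r.addnext(end) on run._r, with the markers created through OxmlElement.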
The same applies to the link; here is the function:
import docx


def add_link(paragraph, link_to, text, tool_tip=None):
    # create hyperlink node
    hyperlink = docx.oxml.shared.OxmlElement('w:hyperlink')
    # set attribute for link to bookmark
    hyperlink.set(docx.oxml.shared.qn('w:anchor'), link_to,)
    if tool_tip is not None:
        # set attribute for the tooltip shown on hover
        hyperlink.set(docx.oxml.shared.qn('w:tooltip'), tool_tip,)
    new_run = docx.oxml.shared.OxmlElement('w:r')
    # here to change the font color, and add underline
    rPr = docx.oxml.shared.OxmlElement('w:rPr')
    c = docx.oxml.shared.OxmlElement('w:color')
    c.set(docx.oxml.shared.qn('w:val'), '2A6099')
    rPr.append(c)
    u = docx.oxml.shared.OxmlElement('w:u')
    u.set(docx.oxml.shared.qn('w:val'), 'single')
    rPr.append(u)
    #
    new_run.append(rPr)
    new_run.text = text
    hyperlink.append(new_run)
    paragraph._p.append(hyperlink)  # this to add the link in the w:p
    # this is wrong:
    # r = paragraph.add_run()
    # r._r.append(hyperlink)
    # r.font.name = "Calibri"
    # r.font.color.theme_color = MSO_THEME_COLOR_INDEX.HYPERLINK
    # r.font.underline = True
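When the bookmarks come from search hits instead of a paragraph counter, the link_to value has to be a name Word accepts. As far as I know, bookmark names must start with a letter, may contain only letters, digits and underscores, and are limited to 40 characters. The sketch below derives a conservative name from a hit label and a running index; make_bookmark_name and the bm_ prefix are assumptions made for this example.

```python
import re


def make_bookmark_name(label, index, max_len=40):
    """Build a bookmark name from a search-hit label and a unique index.

    Hypothetical helper: keeps ASCII letters, digits and underscores,
    guarantees a leading letter, and stays within max_len characters.
    """
    cleaned = re.sub(r"\W", "_", label, flags=re.ASCII)
    suffix = f"_{index}"
    name = "bm_" + cleaned  # the prefix guarantees a leading letter
    # truncate the label part so the index suffix always survives
    return name[: max_len - len(suffix)] + suffix
```

The same string would then be passed as bookmark_name when creating the bookmark and as link_to when creating the link in the results table, so the two cannot drift apart.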
Here is a version inspired by @timmydoger, based on SubElement:
from docx.opc.constants import RELATIONSHIP_TYPE
from docx.oxml.ns import qn
from lxml.etree import SubElement


def add_bookmark(paragraph, bookmark_text, bookmark_name):
    # bookmark markers and the run are siblings inside the paragraph (w:p)
    p = paragraph._p
    SubElement(p, qn('w:bookmarkStart'), {
        qn('w:id'): '0',
        qn('w:name'): bookmark_name,
    })
    # the text lives in a w:t child element of the run, not in an attribute
    r = SubElement(p, qn('w:r'))
    SubElement(r, qn('w:t')).text = bookmark_text
    # the end marker is paired with its start through w:id
    SubElement(p, qn('w:bookmarkEnd'), {
        qn('w:id'): '0',
    })


def add_hyperlink(paragraph, url, fragment, text):
    # register the external URL as a relationship of this document part
    part = paragraph.part
    r_id = part.relate_to(
        url, RELATIONSHIP_TYPE.HYPERLINK, is_external=True
    )
    hyperlink = SubElement(paragraph._p, qn('w:hyperlink'), {
        qn('r:id'): r_id,
        qn('w:anchor'): fragment,
        qn('w:history'): '1',
    })
    r = SubElement(hyperlink, qn('w:r'))
    rPr = SubElement(r, qn('w:rPr'))
    # use the built-in "Hyperlink" character style for the link text
    SubElement(rPr, qn('w:rStyle'), {
        qn('w:val'): 'Hyperlink',
    })
    SubElement(r, qn('w:t')).text = text