Add "Question {number}" to the top of a merged PDF document



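Much to my own surprise, I have managed to write some Python code that merges several PDF files whose paths are listed in an Excel document. Each PDF represents one question, and the merged document forms a question worksheet for students. What is still missing is a "Question [number]" label at the top of each PDF. The question number will match the page number, i.e. question 10 is on page 10. This is my current code: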

import xlwings as xw
from PyPDF2 import PdfMerger, PdfReader
import openpyxl as op

# read the PDF paths from the workbook, skipping empty cells
wbxl = xw.Book('demo.xlsm')
get_links = wbxl.sheets['Sheet1'].range('C2:C5').value
filenames = []
for file in get_links:
    if file is not None:
        filenames.append(file)

# merge the PDFs in the order they are listed
merged_pdf = PdfMerger()
for filename in filenames:
    merged_pdf.append(filename)  # a path is enough; no file mode is needed
output_name = wbxl.sheets['Sheet1'].range('C7').value
merged_pdf.write(output_name + ".pdf")
merged_pdf.close()

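Is there a way to add a few lines that put these question numbers on the pages? Also, the PDF files contain images, tables and so on, so converting to Word and using the python-docx library would probably not work.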

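PyMuPDF makes it easy both to merge PDFs and to write text onto pages.

From your code it is not clear to me whether each PDF has only one page. The code below reads each PDF in filenames, puts a header line on every page, and then appends the modified PDF to the output PDF.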

import fitz  # import package PyMuPDF

# assuming 'filenames' = list of PDF file names.
outpdf = fitz.open()  # output PDF
for i, filename in enumerate(filenames, start=1):
    src = fitz.open(filename)  # open an input file
    page_count = src.page_count  # its number of pages
    for page in src:  # iterate through input pages
        # put a centered header on each page, half an inch from top
        rect = fitz.Rect(0, 36, page.rect.width, 72)
        page.insert_textbox(
            rect,
            # page.number is 0-based, so add 1 for a human-readable page number
            f"Question {i}, page {page.number + 1} of {page_count}",
            align=fitz.TEXT_ALIGN_CENTER,
        )
    # input PDF modified, now save to memory, re-open as input and append
    pdfbytes = src.tobytes()  # save to memory as bytes object
    src = fitz.open("pdf", pdfbytes)  # re-open
    outpdf.insert_pdf(src)  # append input PDF
outpdf.save("output.pdf", garbage=3, deflate=True)

The Page method insert_textbox has more parameters for setting the text color, choosing the font (the default is Helvetica/Arial), the font size, and so on.

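Rectangles in PyMuPDF are defined as fitz.Rect(left, top, right, bottom). So the rectangle above starts 36 points below the top of the page (72 points = 1 inch) and is 36 points (0.5 inch) high. The default font size is 11 points, so only about half of that height is actually used. Change any of this as needed.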
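
The point arithmetic above can be sketched as a small helper that converts inches to points. The name header_rect and its parameters are hypothetical and not part of PyMuPDF; the returned tuple is in the (left, top, right, bottom) order that fitz.Rect expects, so it could be used as fitz.Rect(*header_rect(page.rect.width)).

```python
POINTS_PER_INCH = 72  # PDF coordinates are measured in points

def header_rect(page_width, top_in=0.5, height_in=0.5):
    """Return (left, top, right, bottom) in points for a full-width header strip.

    top_in    -- distance of the strip from the top of the page, in inches
    height_in -- height of the strip, in inches
    """
    top = top_in * POINTS_PER_INCH
    bottom = top + height_in * POINTS_PER_INCH
    return (0, top, page_width, bottom)

# an A4 page is 595 points wide; the defaults reproduce the rectangle from the answer
print(header_rect(595))
```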

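Thanks to Jorj for the guidance and ideas. I modified the code, because it did not quite meet our needs. I am posting my working code below. The clean_contents part is very useful, because it makes the text look the same on every page: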

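import fitz  # PyMuPDF

def insert_qu_numbers(document):
    qu_numbers = fitz.open(document)
    counter = 0
    for page in qu_numbers:
        page.clean_contents()  # clean up the page's content before writing
        counter += 1  # one question per page, so this is also the page number
        text = f"Question {counter}"
        # get_text_length uses its default font size (11) here, not the 16 used below
        text_length = fitz.get_text_length(text, fontname="times-roman")
        print(text_length)
        # box anchored at (70, 50), wide enough for the text plus some padding
        rect_x0 = 70
        rect_y0 = 50
        rect_x1 = rect_x0 + text_length + 35
        rect_y1 = rect_y0 + 40
        rect = fitz.Rect(rect_x0, rect_y0, rect_x1, rect_y1)
        page.insert_textbox(rect, text, fontsize=16, fontname="times-roman", align=0)
    # added here: return the document so that the caller can save it
    return qu_numbers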
