Python PyPDF2: crop PDF pages and merge them into a single page

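I'm trying to crop several PDF pages and merge them into a single roll-style page. I need to remove the header and footer from each page and produce one long, roll-style page.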

import PyPDF2
from PyPDF2 import PdfFileReader, PdfFileWriter
from PyPDF2.pdf import PageObject

reader = PdfFileReader('/Users/kic/Desktop/test.pdf','r')
writer = PdfFileWriter()

### find total height of file ###
numpages = reader.getNumPages() ## get number of pages
Height = reader.getPage(0).mediaBox.getHeight() ## get height of title page
Height = Height + 482 * (reader.getNumPages()-2) ## add the height of the cropped pages

### create new single roll page ###
Single_page = PageObject.createBlankPage(None, reader.getPage(0).mediaBox.getWidth(), Height)

### add first title page without cropping ###
Single_page.mergeTranslatedPage(reader.getPage(0),0,Height-reader.getPage(0).mediaBox.getHeight(),False)

### loop through all pages from page 2 until last page ###
n=1
for i in range(reader.getNumPages()-1):
    i=n
    page = reader.getPage(i)
    page.cropBox.setUpperLeft((0,556))
    page.cropBox.setUpperRight((page.mediaBox.getWidth(),556))
    page.cropBox.setLowerLeft((0,74))
    page.cropBox.setLowerRight((page.mediaBox.getWidth(),74))
    Single_page.mergeTranslatedPage(page,0,482*(numpages-1-n),False)
    #writer.addPage(page) ## to see the result of the cropped pages without merging
    n = n+1

writer.addPage(Single_page)
output = open('/Users/kic/Desktop/testrcrop.pdf','wb')
writer.write(output)
output.close()

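For some reason it does not crop. The pages are merged into one page, but the headers and footers overlap each other. However, if I skip the merge and just write the cropped pages to a multi-page PDF file, they do show up cropped.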

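You should translate the crop region and set new bounds on the page's media box, so that they determine where the page ends up on the new page.
The translation is relative to the current position, whereas the crop bounds are absolute values.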
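
To make the relative versus absolute distinction concrete, here is a minimal sketch in plain Python with no PyPDF2 involved. The helper name `shift_box` is hypothetical and not part of any library. It models a box as a pair of y coordinates and shows that once the content has been moved by `ty`, the box has to be restated in the absolute coordinates of the destination page.

```python
def shift_box(lower, upper, ty):
    """Return the absolute (lower, upper) bounds of a box whose content was moved by ty.

    Hypothetical helper for illustration only: `ty` is a relative offset,
    while the returned bounds are absolute y coordinates.
    """
    return (lower + ty, upper + ty)


# A crop region spanning y=74..556 on the source page, moved up by 100 units,
# has to be cropped at y=174..656 on the destination page.
print(shift_box(74, 556, 100))
```

The same pair of steps, translate first and then restate the bounds, is what the media box assignment in the answer's code does.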

According to the mergeTranslatedPage() documentation, that method is deprecated; you should use add_transformation() and merge_page() instead.

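The line y_translation = total_height - upper_bound computes how far each page has to be translated, taking into account where the crop region of the previously inserted page ended on the new page.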

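For example:
Say you have 4 pages of height 800, and pages 2 through the last are cropped to the region between y=600 and y=200. The new page then needs a height of 2000 (800 + 3 * 400).
The top of the first page is at 800, so you translate it by 1200 and crop it between 2000 and 1200. The top of the second page's crop region is at 600 and has to land at 1200 (2000 - 800), so you translate it by 600 and crop it between 1200 and 800. Continuing this way gives, for each page, the translation T and the upper and lower crop bounds U and B: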

T=1200, U=2000, B=1200
T=600, U=1200, B=800
T=200, U=800, B=400
T=-200, U=400, B=0
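
The arithmetic above can be reproduced without touching a PDF. The following is a small sketch under stated assumptions: the function name `layout` and its parameters are made up for illustration, every page after the first is assumed to share the same crop bounds, and the first page is kept whole.

```python
def layout(first_height, crop_top, crop_bottom, num_pages):
    """Return (total_height, rows), one (T, U, B) row per page.

    Hypothetical helper: T is the y translation, U and B are the upper and
    lower crop bounds in the coordinates of the merged page.
    """
    crop_height = crop_top - crop_bottom
    total_height = first_height + crop_height * (num_pages - 1)
    rows = []
    remaining = total_height  # y coordinate where the next region's top must land
    for i in range(num_pages):
        upper = first_height if i == 0 else crop_top
        lower = 0 if i == 0 else crop_bottom
        ty = remaining - upper
        rows.append((ty, upper + ty, lower + ty))
        # Move down by the height of the region that was just placed
        remaining -= upper - lower
    return total_height, rows


total, rows = layout(first_height=800, crop_top=600, crop_bottom=200, num_pages=4)
print(total)
for t, u, b in rows:
    print(f"T={t}, U={u}, B={b}")
```

Note that the running position decreases by the height of the placed region (upper minus lower), which is what makes the third and fourth rows come out as T=200 and T=-200.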

from PyPDF2 import PageObject, PdfFileReader, PdfFileWriter, Transformation

# Define the crop bounds for pages other than the first page
CROP_Y_TOP = 556
CROP_Y_HEIGHT = 482

# Set the input and output file paths
input_path = r"input.pdf"
output_path = r"output.pdf"

# Open the input and output files in binary read and write mode
with open(input_path, "rb") as input_file, open(output_path, "wb") as output_file:
    # Create a PdfFileReader object for the input file
    reader = PdfFileReader(input_file)
    # Create a PdfFileWriter object for the output file
    writer = PdfFileWriter()

    # Calculate the total height of the output page
    total_height = reader.getPage(0).mediabox.height + (CROP_Y_HEIGHT * (reader.getNumPages() - 1))

    # Create a blank page with the calculated total height
    single_page = PageObject.create_blank_page(
        pdf=None,
        width=reader.getPage(0).mediabox.width,
        height=total_height
    )

    # Loop through all pages of the input document
    for i in range(reader.getNumPages()):
        # Get the current page and its media box
        page = reader.getPage(i)
        original_mediabox = page.mediabox

        # Determine the upper and lower bounds for the crop
        # (the first page is kept whole)
        upper_bound = original_mediabox.height if i == 0 else CROP_Y_TOP
        lower_bound = 0 if i == 0 else CROP_Y_TOP - CROP_Y_HEIGHT

        # Calculate the translation to apply to the page
        y_translation = total_height - upper_bound

        # Create a transformation object with the calculated translation
        transformation = Transformation().translate(ty=y_translation)

        # Apply the transformation to the page
        page.add_transformation(transformation)

        # Update the page media box with the new bounds
        page.mediabox.lower_left = (0, lower_bound + y_translation)
        page.mediabox.upper_right = (original_mediabox.width, upper_bound + y_translation)
        print(f"T={y_translation}\tU={upper_bound + y_translation}\tL={lower_bound + y_translation}")

        # Merge the transformed page onto the output page
        single_page.merge_page(page)

        # Decrease the remaining height by the height of the region just placed
        total_height -= upper_bound - lower_bound

    # Add the output page to the writer
    writer.addPage(single_page)

    # Write the output file
    writer.write(output_file)
