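I'm trying to import multiple files from different folders (in different directories) on a shared drive. But when I use the change-directory function, I can only select one path.
Can I import files from multiple folders? The idea is to import 2-3 different files (.txt or .pdf) and merge them into a single output file.
So far I have been using the following code:
pip install PyPDF2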
from PyPDF2 import PdfFileMerger
import os
chdir = os.chdir("C:/Users/47124")
merger = PdfFileMerger()
input1 = open("File1", "rb") #what is rb and wb?
input2 = open("File2", "rb")
merger.append(fileobj = input1)
merger.append(fileobj = input2)
output = open("document-output.pdf","wb")
merger.write(output)
output.close()
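Note: File1 and File2 are in different locations; they cannot be placed in a single folder.
There is no need to change directories in the Python code; just specify the file names as absolute (full) paths. So change a relative path such as File1.pdf
or MyFolder/File1.pdf
to an absolute one, e.g. /home/ptklearning/Documents/MyFolder/File1.pdf
or C:\Users\ptklearning\Documents\MyFolder\File1.pdf
or wherever the file actually is.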
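As a minimal sketch of the point above: files in two unrelated directories can be opened by their full paths without any os.chdir call. The directory and file names here are hypothetical; they are created in temporary folders only so the snippet runs anywhere.

```python
import os
import tempfile
from pathlib import Path

# Two unrelated temporary directories stand in for the shared-drive folders.
dir_a = Path(tempfile.mkdtemp())
dir_b = Path(tempfile.mkdtemp())

# Hypothetical input files, one per directory.
file1 = dir_a / "File1.txt"
file2 = dir_b / "File2.txt"
file1.write_bytes(b"from folder A\n")
file2.write_bytes(b"from folder B\n")

cwd_before = os.getcwd()

# Each file is opened by its absolute path; the working directory is never changed.
contents = []
for path in (file1, file2):
    with open(path, "rb") as file_input:
        contents.append(file_input.read())

combined = b"".join(contents)
print(combined)
```

With real files, the two temporary paths would simply be replaced by the actual absolute paths on the shared drive.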
Sample code:
merger = bytes()
filenames = [
'/home/nponcian/Documents/GitHub/myproject/src/notes/file1.py',
'/home/nponcian/Documents/Program/file2.txt',
# Add more files as you wish. Note that they would be appended as raw bytes.
]
filename_output = "document-output.txt"
for filename in filenames:
    with open(filename, 'rb') as file_input:
        file_content = file_input.read()
        merger += file_content
with open(filename_output, 'wb') as file_output:
    file_output.write(merger)
Contents of document-output.txt:
# This is file1.py
# A dummy line within file1.py
While this is file2.txt!
Nothing more from this 2nd file...
file2 wants to say goodbye now!
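If the inputs are large, the same raw-byte merge can be done without holding everything in memory. The following is only a sketch of that variation: merge_files is a hypothetical helper name, and the input files are created in a temporary folder for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def merge_files(input_paths, output_path):
    """Concatenate the raw bytes of input_paths, in order, into output_path."""
    with open(output_path, "wb") as file_output:
        for input_path in input_paths:
            with open(input_path, "rb") as file_input:
                # Copies in chunks instead of reading the whole file at once.
                shutil.copyfileobj(file_input, file_output)


# Hypothetical inputs living in two different sub-folders.
base = Path(tempfile.mkdtemp())
(base / "a").mkdir()
(base / "b").mkdir()
first = base / "a" / "file1.py"
second = base / "b" / "file2.txt"
first.write_bytes(b"# This is file1.py\n")
second.write_bytes(b"While this is file2.txt!\n")

merged = base / "document-output.txt"
merge_files([first, second], merged)
print(merged.read_bytes())
```

As with the code above, this appends raw bytes, so it is suitable for plain text files but not for PDFs, which need a PDF-aware merger.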
Sample code (using PDF files):
from PyPDF2 import PdfFileMerger, PdfFileReader
merger = PdfFileMerger()
filenames = [
'/home/nponcian/Documents/Program/StackOverflow_how_to_python.pdf',
'/media/sf_VirtualBoxFiles/python_cheat_sheet.pdf',
# Add more files as you wish. Note that they would be appended as PDF files.
]
filename_output = "document-output.pdf"
for filename in filenames:
    merger.append(PdfFileReader(filename, strict=False))
with open(filename_output, 'wb') as file_output:
    merger.write(file_output)
Contents of document-output.pdf:
<Combined PDF of the input files>
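Notes:
rb means read-binary.
wb means write-binary (use it if you always want to overwrite the output file).
ab means append-binary (use it if you only want to add the new merge to the end of the output file).
The explicit b (binary) is needed when the files you are processing are not ordinary text files (e.g. docx, pdf, mp3). Try opening them with a text editor and you'll see what I mean :) Such files are read as Python bytes objects, not Python str objects.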
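The difference between these modes can be seen in a small sketch. The file name below is hypothetical and lives in a temporary folder so that no real data is touched.

```python
import os
import tempfile

# Throwaway file path (hypothetical name) inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "modes-demo.bin")

with open(path, "wb") as f:  # "wb" creates the file, or truncates an existing one
    f.write(b"first")
with open(path, "wb") as f:  # opening with "wb" again discards "first"
    f.write(b"second")
with open(path, "ab") as f:  # "ab" keeps the existing bytes and adds to the end
    f.write(b"+third")

with open(path, "rb") as f:  # binary mode returns a bytes object
    as_bytes = f.read()
with open(path, "r", encoding="ascii") as f:  # text mode decodes to a str object
    as_text = f.read()

print(type(as_bytes).__name__, as_bytes)
print(type(as_text).__name__, as_text)
```

Text mode only works here because the demo file happens to contain ASCII text; for formats such as pdf or mp3, decoding would typically fail or corrupt the data, which is why the b is needed.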