PDF Manipulation - Merging and Splitting
Library choice
A common modern option is pypdf (the successor to PyPDF2).
Merge PDFs
pdf_merge.py
from pypdf import PdfWriter

# PdfMerger was removed in pypdf 5.0; PdfWriter.append() does the same job.
merger = PdfWriter()
merger.append("a.pdf")  # files are appended in call order
merger.append("b.pdf")
merger.write("merged.pdf")
merger.close()
Split a PDF
pdf_split.py
from pypdf import PdfReader, PdfWriter

reader = PdfReader("input.pdf")
# Write each page to its own file: page_1.pdf, page_2.pdf, ...
for i, page in enumerate(reader.pages, start=1):
    writer = PdfWriter()
    writer.add_page(page)
    with open(f"page_{i}.pdf", "wb") as f:
        writer.write(f)
Notes
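- Encrypted PDFs need extra handling: check reader.is_encrypted and call reader.decrypt(password) before reading pages.
- Layout and text extraction is a different problem (covered on the next page).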
