How to Merge Two or More PDF Files with Python

Merging PDF files is a task that can save you time and keep your documents well-organized.

In this tutorial, I’ll walk you through how to set up Python on your computer and use a script to merge two or more PDF files into one.

We’ll be using the PyPDF2 library, and by the end, you’ll have a clear understanding of how to run the code from your terminal.

Setting Up Python on Your Computer

If you don’t have Python installed yet, follow these steps:

Download the latest version of Python from the official Python website.
Install it by following the instructions for your operating system (Windows, macOS, or Linux).
On Windows, make sure to check the option that says “Add Python to PATH” during installation.
To verify that Python is installed correctly, open your terminal (or command prompt) and type:

python --version

This should display the installed version of Python. On macOS and Linux the command may be python3 instead.

Installing PyPDF2

Now that Python is set up, the next step is to install the PyPDF2 library, which will allow us to manipulate PDF files. To install it, open your terminal (or command prompt) and run the following command:

pip install PyPDF2

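This installs PyPDF2, the library the script relies on to merge PDFs. You can find more information in the official PyPDF2 documentation. Note that PyPDF2 is no longer developed under that name; its maintainers continue the project as pypdf. The code in this tutorial targets PyPDF2.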
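If you want to confirm that the install worked before going further, a minimal sketch like the one below can help. It uses only the standard library (Python 3.8 or later); the helper name installed_version is made up for this example and is not part of PyPDF2.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# installed_version is a hypothetical helper written for this tutorial.
version = installed_version("PyPDF2")
if version is None:
    print("PyPDF2 is not installed; run: pip install PyPDF2")
else:
    print(f"PyPDF2 {version} is installed")
```

If the first message appears even though pip reported success, pip and python probably belong to different Python installations.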

Breaking Down the Code

Now, let’s break down the code that will help you merge your PDF files. Here’s how it works, step by step:

import PyPDF2
import os

import PyPDF2: This imports the PyPDF2 library, providing the tools needed to manipulate PDF files.
import os: This imports the os module, which allows your script to interact with the file system, helping you access the PDF files in the specified folder.

Next, we define a function that merges the PDFs:

def merge_pdfs_in_folder(folder_path, output_filename):
    merger = PyPDF2.PdfMerger()

def merge_pdfs_in_folder(folder_path, output_filename):: This function merges all the PDF files in the folder specified by folder_path, saving the result as output_filename.
merger = PyPDF2.PdfMerger(): This creates an instance of the PdfMerger class, which is responsible for combining the PDF files.

Now, we loop through each file in the folder to add all the PDFs to the merger:

for file in os.listdir(folder_path):
    if file.endswith('.pdf'):
        file_path = os.path.join(folder_path, file)
        merger.append(file_path)
        print(f"Adding: {file}")

for file in os.listdir(folder_path): This loop iterates through every entry in the folder specified by folder_path. Note that os.listdir does not guarantee any particular order.
if file.endswith('.pdf'): This checks if the current file is a PDF by looking at the file extension. The comparison is case-sensitive, so a file named report.PDF would be skipped.
file_path = os.path.join(folder_path, file): This constructs the full path to each PDF file.
merger.append(file_path): This adds the PDF to the merger.
print(f"Adding: {file}"): A message is printed for each file, showing that it’s being added to the merge process.
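Because the order in which os.listdir returns names is arbitrary, the pages of the merged document may not come out in the order you expect. One possible way to make the order predictable is to collect the PDF names first and sort them. The sketch below is an optional variation, not part of the original script; list_pdfs is a name chosen for this example, and it also matches extensions such as .PDF.

```python
import os


def list_pdfs(folder_path):
    """Return full paths of the PDF files in folder_path, sorted by name."""
    # Compare the extension in lowercase so "report.PDF" is also picked up.
    names = [
        name for name in os.listdir(folder_path)
        if name.lower().endswith(".pdf")
    ]
    # Sort case-insensitively so "a.PDF" comes before "b.pdf".
    names.sort(key=str.lower)
    return [os.path.join(folder_path, name) for name in names]
```

You could then loop over list_pdfs(folder_path) and pass each path to merger.append. Sorting by name means a file called 10.pdf sorts before 2.pdf, so zero-padded names (02.pdf, 10.pdf) are the simplest way to control the order.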

Finally, we save the merged PDF:

merger.write(output_filename)
print(f"Merged PDF saved as: {output_filename}")

merger.write(output_filename): This command writes the merged PDF to the output file.
print(f"Merged PDF saved as: {output_filename}"): Once the file is saved, this message confirms that the process is complete.
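Two details are worth handling if you run the script more than once: closing the merger when you are done, and making sure a previously written output file is not merged into the new one when it sits in the same folder as the inputs. The following is a rough sketch under those assumptions; without_output and merge_pdfs are hypothetical helper names, while PdfMerger, append, write, and close are the PyPDF2 calls already used in this tutorial.

```python
import os


def without_output(pdf_paths, output_path):
    """Return pdf_paths minus any entry that points at output_path."""
    target = os.path.abspath(output_path)
    return [p for p in pdf_paths if os.path.abspath(p) != target]


def merge_pdfs(pdf_paths, output_path):
    """Merge pdf_paths in order into output_path, then close the merger."""
    # Imported here so without_output can be used without PyPDF2 installed.
    from PyPDF2 import PdfMerger

    merger = PdfMerger()
    try:
        for path in without_output(pdf_paths, output_path):
            merger.append(path)
        merger.write(output_path)
    finally:
        # Release the files the merger opened, even if an error occurred.
        merger.close()
```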

Full Code Example

Here’s the complete code that you can copy and run:

import PyPDF2
import os

def merge_pdfs_in_folder(folder_path, output_filename):
    """Merge every PDF found in folder_path into output_filename."""
    merger = PyPDF2.PdfMerger()
    for file in os.listdir(folder_path):
        # Only pick up files whose name ends in .pdf
        if file.endswith('.pdf'):
            file_path = os.path.join(folder_path, file)
            merger.append(file_path)
            print(f"Adding: {file}")
    # Write all appended PDFs out as a single document
    merger.write(output_filename)
    print(f"Merged PDF saved as: {output_filename}")

# Specify the folder path containing PDFs
folder_path = r'C:\Users\Micheal\Desktop\testpdf'
merge_pdfs_in_folder(folder_path, 'merged_output.pdf')
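The r in front of the folder path marks it as a raw string, which matters for Windows paths because a backslash normally starts an escape sequence. The short illustration below uses a made-up path, C:\temp\new, to show the difference.

```python
# Hypothetical path used only to illustrate escape sequences.
plain = "C:\temp\new"   # \t becomes a tab and \n a newline
raw = r"C:\temp\new"    # backslashes are kept exactly as typed

print(len(plain))  # 9
print(len(raw))    # 11
print(raw)         # C:\temp\new
```

Without the r prefix, a path such as 'C:\Users\...' does not even compile, because Python tries to read \U as the start of a Unicode escape. Forward slashes also work in Windows paths and avoid the issue entirely.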

Running the Code

To run the code, follow these steps:

Save the code in a file with the .py extension, for example, merge_pdfs.py.
Open your terminal (or command prompt) and navigate to the folder where the script is saved.
Run the script by typing:

python merge_pdfs.py

Make sure to adjust the folder path in the script (folder_path) to point to the folder where your PDFs are located. Because the output name is a relative path, the merged file, merged_output.pdf, is written to the directory you run the script from.
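If you would rather not edit the script every time the folder changes, one option is to read the folder and output name from the command line. The sketch below uses the standard argparse module; the argument names folder and --output are choices made for this example, and the sample values passed to parse_args stand in for what a user would type.

```python
import argparse


def parse_args(argv=None):
    """Parse the folder to merge and an optional output file name."""
    parser = argparse.ArgumentParser(description="Merge every PDF in a folder.")
    parser.add_argument("folder", help="folder that contains the PDF files")
    parser.add_argument(
        "-o", "--output", default="merged_output.pdf",
        help="name of the merged file (default: %(default)s)",
    )
    return parser.parse_args(argv)


# Sample arguments; in a real script call parse_args() with no arguments
# so that the values come from the command line.
args = parse_args(["my_pdfs", "-o", "combined.pdf"])
print(args.folder, args.output)  # my_pdfs combined.pdf
```

With this in place, the last line of the script could become merge_pdfs_in_folder(args.folder, args.output), and you would run it as python merge_pdfs.py my_pdfs -o combined.pdf.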

Here is the flow chart of the code:

Figure: flow chart of the PDF-merging code

Get Access to the Script

To make it even easier for you to get started, we’ve uploaded the PDF merger script to our GitHub account. Here’s how you can find and use it:

Visit Our GitHub Repository

Go to GitHub: Open your web browser and visit our GitHub repository at https://github.com/michealtal/Pdf_Merger.git
Explore the Repository: You’ll see a list of files and folders in the repository. Look for the file named Index.py.

Result

Image of PDF Before Combining

Figure: the second single PDF before combining

After Running the Script

Once the script finishes running, you’ll find the merged PDF file in the location you specified. If you had multiple PDF files in the folder, they should now be combined into one single document.

Figure: result of merging the PDF files with Python

Conclusion

Merging PDFs using Python is a quick and efficient way to handle multiple documents at once. With just a few lines of code, you can merge any number of PDF files into one, making your life a bit easier. Be sure to check out the official documentation for Python and PyPDF2 for more advanced options and features.

If you like blog posts like this one, you’ll love our other posts, such as How to Automate Image Editing in Python: A Beginner’s Guide.
