Manipulating PDF documents in Linux

pdftk: the PDF Toolkit

Creating and reading PDF files in Linux is easy, but manipulating existing PDF files is a little trickier. The PDF Toolkit (pdftk) aims to be an all-in-one solution for that task.

Pdftk lets you manipulate PDF files without Acrobat, and it runs on Windows, Linux, Mac OS X, FreeBSD and Solaris. It is free software (GPL) and a command-line tool. It can join and split PDFs; pull single pages from a file; encrypt and decrypt PDF files; add, update, and export a PDF's metadata; export bookmarks to a text file; add attachments to a PDF or remove them; repair a damaged PDF; and fill out PDF forms.

Below are some of the commands for common tasks:

  • Joining files

    pdftk file1.pdf file2.pdf cat output newFile.pdf

    cat is short for concatenate -- that is, link together -- and output tells pdftk to write the combined PDFs to a new file.

  • Splitting files

    pdftk mydocument.pdf burst

    The burst option breaks a PDF into multiple files -- one file for each page.

    You can also remove specific pages from a file by listing only the pages you want to keep. For example, to remove pages 10-25 from a PDF file:

    pdftk myDocument.pdf cat 1-9 26-end output removedPages.pdf
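
    The page arithmetic in the command above can be scripted. The following is a minimal sketch, not a pdftk feature: keep_ranges is a hypothetical helper name, and it assumes the removed block ends before the last page of the document.

    ```shell
    #!/bin/sh
    # keep_ranges FIRST LAST  (hypothetical helper, not part of pdftk)
    # Prints the page ranges that remain after dropping pages FIRST..LAST.
    # Assumption: LAST is smaller than the number of pages in the document.
    keep_ranges() {
        first=$1
        last=$2
        if [ "$first" -lt 1 ] || [ "$last" -lt "$first" ]; then
            echo "keep_ranges: invalid range $first-$last" >&2
            return 1
        fi
        ranges=""
        # Pages before the removed block, if there are any.
        if [ "$first" -gt 1 ]; then
            ranges="1-$((first - 1))"
        fi
        # Pages after the removed block; "end" is pdftk's keyword for the last page.
        ranges="${ranges:+$ranges }$((last + 1))-end"
        echo "$ranges"
    }

    # Run pdftk only when it is installed and the input file exists.
    if command -v pdftk >/dev/null 2>&1 && [ -f myDocument.pdf ]; then
        # The substitution is unquoted on purpose: each range must be its own argument.
        pdftk myDocument.pdf cat $(keep_ranges 10 25) output removedPages.pdf
    fi
    ```

    Calling keep_ranges 10 25 prints 1-9 26-end, the same ranges used in the command above.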

  • Adding attachments

    Pdftk can attach binary and text files to a PDF with ease. You can even specify what page of the PDF you want the attachment to appear on.

    pdftk html_tidy.pdf attach_files command_ref.html to_page 24 output html_tidy_book.pdf

  • Filling out forms

    Pdftk's fill_form operation fills in a form using data from a separate file. That file is a Form Data Format (FDF) file containing the data to be merged into the form, and it is first created using pdftk's generate_fdf operation.

    The FDF file contains the name of each field in the PDF and the value you want to enter into that field. It also contains the file name of the PDF form. An FDF file looks something like this:

    %FDF-1.2
    1 0 obj
    << /FDF << /Fields [
    << /T (Name_field) /V (Fred Langan) >>
    << /T (Address_field) /V (1313 Mockingbird Lane) >>
    << /T (Age_field) /V (53) >>
    ] /F (info_form.pdf) >> >>
    endobj
    trailer
    << /Root 1 0 R >>
    %%EOF

    To fill out the form using an FDF file, use a command like this:

    pdftk survey_form.pdf fill_form survey_answers.fdf output filled_survey.pdf
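
    An FDF file with the structure described above can also be written from a script. The sketch below is one way to do it in plain shell. make_fdf is a hypothetical helper, not a pdftk feature, and it assumes that field names and values contain no parentheses or backslashes, which would need escaping in FDF.

    ```shell
    #!/bin/sh
    # make_fdf FORM_PDF NAME VALUE [NAME VALUE ...]  (hypothetical helper)
    # Writes a minimal FDF document to standard output.
    # Assumption: names and values contain no "(", ")" or "\" characters.
    make_fdf() {
        form=$1
        shift
        # Header and start of the field array ("%%" prints a literal "%").
        printf '%%FDF-1.2\n1 0 obj\n<< /FDF << /Fields [\n'
        # One /T (name) /V (value) dictionary per NAME VALUE pair.
        while [ "$#" -ge 2 ]; do
            printf '<< /T (%s) /V (%s) >>\n' "$1" "$2"
            shift 2
        done
        # Close the array, record the form's file name, and end the file.
        printf '] /F (%s) >> >>\nendobj\ntrailer\n<< /Root 1 0 R >>\n%%%%EOF\n' "$form"
    }

    # Run pdftk only when it is installed and the form exists.
    if command -v pdftk >/dev/null 2>&1 && [ -f survey_form.pdf ]; then
        make_fdf survey_form.pdf Name_field 'Fred Langan' Age_field 53 > survey_answers.fdf
        pdftk survey_form.pdf fill_form survey_answers.fdf output filled_survey.pdf
    fi
    ```

    The field names passed to make_fdf have to match the names in the form; the file that generate_fdf produces is a convenient place to look them up.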