Reverse PDF imposition

2024/10/9 6:25:55

I have an imposed document: 4 × n A4 pages printed on n A3 sheets. I feed the sheets through a roller image scanner and get a single PDF document with 2 × n A3 pages.

If, say, n = 3, then I've got the following sequence of A3 pages in my PDF:

  • page one: page 12 (on the left) and page 1 of the original document
  • page two: p.2 and p.11 of the original document
  • page three: p.10 and p.3
  • … and so on until…
  • page six: p.6 and p.7 of the original document

Question: how can I reconstruct the original sequence of pages in one PDF file of the A4 format? I.e. I want to do this:

--A3--         --A4--
[12| 1]         [1]
[ 2|11]         [2]
[10| 3]    ⇒    [3]…             … 
[ 6| 7]         [6][7]… [12]
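
To make the target order concrete, the layout in the diagram can be written down as a small sketch. It assumes the sheets are scanned front, back, front, back, exactly as in the n = 3 example; the function names scan_layout and reading_order are made up for illustration.

```python
def scan_layout(n):
    """Return the (left, right) original page numbers on each scanned A3 page.

    Assumes n sheets carrying 4 * n pages, scanned front then back,
    as in the n = 3 example above.
    """
    total = 4 * n
    layout = []
    for k in range(2 * n):            # k is the 0-based index of the A3 page
        if k % 2 == 0:                # front of a sheet: high page on the left
            layout.append((total - k, k + 1))
        else:                         # back of a sheet: low page on the left
            layout.append((k + 1, total - k))
    return layout


def reading_order(n):
    """For original pages 1..4n, return (A3 page index, side) in reading order."""
    where = {}
    for k, (left, right) in enumerate(scan_layout(n)):
        where[left] = (k, "left")
        where[right] = (k, "right")
    return [where[page] for page in range(1, 4 * n + 1)]


if __name__ == "__main__":
    for k, pair in enumerate(scan_layout(3), start=1):
        print("A3 page", k, "holds original pages", pair)
```

So the halves have to be taken alternately right, left, right, … going forward through the A3 pages, and then the remaining halves going backward.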

On Linux I usually use console utilities such as pdftk or pdftops for this kind of task, but I cannot figure out how to use them for this purpose.

Answer

After a while I found this thread and tuned the code a bit:

    import copy
    import sys
    import math
    import pyPdf  # legacy PDF library this script was written for


    def split_pages(src, dst):
        src_f = open(src, 'rb')
        dst_f = open(dst, 'wb')

        input_PDF = pyPdf.PdfFileReader(src_f)
        num_pages = input_PDF.getNumPages()

        first_half, second_half = [], []

        for i in range(num_pages):
            # p and q are two views of the same scanned page;
            # each gets its own media box covering one half.
            p = input_PDF.getPage(i)
            q = copy.copy(p)
            q.mediaBox = copy.copy(p.mediaBox)

            x1, x2 = p.mediaBox.lowerLeft
            x3, x4 = p.mediaBox.upperRight

            x1, x2 = math.floor(x1), math.floor(x2)
            x3, x4 = math.floor(x3), math.floor(x4)
            x5, x6 = math.floor(x3/2), math.floor(x4/2)

            if x3 > x4:
                # horizontal: p is the left half, q is the right half
                p.mediaBox.upperRight = (x5, x4)
                p.mediaBox.lowerLeft = (x1, x2)

                q.mediaBox.upperRight = (x3, x4)
                q.mediaBox.lowerLeft = (x5, x2)
            else:
                # vertical: p is the top half, q is the bottom half
                p.mediaBox.upperRight = (x3, x4)
                p.mediaBox.lowerLeft = (x1, x6)

                q.mediaBox.upperRight = (x3, x6)
                q.mediaBox.lowerLeft = (x1, x2)

            # The low-numbered page alternates sides from one scan to the next.
            if i % 2 == 1:
                first_half += [p]
                second_half += [q]
            else:
                first_half += [q]
                second_half += [p]

        # Pages 1..2n in scan order, then pages 2n+1..4n in reverse scan order.
        output = pyPdf.PdfFileWriter()
        for page in first_half + second_half[::-1]:
            output.addPage(page)

        output.write(dst_f)
        src_f.close()
        dst_f.close()


    if len(sys.argv) < 3:
        print("\nusage:\n$ python reverse_impose.py input.pdf output.pdf")
        sys.exit()

    input_file = sys.argv[1]
    output_file = sys.argv[2]

    split_pages(input_file, output_file)

See this gist.
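
For reference, here is a rough sketch of the same idea with the current pypdf package instead of pyPdf. It is not the code from the gist, just an adaptation under a few assumptions: a recent pypdf release in which PdfWriter.add_page returns its own copy of the page, and scans whose visible area is defined by the media box alone. The helper name half_boxes is made up for illustration; unlike the script above it splits at the true midpoint of the box rather than at half of the upper-right coordinate.

```python
def half_boxes(left, bottom, right, top):
    """Split a rectangle into two halves, each returned as (l, b, r, t).

    Landscape boxes are split into (left half, right half),
    portrait boxes into (top half, bottom half), as in the script above.
    """
    width, height = right - left, top - bottom
    if width > height:
        mid = left + width / 2
        return (left, bottom, mid, top), (mid, bottom, right, top)
    mid = bottom + height / 2
    return (left, mid, right, top), (left, bottom, right, mid)


def split_pages(src, dst):
    # Imported here so half_boxes() stays usable without pypdf installed.
    from pypdf import PdfReader, PdfWriter
    from pypdf.generic import RectangleObject

    reader = PdfReader(src)
    first_half, second_half = [], []
    for i, page in enumerate(reader.pages):
        box = page.mediabox
        a, b = half_boxes(float(box.left), float(box.bottom),
                          float(box.right), float(box.top))
        # Same alternation as above: the low-numbered page switches sides.
        if i % 2 == 1:
            first_half.append((i, a))
            second_half.append((i, b))
        else:
            first_half.append((i, b))
            second_half.append((i, a))

    writer = PdfWriter()
    for i, rect in first_half + second_half[::-1]:
        # Assumption: add_page returns the writer's own copy of the page,
        # so changing its media box leaves the source page untouched.
        new_page = writer.add_page(reader.pages[i])
        new_page.mediabox = RectangleObject(rect)

    with open(dst, "wb") as out:
        writer.write(out)


if __name__ == "__main__":
    import sys
    if len(sys.argv) < 3:
        sys.exit("usage: python reverse_impose.py input.pdf output.pdf")
    split_pages(sys.argv[1], sys.argv[2])
```

If a scan also carries an explicit crop box, that box would probably need the same adjustment; I have not tested that case.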
