There are many readily available tools (e.g. DjVu2PDF) for converting a book from .djvu to .pdf format, but none of those preserves the table of contents in the output PDF.
Having a table of contents is very handy. For example, when viewing a book in Preview, the table of contents works like a multi-level bookmark: you can simply click on any link in the sidebar to jump to the corresponding chapter or section of the book.
So I Googled and found this question on StackExchange that asked exactly my question. Here is a summary of the accepted answer on how you can preserve (or more precisely, create) the table of contents in a PDF converted from Djvu.
1. Preliminary
You will need to install pdftk (part of PDFtk Server) and djvused (part of DjVuLibre).
Note 1: pdftk for Mac OS X 10.11 and above. I found in this answer on Stack Overflow that the developer of PDFtk provides an installer for PDFtk Server on OS X 10.11 and above. Strangely, the official website only provides the installer for OS X up to 10.8. (That older version can be installed, but won’t run: any pdftk command you type in the Terminal hangs forever.)
Note 2: About djvused command line setup on OS X. After installing DjVuLibre, the directory that contains djvused has to be on your PATH before you can use djvused from the command line. If the PATH is not set up correctly for you, you can also add that directory manually with an export PATH line in ~/.bash_profile.
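Before moving on, it can help to confirm that both tools are actually reachable from the command line. The following is a minimal sketch, not part of the original answer: the function name find_tools is made up for illustration, and it only uses Python's standard shutil.which to look each program up on the PATH.

```python
import shutil


def find_tools(names=("djvused", "pdftk")):
    """Map each tool name to its full path, or to None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'NOT FOUND on PATH'}")
```

If either tool is reported as not found, fix the PATH first; the later steps all assume that typing djvused or pdftk in the Terminal works.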
2. Convert the Table of Contents
(Note: all material in this section closely follows the original answer on StackExchange, except that I wrote a very simple Python program in Step 2.)
Suppose you have now converted book.djvu into book.pdf; the former has a table of contents but the latter doesn’t.
Step 1. extract Djvu outline
Use the following command to extract the table of contents from book.djvu:
    djvused book.djvu -e 'print-outline' > bmarks.out
The output file bmarks.out lists the table of contents in a serialized tree format using S-expressions (SEXPR), which can be summarized as:
    (bookmarks
     ("title" "#page"
      ("child title" "#page"
       ...)
      ...)
     ...)
Notice that under this format, you can nest a “child bookmark” inside a “parent bookmark”, so in bmarks.out the sections of a chapter sit inside the parentheses of that chapter.
Step 2. translate the Djvu outline to PDF metadata format
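Now, the Djvu and PDF tools store the bookmark data in different formats. While djvused uses SEXPR, pdftk uses a plain-text metadata format, in which each bookmark is a record that looks like this:
    BookmarkBegin
    BookmarkTitle: <title>
    BookmarkLevel: <depth in the tree, 1 for the top level>
    BookmarkPageNumber: <page number>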
Each nesting level in the SEXPR outline becomes a BookmarkLevel in the PDF metadata: a top-level bookmark has level 1, its children have level 2, and so on, with the bookmarks listed in the same order as in bmarks.out.
It is a fun exercise to work out the correspondence of the two formats.
Note: I have written a Python program to automatically convert the Djvu SEXPR bmarks.out into the PDF metadata form and output it as bmarks2.txt.
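The author's program is not reproduced here. As a stand-in, here is a minimal sketch of such a converter, written from scratch. Everything in it is an assumption rather than the original code: the function names (unquote, outline_to_pdftk), the choice to track nesting depth by counting parentheses, the assumption that every bookmark target has the form "#<page number>", and the four-line record format starting with BookmarkBegin that pdftk 2.x prints in its dump_data output.

```python
import re
import sys

# A token is a parenthesis, a double-quoted string (with backslash escapes), or a bare symbol.
TOKEN = re.compile(r'\(|\)|"(?:\\.|[^"\\])*"|[^\s()"]+')
OCTAL = re.compile(r"[0-7]{1,3}")
SIMPLE_ESCAPES = {"n": "\n", "t": "\t"}


def unquote(token):
    """Strip the quotes from a string token and decode its backslash escapes."""
    body = token[1:-1]
    raw = bytearray()
    i = 0
    while i < len(body):
        if body[i] == "\\" and i + 1 < len(body):
            octal = OCTAL.match(body, i + 1)
            if octal:  # e.g. \303\251: one byte of a UTF-8 sequence per escape
                raw.append(int(octal.group(), 8) & 0xFF)
                i = octal.end()
            else:  # e.g. \" or \\
                raw += SIMPLE_ESCAPES.get(body[i + 1], body[i + 1]).encode("utf-8")
                i += 2
        else:
            raw += body[i].encode("utf-8")
            i += 1
    return raw.decode("utf-8", errors="replace")


def outline_to_pdftk(sexpr_text):
    """Convert djvused 'print-outline' text into pdftk bookmark records."""
    tokens = TOKEN.findall(sexpr_text)
    lines = []
    depth = 0  # 1 inside "(bookmarks", 2 inside a top-level bookmark, ...
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "(":
            depth += 1
            is_bookmark = (
                i + 2 < len(tokens)
                and tokens[i + 1].startswith('"')
                and tokens[i + 2].startswith('"')
            )
            if is_bookmark:
                title = unquote(tokens[i + 1])
                target = unquote(tokens[i + 2])
                page = re.fullmatch(r"#(\d+)", target)
                if page is None:  # assumption: only "#<number>" targets are handled
                    raise ValueError(f"unsupported bookmark target: {target!r}")
                lines += [
                    "BookmarkBegin",
                    f"BookmarkTitle: {title}",
                    f"BookmarkLevel: {depth - 1}",
                    f"BookmarkPageNumber: {page.group(1)}",
                ]
                i += 3
                continue
        elif tok == ")":
            depth -= 1
        i += 1
    return "\n".join(lines) + "\n" if lines else ""


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python convert.py bmarks.out bmarks2.txt
    with open(sys.argv[1], encoding="utf-8") as src:
        converted = outline_to_pdftk(src.read())
    with open(sys.argv[2], "w", encoding="utf-8") as dst:
        dst.write(converted)
```

Saved under a name of your choice (convert.py above is just a placeholder), it would be run with bmarks.out as the first argument and bmarks2.txt as the second. Titles with non-ASCII characters are written as raw UTF-8, which pdftk may only accept through its update_info_utf8 operation instead of update_info; treat that as something to check against your pdftk version.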
Step 3. modify PDF metadata to include the bookmark data
Extract PDF metadata with this command:
    pdftk book.pdf dump_data > pdfmetadata.out
Open the pdfmetadata.out file, find the line that begins with NumberOfPages:, and insert your list of bookmarks (the contents of bmarks2.txt) after this line. Save the new file as pdfmetadata.in. Now run this command:
    pdftk book.pdf update_info pdfmetadata.in output newbook.pdf
The output newbook.pdf is your new book.pdf equipped with a convenient table of contents. Happy reading!
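If you would rather not edit the metadata by hand in Step 3, the insertion after the NumberOfPages: line can be scripted. This is a minimal sketch under my own assumptions, not part of the original answer: the function name splice_bookmarks is hypothetical, and the file names are simply the ones used in the steps above.

```python
import sys


def splice_bookmarks(metadata_text, bookmarks_text):
    """Return metadata_text with bookmarks_text inserted after the NumberOfPages line."""
    if not bookmarks_text.endswith("\n"):
        bookmarks_text += "\n"
    out = []
    inserted = False
    for line in metadata_text.splitlines(keepends=True):
        out.append(line)
        if not inserted and line.startswith("NumberOfPages:"):
            if not line.endswith("\n"):  # NumberOfPages was the last line, without a newline
                out.append("\n")
            out.append(bookmarks_text)
            inserted = True
    if not inserted:
        raise ValueError("no line starting with 'NumberOfPages:' was found")
    return "".join(out)


if __name__ == "__main__" and len(sys.argv) == 4:
    # Usage: python splice.py pdfmetadata.out bmarks2.txt pdfmetadata.in
    with open(sys.argv[1], encoding="utf-8") as f:
        metadata = f.read()
    with open(sys.argv[2], encoding="utf-8") as f:
        bookmarks = f.read()
    with open(sys.argv[3], "w", encoding="utf-8") as f:
        f.write(splice_bookmarks(metadata, bookmarks))
```

The script only produces pdfmetadata.in; the pdftk update_info step is still run separately. It raises an error instead of guessing when no NumberOfPages: line is present, since writing the bookmarks at an arbitrary position could leave you with a metadata file that pdftk silently ignores.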