There are many readily available programs (e.g. DjVu2PDF) for converting a book from
Having a table of contents is very handy. For example, when viewing a book in Preview, the table of contents works like a multi-level bookmark: you can simply click on any link in the sidebar to jump to that chapter or section of the book.
So I Googled and found this question on StackExchange that asked exactly the same thing. Here is a summary of the accepted answer on how you can preserve (or more precisely, create) the table of contents in a PDF converted from DjVu.
Note 1: pdftk for Mac OS X 10.11 and above. I found in this answer on Stack Overflow that the developer of PDFtk provides an installer for PDFtk Server on OS X 10.11 and above. Strangely, the official website only provides the installer for OS X up to 10.8. (This older version can be installed, but won’t run: when you type pdftk commands in the Terminal, they hang indefinitely.)
Note 2: About djvused command line setup on OS X. After installing DjVuLibre, to use djvused from the command line, you need to run
If this doesn’t add the correct path, you can also manually add the following line into
2. Convert the Table of Contents
(Note: everything in this section closely follows the original answer on StackExchange, except that I wrote a very simple Python program for Step 2.)
Suppose now you have converted
book.pdf, the former has a table of contents but the latter doesn’t.
Step 1. Extract the DjVu outline
Use the following command to extract the table of contents from
The output file
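bmarks.out lists the table of contents in a serialized tree format using SEXPR (S-expressions): a top-level (bookmarks ...) list in which each entry is a parenthesized group holding a quoted title, a quoted page reference such as "#12", and then any child entries.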
Notice that under this format, you can nest a “child bookmark” inside a “parent bookmark”. For example, a
bmarks.out may look like this:
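As a rough sketch of Step 1, the snippet below wraps the djvused print-outline command in a small function and then parses a made-up outline into nested Python lists, so the parent/child structure is easy to see. The sample titles and page numbers are invented for illustration (they are not the example from the original post), the file name book.djvu is an assumption, and extract_outline and parse_outline are hypothetical helper names.

```python
import re
import subprocess


def extract_outline(djvu_path, out_path="bmarks.out"):
    """Run `djvused <file> -e print-outline` and save the result to out_path."""
    result = subprocess.run(
        ["djvused", djvu_path, "-e", "print-outline"],
        check=True, capture_output=True, text=True,
    )
    with open(out_path, "w") as handle:
        handle.write(result.stdout)
    return result.stdout


# Invented outline in the shape djvused prints: title, "#page", then children.
SAMPLE = '''(bookmarks
 ("Preface" "#5")
 ("Chapter 1" "#9"
  ("Section 1.1" "#10")
  ("Section 1.2" "#14"))
 ("Chapter 2" "#21"))
'''

# Tokens: "(", ")", a double-quoted string, or a bare word such as bookmarks.
TOKEN = re.compile(r'\(|\)|"(?:\\.|[^"\\])*"|[^\s()"]+')


def parse_outline(text):
    """Turn the S-expression text into nested lists of strings."""
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while tokens[pos] != ")":
                items.append(read())
            pos += 1  # consume the closing ")"
            return items
        if tok.startswith('"'):
            return tok[1:-1]  # drop the surrounding quotes
        return tok

    return read()


tree = parse_outline(SAMPLE)
print(tree[2])
# ['Chapter 1', '#9', ['Section 1.1', '#10'], ['Section 1.2', '#14']]

# Usage on a real file (assumed name): extract_outline("book.djvu")
```

Each inner list starts with a title and a page reference; any further elements are child bookmarks, which is exactly the nesting described above. The parser assumes well-formed input and does not decode escape sequences inside titles.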
Step 2. Translate the DjVu outline to PDF metadata format
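Now, DjVu and PDF store the bookmark data in different formats. While DjVu uses SEXPR, pdftk represents PDF bookmarks as plain-text metadata: each bookmark is a BookmarkBegin line followed by BookmarkTitle, BookmarkLevel and BookmarkPageNumber lines.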
The example in Step 1 when translated into PDF metadata will look like
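To make the correspondence concrete, here is a minimal sketch that writes a few bookmarks in the four-line record layout used by pdftk. The titles, levels and page numbers are invented for illustration (they are not the example from the original post), and pdftk_record is a hypothetical helper name. A top-level entry gets level 1, its children level 2, and so on.

```python
def pdftk_record(title, level, page):
    """Format one bookmark as a pdftk metadata record."""
    return (
        "BookmarkBegin\n"
        f"BookmarkTitle: {title}\n"
        f"BookmarkLevel: {level}\n"
        f"BookmarkPageNumber: {page}\n"
    )


# (title, nesting level, page number) for an invented outline.
entries = [
    ("Preface", 1, 5),
    ("Chapter 1", 1, 9),
    ("Section 1.1", 2, 10),
    ("Section 1.2", 2, 14),
    ("Chapter 2", 1, 21),
]

metadata = "".join(pdftk_record(t, lvl, pg) for t, lvl, pg in entries)
print(metadata)
```

Note that nesting in the DjVu outline is expressed by parentheses, while here it is flattened into the BookmarkLevel number and the order of the records.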
It is a fun exercise to work out the correspondence of the two formats.
Note: I have written a Python program to automatically convert the DjVu SEXPR
bmarks.out into the PDF metadata form and output as
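The listing of that program did not survive here, so what follows is not the original code but a minimal sketch of one way such a conversion could work. It walks the tokens of the outline, tracks the parenthesis depth, and emits a pdftk record each time a title is followed by a page reference. It assumes page references of the form "#<number>" (entries pointing elsewhere are skipped) and only undoes the \" and \\ escapes in titles. The function name sexpr_to_pdftk is hypothetical.

```python
import re

# Tokens: "(", ")", a double-quoted string, or a bare word such as bookmarks.
TOKEN = re.compile(r'\(|\)|"(?:\\.|[^"\\])*"|[^\s()"]+')


def unquote(token):
    """Strip the surrounding quotes and undo \\" and \\\\ escapes only."""
    return re.sub(r'\\(["\\])', r"\1", token[1:-1])


def sexpr_to_pdftk(outline_text):
    """Convert djvused print-outline text into pdftk bookmark records."""
    lines = []
    depth = 0             # current parenthesis depth; (bookmarks ...) is depth 1
    pending_title = None  # title seen, waiting for its page reference
    for tok in TOKEN.findall(outline_text):
        if tok == "(":
            depth += 1
            pending_title = None
        elif tok == ")":
            depth -= 1
            pending_title = None
        elif tok.startswith('"'):
            text = unquote(tok)
            if pending_title is None:
                pending_title = text
                continue
            page = text.lstrip("#")
            if page.isdigit():  # skip references that are not page numbers
                lines += [
                    "BookmarkBegin",
                    f"BookmarkTitle: {pending_title}",
                    f"BookmarkLevel: {depth - 1}",
                    f"BookmarkPageNumber: {int(page)}",
                ]
            pending_title = None
    return "\n".join(lines) + "\n" if lines else ""


demo = '(bookmarks ("Chapter 1" "#9" ("Section 1.1" "#10")) ("Chapter 2" "#21"))'
print(sexpr_to_pdftk(demo))

# Usage on a real file:
#   with open("bmarks.out") as f:
#       records = sexpr_to_pdftk(f.read())
```

Because the level is derived from the depth, no tree needs to be built: an entry directly under (bookmarks ...) sits at depth 2 and becomes level 1, its children become level 2, and so on.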
Step 3. Modify the PDF metadata to include the bookmark data
Extract PDF metadata with this command:
In the pdfmetadata.out file, find the line that begins with
NumberOfPages:, and insert your list of bookmarks after this line. Save the new file as
pdfmetadata.in. Now run this command:
newbook.pdf is your new
book.pdf equipped with a convenient table of contents. Happy reading!
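If you would rather not edit the metadata file by hand, Step 3 can be sketched in Python as below. The function insert_bookmarks is a hypothetical helper that places the bookmark records right after the NumberOfPages: line. The two command lists use pdftk's dump_data and update_info operations with the file names from this post. Nothing is executed at import time, so treat it as a starting point rather than a finished tool.

```python
# pdftk calls for this step, as argument lists for subprocess.run().
DUMP_CMD = ["pdftk", "book.pdf", "dump_data", "output", "pdfmetadata.out"]
UPDATE_CMD = [
    "pdftk", "book.pdf", "update_info", "pdfmetadata.in",
    "output", "newbook.pdf",
]


def insert_bookmarks(metadata_text, bookmark_text):
    """Return metadata_text with bookmark_text added after NumberOfPages."""
    if not bookmark_text.endswith("\n"):
        bookmark_text += "\n"
    out = []
    inserted = False
    for line in metadata_text.splitlines(keepends=True):
        if not line.endswith("\n"):
            line += "\n"  # make sure the last line is terminated
        out.append(line)
        if not inserted and line.startswith("NumberOfPages:"):
            out.append(bookmark_text)
            inserted = True
    if not inserted:
        raise ValueError("no NumberOfPages: line found in the metadata")
    return "".join(out)


# Usage, assuming the bookmark records are already in a string `records`:
#   import subprocess
#   subprocess.run(DUMP_CMD, check=True)
#   with open("pdfmetadata.out") as f:
#       merged = insert_bookmarks(f.read(), records)
#   with open("pdfmetadata.in", "w") as f:
#       f.write(merged)
#   subprocess.run(UPDATE_CMD, check=True)
```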