attribute: Phillie Casablanca

Creating PDFs containing Japanese from Sphinx's *.tex output



I've been struggling to get Sphinx (http://sphinx.pocoo.org/index.html) output working properly with Japanese text. (I assume this approach will also work with Chinese and Korean, but I haven't tested it, and the encoding step may not work as-is or may need to be changed.)

I've finally got a solution that works for me.

Previously I had been using MiKTeX to convert Sphinx's *.tex output to PDF, but I hit a wall when trying to convert Japanese characters.

LaTeX seems to be a mess when dealing with non-ASCII characters.

Using w32tex (http://www.fsci.fuk.kindai.ac.jp/kakuto/win32-ptex/web2c75.html) I was able to successfully convert the *.tex file Sphinx creates to PDF.

The method I used is described below.

Installation Steps
====================

1. Create C:\w32tex directory

2. Create C:\w32tex\archivedpackages directory

3. Download "texinst757.zip" from http://www.fsci.fuk.kindai.ac.jp/kakuto/win32-ptex/web2c75.html

4. Unzip "texinst757.zip" into C:\w32tex

5. Download ALL files from the "最小インストール" (Minimal Install) and "標準インストール" (Standard Install) lists and place them in C:\w32tex\archivedpackages

6. In addition, download the following packages from the "フルインストール" (Full Install) list and place them in the same directory:


ums.tar.gz
omegaj-w32.tar.gz
ttf2pt1-w32.tar.bz2
utf.tar.gz -- not sure if it's needed
uptex-w32.tar.bz2 -- not sure if it's needed

7. Download and install Ghostscript from ftp://akagi.ms.u-tokyo.ac.jp/pub/TeX/win32-gs/gs863w32full-gpl.zip
A. Check the "Use Windows TrueType fonts for Chinese, Japanese and Korean" option and install into C:\w32tex\gs

8. Download and install the IPA fonts from http://lx1.avasys.jp/OpenPrintingProject/openprinting-jp-0.1.3.tar.gz
A. Drag and drop the *.ttf files found in the archive into the Windows/Fonts directory.
- This installs the fonts on the PC.

9. Run the following command:

	
texinst757.exe C:\w32tex\archivedpackages

NOTE: This will take some time

10. Add the following to your system path:

	
C:\w32tex\bin
C:\w32tex\gs\gs8.63\bin
C:\w32tex\gs\gs8.63\lib


11. Download ftp://cam.ctan.org/tex-archive/macros/latex/contrib/titlesec.zip, unzip it, and copy the titlesec folder to C:\w32tex\share\texmf\latex\

12. Run the following command. (You may need to restart your console for the new path settings to take effect.)
- This command rebuilds TeX's filename database so that the files added in step 11 can be found.


texhash

Installation is now done!
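
To confirm that the path settings from step 10 took effect, it can help to check from a fresh console that the tools used in this post resolve on the PATH. The following is only a minimal sketch, assuming a Python 3 interpreter is available; `find_tools` and `report` are hypothetical helper names, not part of w32tex, and the tool list simply repeats the commands mentioned in this post.

```python
import shutil

# Command names taken from this post; adjust if your install differs.
TOOLS = ["texhash", "topdftex", "pdflatex"]

def find_tools(names):
    """Map each tool name to its resolved location, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}

def report(found):
    """Format the lookup result as one line per tool."""
    lines = []
    for name, location in sorted(found.items()):
        lines.append("%-10s %s" % (name, location or "NOT FOUND (check your PATH)"))
    return "\n".join(lines)

print(report(find_tools(TOOLS)))
```

If any line says NOT FOUND, recheck the entries added in step 10 and reopen the console before trying again.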

Converting Sphinx *.tex output to PDF
========================================

Now that you have w32tex installed, you still need to adjust a few parts of your build process in order to create a PDF properly.
Note: I'm using Sphinx 0.6.1.

1. In your conf.py add/uncomment the latex_elements and update to look like this:


latex_elements = {
'preamble': r'\usepackage{ums}\input jpdftextounicode\pdfgentounicode=1',  # raw string keeps the backslashes literal
'inputenc': '',
'fontenc': '',
'fontpkg': '',
}

2. Build your *.tex file using Sphinx.
- My source files are UTF-8, and I believe Sphinx writes the *.tex file as UTF-8 too.

3. Convert the output text to cp932 encoding
- Ideally topdftex.exe/pdflatex.exe would read UTF-8 directly, but I haven't figured out how, so this is currently a workaround step.


import codecs

# Read Sphinx's UTF-8 output and write a cp932-encoded copy next to it.
utf8_f = codecs.open('sphinx_output.tex', 'rb', 'utf8')
cp932_f = codecs.open('sphinx_output.cp932.tex', 'wb', 'cp932')
cp932_f.write(utf8_f.read())
utf8_f.close()
cp932_f.close()

4. Run topdftex.exe on the converted output, writing the result back to the original *.tex name.


topdftex.exe sphinx_output.cp932.tex sphinx_output.tex

5. Run pdflatex.exe on the final output, sphinx_output.tex.
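
Steps 3 to 5 can be chained in one small script. This is a rough sketch rather than a tested build tool: it assumes Python 3 (so it uses the built-in `open` with an `encoding` argument instead of `codecs.open`), it assumes topdftex.exe and pdflatex.exe are on the PATH, and the helper names (`unencodable_chars`, `converted_name`, `reencode_tex`, `build_commands`, `tex_to_pdf`) are made up for illustration. It also checks up front for characters that cp932 cannot represent, since those would otherwise make step 3 fail with a UnicodeEncodeError.

```python
import os
import subprocess

def unencodable_chars(text, encoding="cp932"):
    """Return the distinct characters in text that the target encoding lacks."""
    bad = set()
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            bad.add(ch)
    return sorted(bad)

def converted_name(tex_name):
    """sphinx_output.tex -> sphinx_output.cp932.tex"""
    base, ext = os.path.splitext(tex_name)
    return base + ".cp932" + ext

def reencode_tex(src, dst, src_encoding="utf-8", dst_encoding="cp932"):
    """Step 3: write a copy of src in dst_encoding, leaving line endings alone."""
    with open(src, "r", encoding=src_encoding, newline="") as fin:
        text = fin.read()
    bad = unencodable_chars(text, dst_encoding)
    if bad:
        raise ValueError("not representable in %s: %r" % (dst_encoding, bad))
    with open(dst, "w", encoding=dst_encoding, newline="") as fout:
        fout.write(text)

def build_commands(tex_name):
    """Steps 4 and 5 as argument lists, in the order they must run."""
    return [
        ["topdftex.exe", converted_name(tex_name), tex_name],
        ["pdflatex.exe", tex_name],
    ]

def tex_to_pdf(tex_name="sphinx_output.tex"):
    """Run steps 3 to 5 in the current directory (hypothetical helper)."""
    reencode_tex(tex_name, converted_name(tex_name))
    for command in build_commands(tex_name):
        subprocess.check_call(command)  # stops at the first failing step
```

Calling `tex_to_pdf()` from the directory that holds sphinx_output.tex would run the three steps in order. As in step 4, the original UTF-8 file is overwritten, so rebuild with Sphinx before running it a second time.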

I hope this helps.

monkut // April 22, 2009 // 3:46 a.m.