How to create Japanese language documents under GNU/Linux using LaTeX

Mark Alford. Last updated: 2008-May-19.

If you just want to create a shift-JIS-encoded Japanese text file then skip to the section on Japanese input in emacs below.

Chinese, Japanese, and Korean language support is available in LaTeX via the CJK package. These instructions work for the TeX Live distribution, which is part of major Linux distributions such as Fedora (9 and later), Ubuntu, etc. For instructions appropriate to the older teTeX distribution, see the previous version of this document.

These are the steps that I followed to get Japanese working. They work for CJK-4.7.0 (2006-10-17 version) under Fedora 9.

I did everything as root, because I was adding CJK as a system-wide component, although I put it in a separate tree, /usr/local/share/texmf/, so that it is not mixed up with the main TeX installation and does not get clobbered when TeX is upgraded. But if the sysadmin has set up the TeX search paths correctly, you should be able to install CJK for your personal use by putting everything in $HOME/texmf/ and $HOME/bin/, without being root.
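
For a non-root install, it helps to confirm which personal tree kpathsea actually searches before copying anything there. The following is a minimal sketch, assuming a TeX Live system where kpsewhich can report the TEXMFHOME variable. The function name personal_texmf is my own, and the fallback to $HOME/texmf simply mirrors the path suggested above.

```shell
# Print the personal texmf tree that kpathsea searches.
# Falls back to $HOME/texmf when kpsewhich is missing or reports nothing
# (the fallback is an assumption taken from the text, not queried from TeX).
personal_texmf() {
    tree=""
    if command -v kpsewhich >/dev/null 2>&1; then
        tree=$(kpsewhich -var-value=TEXMFHOME 2>/dev/null) || tree=""
    fi
    printf '%s\n' "${tree:-$HOME/texmf}"
}

personal_texmf
```

If the directory printed is not $HOME/texmf, use that directory in place of /usr/local/share/texmf in the steps below.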

  1. Add Japanese language capability ("CJK") to LaTeX.

    1. At the UK TeX archive you can find CJK and CJK fonts. The latest version of CJK is also supposed to be available at this URL. The only font file you need for Japanese is the kanji one, which should be called something like "kanji48.tar.gz" and be found at this URL. In these instructions I will assume that CJK has been downloaded to ~/cjk-current/, and the kanji font to ~/CJK_fonts/kanji48.tar.gz.
    2. Copy the CJK TeX input files to a place where TeX will find them:
      > cp -r ~/cjk-current/texinput /usr/local/share/texmf/tex/latex/CJK
      
    3. Install hbf2gf in /usr/local/bin. Under Fedora 9, you will need to have the kpathsea and kpathsea-devel packages installed.
      > cd ~/cjk-current/utils/hbf2gf
      > ./configure --prefix=/usr/local/ \
                    --with-kpathsea-include=/usr/include \
                    --with-kpathsea-lib=/usr/lib
      > make
      > make install
      
      The --with-kpathsea-include directory is wherever kpathsea/kpathsea.h (note: not kpathsea.h) lives. The --with-kpathsea-lib directory is wherever libkpathsea.a lives. (Use the locate command, e.g. "locate libkpathsea.a".) Make sure that the hbf2gf command is now accessible ("which hbf2gf").
    4. Install Japanese fonts in /usr/local/share/texmf/fonts/:
      > cd /usr/local/share/
      > cp ~/CJK_fonts/kanji48.tar.gz .
      > tar -zxf kanji48.tar.gz
      > rm kanji48.tar.gz
      
      Since kanji48.tar.gz contains the files in the proper directory structure, it is sufficient to unpack it in a directory that contains the target texmf directory.
    5. Tell TeX where to find .hbf files, by modifying /usr/share/texmf/web2c/texmf.cnf:
      ----
      MISCFONTS = .;$TEXMF/fonts/misc//;$TEXMF/fonts/hbf//
      ----
      
      For non-root installation it may be possible to set this as an environment variable (see ~/cjk-current/doc/INSTALL and man hbf2gf).
      Important point: Here you have modified a system config file. If you later upgrade your TeX package or your operating system, this change may be overwritten, and you will have to go back and do it again.
      Debugging: You can obtain debugging output from kpathsea by typing "setenv KPATHSEA_DEBUG -1" (csh/tcsh; in sh or bash use "export KPATHSEA_DEBUG=-1") before running latex or xdvi or dvips (you will want to redirect the output to a file). Look for the search that failed: search the output for "Couldn't find" or "failed", and then backtrack to see whether it was looking in the right places. A useful command for checking the variable settings in your texmf.cnf is kpsewhich. For example, to see if MISCFONTS is set right:
      > kpsewhich -progname=hbf2gf -expand-var='$MISCFONTS'
      
    6. Update the TeX filename database:
      > mktexlsr  # (same as "texhash")
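
Before trying a document, the results of the steps above can be checked in one go. This is only a sketch under assumptions: the function names path_has_hbf and check_cjk_install are mine, and the checks use nothing beyond "command -v" and the kpsewhich invocation already shown in step 5.

```shell
# Succeed if a kpathsea search path string mentions the hbf font directory.
path_has_hbf() {
    case "$1" in
        *fonts/hbf*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Print one "ok:" or "missing:" line for each piece installed above.
check_cjk_install() {
    if command -v hbf2gf >/dev/null 2>&1; then
        echo "ok: hbf2gf is on PATH"
    else
        echo "missing: hbf2gf is not on PATH"
    fi

    if ! command -v kpsewhich >/dev/null 2>&1; then
        echo "missing: kpsewhich is not on PATH, cannot check TeX search paths"
        return 0
    fi

    # kpsewhich prints the full path of CJK.sty if TeX can find it.
    if sty=$(kpsewhich CJK.sty) && [ -n "$sty" ]; then
        echo "ok: CJK.sty found at $sty"
    else
        echo "missing: CJK.sty not found (was mktexlsr run?)"
    fi

    # Same query as in step 5, captured so it can be inspected.
    miscfonts=$(kpsewhich -progname=hbf2gf -expand-var='$MISCFONTS') || miscfonts=""
    if path_has_hbf "$miscfonts"; then
        echo "ok: MISCFONTS includes fonts/hbf"
    else
        echo "missing: MISCFONTS does not include fonts/hbf"
    fi
}

check_cjk_install
```

Any "missing:" line points back to the step that needs repeating.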
      

    Now see if it works. Unfortunately, the example files given in recent versions of CJK do not work! So try my example (japanese_template.cjk):

    > latex japanese_template.cjk
    

    You should see output indicating that it found the CJK files:

    (/usr/local/share/texmf/tex/latex/CJK/CJK.sty
    (/usr/local/share/texmf/tex/latex/CJK/mule/MULEenc.sty)
    (/usr/local/share/texmf/tex/latex/CJK/CJK.enc))
    (/usr/share/texmf/tex/latex/base/article.cls
    Document Class: article 2001/04/21 v1.4e Standard LaTeX document class
    (/usr/share/texmf/tex/latex/base/size12.clo))
    (/usr/local/share/texmf/tex/latex/CJK/ruby.sty) (./japanese_template.aux)
    (/usr/local/share/texmf/tex/latex/CJK/standard.bdg)
    (/usr/local/share/texmf/tex/latex/CJK/standard.enc)
    (/usr/local/share/texmf/tex/latex/CJK/standard.chr)
    (/usr/local/share/texmf/tex/latex/CJK/JIS/c40song.fd) [1]
    (./japanese_template.aux) )
    Output written on japanese_template.dvi (1 page, 632 bytes).
    Transcript written on japanese_template.log.
    

    Now look at the results:

    > xdvi japanese_template
    

    You should see Japanese!
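
If you do not see Japanese, the kpathsea debugging described in step 5 is the place to start. The sketch below wraps it for sh or bash. The function names kpathsea_trace and failed_searches are hypothetical helpers of mine, and the sketch relies on kpathsea writing its debugging output to standard error.

```shell
# Run a command with kpathsea debugging switched on, saving whatever it
# writes to standard error (including the debug output) in a log file.
# Usage: kpathsea_trace LOGFILE COMMAND [ARGS...]
kpathsea_trace() {
    log=$1
    shift
    env KPATHSEA_DEBUG=-1 "$@" 2>"$log"
}

# Print, with line numbers, the log lines that report a failed search.
# The two patterns are the ones suggested in step 5.
failed_searches() {
    grep -n -e "Couldn't find" -e "failed" "$1"
}
```

For example, run "kpathsea_trace kpathsea.log latex japanese_template.cjk" and then "failed_searches kpathsea.log", and read backwards in the log from each reported line to see which directories were searched.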

  2. Use emacs (mule) to produce LaTeX files that include Japanese.

    If you just want to create shift-JIS-encoded Japanese text files, you can skip steps 2 and 3 below, and ignore all mentions of ".cjk" files. If you want to do Japanese LaTeX then you will keep each file in two forms. There will be file.tex, which is encoded however you like (I will use shift-JIS encoding, but emacs offers many others), and which you edit and work on. And there is file.cjk, which is JIS-encoded, and which can be LaTeXed.

    1. Install the necessary emacs lisp macros in /usr/local/share/emacs/site-lisp (create this dir if necessary).
      > cd ~/cjk-current/utils/lisp
      > cp cjktilde.el /usr/local/share/emacs/site-lisp/
      > cp emacs-20.3/cjk-enc.el /usr/local/share/emacs/site-lisp/
      
    2. Tell your emacs sessions where to find them. Put the following lines into your ~/.emacs file:
      ----
      (setq load-path (cons "/usr/local/share/emacs/site-lisp" load-path))
      (load-library "cjk-enc")
      (global-set-key "\C-c\C-w" 'cjk-write-file) ; CTRL C CTRL W writes CJK file
      ----
      
    3. You can use this template latex file (annotated version). Do not try to use the templates in the CJK examples directory, since they do not work.
      ----
      %-*- coding: japanese-shift-jis; current-input-method: japanese -*-
      \documentclass[12pt]{article}
      \usepackage[overlap, CJK]{ruby}
      \CJKencfamily{JIS}{song}
      \renewcommand{\rubysep}{-0.3ex}
      \begin{document}
      This is in English.
      これは 日本語 です。
      \end{document}
      ----
      
      Note that we specify the coding and input method in a TeX comment on the first line. (Alternatively, this can be done by explicit commands to emacs.) Possible codings include emacs-mule, japanese-euc, japanese-shift-jis, etc. When you load the file into emacs, it will automatically detect the coding and display the characters correctly.

      Shift-JIS is also recognized by most web browsers and many email programs, so you can use this method to create Japanese language web pages or send Japanese email. For those applications you may need to delete the initial "%-*- ..." line. The file will still be SJIS-encoded, but without that line you will need to use explicit emacs commands to set the coding and input method when editing it.
    4. To go to Japanese input mode, you just type CTRL-\ twice (ignore the "value is nil" message). After that, typing CTRL-\ will toggle back and forth between English and Japanese input methods.
    5. How to enter Japanese text.

      Once you have set the input method to Japanese, you can type in romaji, and it will come out as hiragana, underlined. The underlined bit is the "conversion region", which is still malleable, i.e. not yet fixed. To fix it, and start a new conversion region, press return. To convert it to katakana, type "K" (uppercase). To toggle among a range of kanji for the conversion region, keep pressing space. (To go back and forth in the kanji list use CTRL-P and CTRL-N.) When you get the one you want, press Return to fix it. To get a Japanese "n" type "n" then some non-vowel like "q" (after that you can type "K" for Katakana) then Return to fix it. (For Hiragana "n", the non-vowel can be "n" itself.)
    6. To save the file in its shift-JIS-encoded form, just do the usual CTRL-X CTRL-S. If you are creating a Japanese LaTeX file, save it instead using CTRL-C CTRL-W (see the binding in your .emacs file above). This invokes 'cjk-write-file, which saves file.tex and also creates the JIS-encoded version file.cjk, which can be LaTeXed.
    7. To latex the file,
      > latex file.cjk   # [NOT file.tex!]
      
      Then you can use xdvi, dvips, etc to view and print it.
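
Since running latex on file.tex instead of file.cjk is an easy slip, a small wrapper can pick the right file for you. This is a sketch only: cjk_name and latex_cjk are names I made up, and the wrapper assumes the .cjk file sits next to the .tex file, as produced by CTRL-C CTRL-W.

```shell
# Map any of "file", "file.tex" or "file.cjk" to "file.cjk".
cjk_name() {
    base=${1%.tex}
    base=${base%.cjk}
    printf '%s.cjk\n' "$base"
}

# Run latex on the .cjk form of the given file, refusing to continue
# if that form has not been written yet.
latex_cjk() {
    target=$(cjk_name "$1")
    if [ ! -f "$target" ]; then
        echo "$target not found; save from emacs with CTRL-C CTRL-W first" >&2
        return 1
    fi
    latex "$target"
}
```

With these defined, "latex_cjk file.tex" and "latex_cjk file" both run latex on file.cjk.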

For more general information on Japanese and computing, see Jim Breen's Japanese page.

Please send comments, corrections, and improvements to alford(at)wuphys.wustl.edu. Thanks to those who have done so, including Paul Wyatt of Toshiba and Andrew A. Adams of Reading University.

Mark Alford's IBM Thinkpad GNU/Linux page,
