Reading and Writing Japanese

Reading and writing Japanese is totally different from reading and writing a Western language. We have an alphabet, which consists of a bunch of totally symbolic letters (they have no meaning in themselves), from which we can form any word. If you already know English, there are many Indo-European languages that you will be able to read and write fairly easily (French, German, Spanish, etc) because they use essentially the same alphabet. Your only job when learning these languages is to learn the new grammar and vocabulary.

And there are also a number of other languages— Russian, for instance— where you would need to learn a new alphabet, but at least the whole concept of an alphabet is preserved in that language.

(Note that speaking and hearing these languages may still be challenging— all I’m asserting is that reading and writing them are mentally straightforward because you know intuitively how alphabets work).

However, not all languages use alphabets. Some use characters, which are full representations of ideas (they have meaning in themselves). Instead of arranging a small set of meaningless letters into thousands of permutations to impart meaning, these languages use thousands of characters (and combinations of several characters), each of which has inherent meaning. Chinese, in all its forms, is the best-known character-based language.

Japanese writing is actually a hybrid. It has characters, called kanji, which were in fact borrowed and adapted from Chinese (beginning around the 6th century). But it also has symbolic “letters,” each of which stands for a whole syllable, such as ka, ho, or mi; a set of such letters is called a syllabary. There are two complete syllabaries (two ways of writing the same set of sounds), and together they are called kana. (Separately, they are called hiragana and katakana.)

There are many ways for Westerners to learn to read and write Japanese, and some people will argue vehemently that one method is much better than another. I think whichever method ends up working for you is a good one. Because Japanese is so incredibly different from our alphabet system, one way to get your head around it is to proceed in stages.

The following sections describe all the steps you might take if you were learning Japanese in this stepwise approach.

Romaji

When I first started learning Japanese, I was taking a course designed for businesspeople who needed to learn “survival Japanese” fairly quickly, so the emphasis was on teaching us simple things to say. So the first written Japanese I learned was “romaji”—romanized Japanese. That’s what I’ve been writing all along in this tutorial. The Japanese words are written using Roman letters that mimic the way the word is pronounced.

Actually, there are a whole bunch of systems for romanizing Japanese. That’s because Japanese sounds are not exactly like English sounds, and in addition, some sounds don’t quite follow the apparent pattern, so you have the option of romanizing the actual sound or the theoretical sound. For instance, the syllables starting with “k” are ka ki ku ke ko. But the ones starting with “s” sound like sa shi su se so. That sound is sometimes romanized as shi and sometimes as si (even though it’s always pronounced “shi”).

I am (for the most part) using a fairly intuitive system called “modified Hepburn.” It’s one of the easiest to read because sounds are written as close to their actual pronunciation as possible. (I’ve bent the rules a little on some long vowels— sorry). There’s also a system called “modified Kunrei-Shiki,” which I think is harder. Here is an example (“I didn’t go anywhere on Sunday”):

Nichiyoubi wa doko e mo dekakemasen deshita (modified Hepburn)
Nitiyoobi wa doko e mo dekakemasen desita (modified Kunrei-Shiki)
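
To make the difference concrete, here is a minimal Python sketch that respells Hepburn-style syllables the Kunrei way. It covers only five syllables where the two systems disagree, and it expects the word to be split into syllables already. The table and the function name to_kunrei are my own illustration, not a complete or official converter.

```python
# Syllables whose spelling differs between the two systems (partial list,
# for illustration only). Keys are Hepburn-style, values are Kunrei-style.
HEPBURN_TO_KUNREI = {
    "shi": "si",
    "chi": "ti",
    "tsu": "tu",
    "fu": "hu",
    "ji": "zi",
}


def to_kunrei(syllables):
    """Join Hepburn-style syllables into a Kunrei-style word.

    Syllables missing from the table are treated as identical in both
    systems and pass through unchanged.
    """
    return "".join(HEPBURN_TO_KUNREI.get(s, s) for s in syllables)


print(to_kunrei(["de", "shi", "ta"]))  # desita
print(to_kunrei(["ni", "chi"]))        # niti
```

Long vowels are deliberately left out of this sketch, since the systems (and I) treat them less consistently.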

However, you don’t get very far into your Japanese studies before you realize that, frankly, romaji is baloney. Japanese people don’t use it, and worse than that, romaji actually obscures some really important meaning in the language. With romaji, you don’t realize that the too of toomi and the en of engoku are the same character, nor do you realize that the latter is not the same character as the en of enkatsu.

Although I have chosen not to include kana and kanji characters in this text, the following sections explain a bit about reading and writing real Japanese.

Hiragana

Hiragana is one of the syllabaries mentioned above. There are 46 basic hiragana in all, plus a few combinations such as nyu, which is ni plus yu (and is written as a normal-sized ni with a small yu as a “subscript”). There are no exceptions in pronunciation, by the way! The hiragana for ki is always pronounced ki and no other hiragana can be pronounced ki. Spelling bees are not a concept in Japan.

[OK, OK, there are three exceptions to this rule. The characters ha and he are pronounced wa and e, respectively, when they are used as particles rather than as part of regular words. The character wo is pronounced o when it is used as the particle marking a direct object. But except for those exceptions, there are no exceptions :-)].
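
Those three exceptions are small enough to write down as a lookup table. This is a rough sketch in Python that works on the romaji names of the kana rather than on real characters; the function pronounce and its is_particle flag are made up for illustration.

```python
# Kana (named in romaji) whose sound changes when they act as particles.
PARTICLE_SOUNDS = {"ha": "wa", "he": "e", "wo": "o"}


def pronounce(kana, is_particle=False):
    """Return how a kana is pronounced, given its romaji name."""
    if is_particle:
        # Only the three kana in the table change; all others stay the same.
        return PARTICLE_SOUNDS.get(kana, kana)
    return kana


print(pronounce("ha"))                    # ha
print(pronounce("ha", is_particle=True))  # wa
print(pronounce("ki", is_particle=True))  # ki
```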

Hiragana is used for some full words, as well as for endings that indicate the inflection of a word. The main “stem” of the word (i.e., verb, noun, or “adjective”) is written with kanji (see below), then the ending is in hiragana. For example, the verb yomu, to read, is written with a kanji character plus the hiragana that is pronounced mu. When this verb is inflected to its normal-polite form (yomimasu), it is written with the same kanji character, plus the hiragana for mi, ma, and su.

Katakana

Katakana is the second syllabary— it’s just a different way of writing the same sounds, so there are 46 of these too. Katakana were derived by simplifying or taking just one part of kanji characters that are related to their pronunciation. Only a few of them look like the corresponding hiragana character of the same pronunciation.

At first it seems strange to have two complete (and completely different) ways of expressing the same set of sounds. But think of English: we have lower-case and upper-case letters that are pronounced the same way, and many of them don’t look all that similar (for example, lower-case “a” doesn’t really look like upper-case “A”).
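
For the computer-minded, Unicode reflects this parallel directly: the katakana block is laid out in the same order as the hiragana block, 0x60 code points later. Here is a minimal Python sketch of that idea. The function name is mine, and the characters are written as escapes because this tutorial otherwise avoids printing kana.

```python
import unicodedata

# Hiragana U+3041..U+3096 line up with katakana U+30A1..U+30F6.
HIRAGANA_FIRST = 0x3041
HIRAGANA_LAST = 0x3096
KATAKANA_OFFSET = 0x60


def hiragana_to_katakana(text):
    """Shift every hiragana character to the katakana with the same sound."""
    return "".join(
        chr(ord(ch) + KATAKANA_OFFSET)
        if HIRAGANA_FIRST <= ord(ch) <= HIRAGANA_LAST
        else ch
        for ch in text
    )


ki = "\u304d"  # hiragana ki
print(unicodedata.name(ki))                        # HIRAGANA LETTER KI
print(unicodedata.name(hiragana_to_katakana(ki)))  # KATAKANA LETTER KI
```

Anything outside the hiragana range (Roman letters, kanji, punctuation) is passed through untouched.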

Katakana is used in several ways. First, it is always used when foreign words are written in Japanese (hiragana is never used for this). Many of the foreign words are English, and can be recognized by sounding out the katakana. For instance, there is biiru (beer), and aisukuriimu (ice cream). [These are written in romaji here]. It takes a little practice to figure out what English word is being represented!

There are also words from other foreign languages. There is abekku, meaning “couple,” which comes from the French avec [“with”]. And there is arubaito o suru, which means “to work one’s way through school.” This comes from the German verb arbeiten, to work.

It was pointed out to me by a reader of this tutorial that katakana can also be challenging to understand because in some cases the meaning has been altered as it has been adapted to Japanese. He gives the excellent example of mai kaa (my car). This really means “personal ownership of a vehicle,” not literally my car. So you can say to someone, Anata no mai kaa desu ka?, which means, “Is that your (personal, your very own) vehicle?”

A second way to use katakana is for emphasis. You will see it a lot in advertisements and in manga (comics) to draw attention to words.

Third, katakana has come to be used for very common Japanese words (not foreign), especially when the kanji is difficult. The idea is that the word is so well-known that the kanji is not needed to distinguish the meaning. For example, sushi is sometimes written in katakana.

But after you’ve learned the syllabaries, you will again reach a depressing moment when you realize that you still can’t read very much Japanese because so much of it is written in kanji.

Kanji

As noted above, kanji represent whole words or ideas, as in Chinese. Kanji characters are indeed borrowed from Chinese, but they tend to have different pronunciations and sometimes different meanings in Japanese. In general, a given Japanese kanji character will have one or more on readings, which derive from the original Chinese pronunciations, as well as one or more kun readings, which are native Japanese. Not all characters have both types of reading. Most words that combine two or more kanji will use either all on readings or all kun readings, but there are exceptions.

These multiple readings explain why the too of toomi is the same character as the en of engoku— too is a kun reading, and en is an on reading. Both are ways of pronouncing the character that means “distant” or “far.” Toomi means “watchtower” or “distant view” (since mi means “view”), and engoku means “faraway land” (since goku, or koku, means “country”).
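
One way to picture this is as a small table per character. The sketch below is in Python and uses English glosses in place of the kanji themselves. The Kanji class and the spell helper are hypothetical, the readings are only the ones mentioned in this section, and sorting mi under kun and koku/goku under on is my own assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Kanji:
    """A character identified by its meaning, with its known readings."""
    meaning: str
    on: list = field(default_factory=list)   # Chinese-derived readings
    kun: list = field(default_factory=list)  # native Japanese readings


distant = Kanji("distant", on=["en"], kun=["too"])
view = Kanji("view", kun=["mi"])
country = Kanji("country", on=["koku", "goku"])


def spell(*parts):
    """Join (kanji, reading) pairs into a romaji word, checking each reading."""
    for kanji, reading in parts:
        if reading not in kanji.on + kanji.kun:
            raise ValueError(f"{reading!r} is not a reading of {kanji.meaning!r}")
    return "".join(reading for _, reading in parts)


print(spell((distant, "too"), (view, "mi")))      # toomi
print(spell((distant, "en"), (country, "goku")))  # engoku
```

The same Kanji object appears in both words; only the reading chosen for it changes.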

Note that this is exactly the opposite of the kana, which have completely fixed pronunciations and no inherent meaning.

Furigana

Furigana are kana that are written alongside (in vertical writing) or above (in horizontal writing) the corresponding kanji characters, telling how to pronounce them. For instance, in the word yomu mentioned above, the kanji character for the first syllable could be written with a small yo next to it to indicate how to say it.

You can guess one use of furigana: they are sometimes included in things that might be read by children who haven’t learned very many kanji yet. I have seen them in some manga (comics) as well as some newspapers (not the really intellectual ones, of course).

But there is another use for furigana: names. The kanji used for names sometimes have special pronunciations that differ from how you would say the character in a regular word. And some names even use kanji that are unique and not used in any regular words! So it’s not all that uncommon to see names with furigana, even in high-level publications.

[By the way, knowledge of kanji is considered a mark of honor and intelligence in Japan. The more characters you know and can use skillfully, the better (or at least the more high-brow). There are some words that have kanji, but are more commonly simply written with hiragana. Nonetheless, a show-off might write them with the full, proper kanji to sound more intellectual. It’s interesting how language gets used as a social differentiator. We do the same thing here, to some degree, with the use of higher-level vocabulary. But it’s a bigger deal in Japan].

The Pluses and Minuses of Kanji

Do we have anything like kanji in English? Actually, yes. The clearest example is numerals. Each one is a symbol that represents the concept of a number. The “character” 1 has the same meaning in all languages that use those numerals, but I say “one,” my German friends say “eins,” and my Dutch friends say “een.”

But there is a more subtle analogy that draws out some fascinating aspects of kanji. From the example above with the meaning behind too and en (“distant” or “far”), it is clear that kanji can be thought of like the Latin, Greek, and Germanic roots of English words. The analogy isn’t perfect, but it is useful for gaining some insight.

You may not realize it, but you carry in your head a partial dictionary of these roots that allows you to guess the meaning of some English words, as well as to guess how to form new words. If I ask you to make up a nonsense word meaning “able to be cut again,” you would say “recuttable,” right? You know in your head that the prefix re- means “again” and the suffix -able means “able to.” You are also probably aware that this is not the same “re” as the one used in “religion,” or the same “able” as the one used in “parable.”

Now just extend this same concept to the case where re and able are not strings of meaningless letters, but are drawn as unique pictures. That’s the idea of kanji. Once you know a few thousand of these, you can easily put them together into words whose meaning is immediately clear. (If you knew Latin, Greek, and Germanic roots backwards and forwards, you would see a lot more meaning in English words, too).

Remember that Japanese doesn’t have very many sounds (only 46 syllables! English has well over a hundred), and it has no flexibility in spelling when kana are used since each has a unique pronunciation with no exceptions. This means that kanji are a real blessing in written Japanese because they distinguish between words that would be written with the same kana (because they are pronounced the same). Some words have wildly different meanings, such as kami, which can be “paper,” “hair,” or “god.” Each of these meanings is written with a different kanji character, even though they’re all pronounced “kami.”

Kanji is all about meaning.

However, the multiple on and kun readings that I mentioned above are the great undoing of kanji’s theoretical simplicity. There is no simple mapping between kanji and pronunciation or vice versa!

So for instance, take the syllable ko. According to my kanji dictionary, no fewer than 60 characters have ko as an on reading, and 6 characters have ko as a kun reading. Going the other way, a typical character that has ko as an on reading— say, the one that means “fire”— also has a second on reading (ka) and two kun readings (hi and ho).

So if I see the “fire” character written as part of a two-character word, I have no idea how to pronounce it. There is, as far as I know, no simple rule that says when to use an on reading versus a kun reading, or which on or kun reading to choose. You have to already know the word in order to be able to say it.

For example, the word that consists of the character for “summer” plus the character for “mountain” is pronounced natsuyama, and it means “mountaineering.” However, the word that consists of the character for “summer” plus the character for “time” is pronounced kaji, and of course it means “summertime.” The point is, how would I know to use kun readings for the first word (natsu for “summer”) and on readings for the second (ka for “summer”)?

Because I’m a dumb American, I might have pronounced the first word kazan, which is the on reading of “summer” plus the on reading of “mountain.” However, that would be a big mistake because kazan actually means “volcano,” and it is properly written with the character for “fire” plus the character for “mountain.” (Remember, I said above that ka is one reading of the character for “fire”).
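
The guessing problem can be spelled out mechanically. Here is a small Python sketch that lists every pronunciation you could build from the readings mentioned above. The dictionary layout and the function candidate_readings are my own, and real dictionaries list more readings (and sound changes) than this.

```python
from itertools import product

# Readings mentioned in the text, keyed by the character's meaning.
READINGS = {
    "summer": {"on": ["ka"], "kun": ["natsu"]},
    "mountain": {"on": ["zan"], "kun": ["yama"]},  # "zan" as it appears in kazan
    "fire": {"on": ["ko", "ka"], "kun": ["hi", "ho"]},
}


def candidate_readings(*characters):
    """Return every way to read a word by combining one reading per character."""
    options = [READINGS[c]["on"] + READINGS[c]["kun"] for c in characters]
    return ["".join(combo) for combo in product(*options)]


print(candidate_readings("summer", "mountain"))
# ['kazan', 'kayama', 'natsuzan', 'natsuyama']
```

Only one of those four is the real word, and nothing in the characters tells you which. “Fire” plus “mountain” yields eight candidates, kazan among them.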

Argh! This can get really challenging for the Western brain. I think you just have to learn how to pronounce each word as you learn the word. So like I said above, you have to already know the word in order to be able to say it. This is not true in English! You can take a stab at pronouncing any word, even if you have no clue what it means! More on this below.

Thus, learning new characters is a pain in the neck because each comes with multiple readings that must be memorized along with the meaning(s) and strokes. No wonder it takes kids years and years just to be able to read their own language. I was pretty much done after I learned just 26 letters— all that was left was vocabulary lists. And I mastered those 26 letters when I was four!

Japanese writing is a bit clunky no matter which way you slice it. But then again, it does allow for additional layers of meaning that are impossible in the totally symbolic letters of the West. If you’ll pardon the pun— different strokes for different folks.

The Language Chart

When you “learn a language,” it’s not a uniform process. It’s not like there is some total knowledge pool and you just learn a greater and greater percentage of it as you study.

In fact, language is more of a process or a skill than a body of knowledge. There are a lot of components to language (at least four: speaking, listening, reading, and writing), and each learner will be more adept at some than at others. I’ve been trying to imagine how to think about all these components, and my first attempt was to come up with this diagram:

The Four Fundamental Language Skills

                           ^ fast
                           | reaction
         LISTENING         |            SPEAKING
  <---------------------------------------------------->
   you're not              |                     you are
   in control of           |               in control of
   the content             |                 the content
        READING            |            WRITING
                           | slow
                           V reaction

This chart shows four key language skills in relation to (1) whether you get to control the words or not, and (2) how fast you have to react to the words. You are in control when you are writing or speaking, because you are creating the words (as opposed to reading and listening, when someone else is doing that). But when you are writing, you get to think carefully about how to phrase things, and how to structure the sentences so that they have the right meaning. You don’t really have time to do this when you are speaking. Similarly, listening is a “fast” activity, while reading is a “slow” one where you can go at your own pace.

Different people will find that they have different strengths. For instance, I know people who have no trouble understanding foreign languages even if they only know a little bit of grammar and vocabulary. But these same people may have trouble actually saying original sentences. There are also people who can flounder through verbal communication using a combination of words and sign language, but find comprehension very difficult. And still others can read and write, but not speak or hear. There are a lot of combinations of skills.

However, as soon as I had drawn this chart and begun to think about it, I realized that there are some complications to it. In particular, if you are a Westerner learning an Asian language, it’s very different from learning another language that uses Roman letters. To be concrete, if you are a native English speaker learning German, French, or Spanish, you may not realize it, but a lot of the work has already been done even before you walk into class the first day! You already know the alphabet, and even more importantly, you intuitively understand how languages that use alphabets work.

In that case, your job is to learn the grammar and vocabulary of the foreign language, and in all likelihood, there will be many similarities between English and the other Indo-European language you are learning. To be slightly bold, I’ll say it this way: except for learning the vocabulary and proper verb endings, you can already read and write many Indo-European languages.

[You may be laughing at this point— “Yeah, right!” you say. “I took 5 years of French and could barely read The Little Prince!”]

Perhaps. But as you struggled with each sentence of that book, you never seriously doubted how to pronounce any of the words (you could always make a decent guess, even if it wasn’t perfect). And you never looked at a word and found it literally as incomprehensible as a Rorschach blot. At the very least, you could always read the letters, right?

Contrast this with an Asian language that uses characters (such as kanji) instead of an alphabet. This is an entirely different way of conceiving of language. Suddenly my chart above looks simplistic because it treats “Reading” as a single entity.

In languages with characters, you can have the bizarre experience of being able to comprehend a word, phrase, or even a whole sentence, without being able to pronounce it. I have looked at groups of characters and been absolutely certain what they mean. But I had no clue how to say it. Even knowing the individual pronunciations of each character doesn’t always help, because you don’t know if the phrase uses on or kun readings, or which such reading to choose if there are several of that type.

Reading character-based Asian languages is really cool in that you look at the characters and your brain responds by directly understanding the idea behind the characters. There is no need to know the pronunciation of the character. Now, even though you may not know how to pronounce every Western word you read either, there is a profound difference: it is rare to look at a word and instantly know what it means, but not be able to say it. It is far more common to be able to pronounce a word, but have no idea what it means, right?

Here’s an example: what does the English word “obnubilate” mean? If you know Latin, you can guess that it’s a verb meaning “obscure,” “muddle,” or “becloud”. But even if you didn’t know this, I bet you could reasonably guess how to pronounce it.

This just doesn’t happen in Japanese. When you see a character you don’t know, you simply cannot pronounce it.

There’s a lot more meat here to tease out of this concept of there being a split between comprehension and pronunciation. I am certain that it has a profound influence on the differences between Japanese and Western art, sense of humor, etc. Still thinking about that one….
