Appendix C. Japanese Lemma Normalization
In Japanese, foreign and borrowed words can have more than one phonetic transcription into Katakana, and some words can be written in either an older or a modern Kanji form. The Japanese lemma dictionary maps Katakana variants to a standard form and older Kanji forms to their modern forms.
Examples:
| Katakana Spelling Variants | Normalized Form |
|---|---|
| ヴァイオリン | バイオリン |
| エクスポ | エキスポ |

| Older Kanji Form | Normalized Form |
|---|---|
| 渡邊 | 渡辺 |
| 松濤 | 松涛 |
| 大學 | 大学 |
You can also include orthographic normalizations in lemma user dictionaries for Japanese. The normalization information is available at runtime from the Analysis or MorphoAnalysis object.
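
The mapping in the tables above can be pictured as a simple lookup from a variant to its standard form. The following Python sketch is illustrative only: the `NORMALIZED_FORMS` table and the `normalize_lemma` function are hypothetical names that are not part of the product, and the sketch does not reflect the actual lemma dictionary format or the Analysis and MorphoAnalysis interfaces. It uses only the entries listed in the examples.

```python
# Hypothetical illustration of lemma normalization as a lookup table.
# This is NOT the product's dictionary format or API; the entries are
# copied from the example tables in this appendix.
NORMALIZED_FORMS = {
    # Katakana spelling variants -> standard form
    "ヴァイオリン": "バイオリン",
    "エクスポ": "エキスポ",
    # Older Kanji forms -> modern forms
    "渡邊": "渡辺",
    "松濤": "松涛",
    "大學": "大学",
}


def normalize_lemma(lemma: str) -> str:
    """Return the normalized form of a lemma, or the lemma unchanged
    when the table has no entry for it."""
    return NORMALIZED_FORMS.get(lemma, lemma)


if __name__ == "__main__":
    print(normalize_lemma("ヴァイオリン"))  # バイオリン
    print(normalize_lemma("大學"))          # 大学
    print(normalize_lemma("東京"))          # 東京 (no entry, returned as-is)
```

Returning the input unchanged when no entry exists is a design choice of this sketch, so that words already in their standard form pass through untouched.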