Appendix C. Japanese Lemma Normalization

In Japanese, foreign and borrowed words can have more than one phonetic transcription in Katakana, and some words can be written with either an older or a modern Kanji form. The Japanese lemma dictionary maps Katakana variants to a standard form and older Kanji forms to their modern forms.
Examples:

Katakana Spelling Variants    Normalized Form
ヴァイオリン                  バイオリン
エクスポ                      エキスポ

Older Kanji Form              Normalized Form
渡邊                          渡辺
松濤                          松涛
大學                          大学
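
The mapping described above can be pictured as a simple lookup from variant to normalized form. The following Python sketch is only an illustration built from the example pairs in this appendix: the names NORMALIZED_FORMS and normalize_lemma are hypothetical and are not part of the product's API or its actual dictionary format.

```python
# Hypothetical lookup table built from the example pairs in this appendix.
# It is NOT the product's lemma dictionary; it only illustrates the idea.
NORMALIZED_FORMS = {
    # Katakana spelling variants -> standard Katakana form
    "ヴァイオリン": "バイオリン",
    "エクスポ": "エキスポ",
    # Older Kanji forms -> modern Kanji forms
    "渡邊": "渡辺",
    "松濤": "松涛",
    "大學": "大学",
}


def normalize_lemma(lemma: str) -> str:
    """Return the normalized form of a lemma.

    A lemma with no listed variant is returned unchanged.
    """
    return NORMALIZED_FORMS.get(lemma, lemma)


if __name__ == "__main__":
    for word in ["ヴァイオリン", "渡邊", "大学"]:
        print(word, "->", normalize_lemma(word))
```

A word that is already in its standard form, such as 大学, passes through unchanged, so the lookup can be applied to every lemma without first checking whether it is a variant.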

You can include orthographic normalization in lemma user dictionaries for Japanese. The normalization information can be accessed at runtime from the Analysis or MorphoAnalysis object.
