Chapter 10. Removing Unnecessary Files

Depending on the scope of your application, you may wish to remove files that you know will not be needed, in order to cut down on the size of your application.

10.1. Tool Files

Once you have finished building any user-defined dictionaries [59] that you need (see also CSC User Dictionaries [78]), and provided you will not be using the RBL-JE Command Line Utility [31], you may delete the entire root/tools directory.
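As a minimal sketch of this step, the following Python helper removes the tools directory from an installation. The function name remove_tools_dir and the dry_run flag are illustrative assumptions and are not part of RBL-JE; only the root/tools location comes from the text above.

```python
import shutil
from pathlib import Path


def remove_tools_dir(root: Path, dry_run: bool = True) -> bool:
    """Remove root/tools if present.

    Returns True if the directory was removed (or would be removed in a
    dry run), and False if there is no such directory.
    """
    tools = root / "tools"
    if not tools.is_dir():
        return False
    if not dry_run:
        # Deletes the directory and everything below it.
        shutil.rmtree(tools)
    return True
```

Running it first with the default dry_run=True confirms that the directory exists before anything is deleted.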

10.2. Language-Specific Model and Dictionary Files

Generally, you can remove files from root/dicts and root/models [17] that represent languages your application does not need to support. Additionally, if you distribute only to platforms of a single endianness, you can remove the binary files built for the opposite endianness. When applicable, the endianness of a file is given at the end of the file name: for example, root/models/eng-model-LE.bin is a little-endian binary storing the English language model, whereas root/models/eng-model-BE.bin holds the same model in big-endian format. More details are given below in Required Files by Language [82].
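The naming convention described above can be checked mechanically. The sketch below lists the binaries whose endianness tag does not match the one you want to keep. The function names and the regular expression are assumptions made for illustration. The pattern covers the -BE.bin, _BE.bin, and _BE_Reading.bin forms (and their LE counterparts) that appear in the manifest in this chapter, and any other naming scheme would need its own pattern.

```python
import re
from pathlib import Path

# Matches names such as eng-model-LE.bin, foo_BE.bin, or foo_LE_Reading.bin.
_ENDIAN_RE = re.compile(r"[-_](BE|LE)(?:_Reading)?\.bin$")


def file_endianness(name: str):
    """Return "BE" or "LE" if the file name carries an endianness tag, else None."""
    match = _ENDIAN_RE.search(name)
    return match.group(1) if match else None


def opposite_endian_files(root: Path, keep: str):
    """List files under root/dicts and root/models tagged with the other endianness."""
    if keep not in ("BE", "LE"):
        raise ValueError("keep must be 'BE' or 'LE'")
    removable = []
    for sub in ("dicts", "models"):
        base = root / sub
        if not base.is_dir():
            continue
        for path in sorted(base.rglob("*.bin")):
            tag = file_endianness(path.name)
            # Untagged files are endianness-independent and are never listed.
            if tag is not None and tag != keep:
                removable.append(path)
    return removable
```

The function only reports candidates and deletes nothing. Choose keep according to the platform you deploy to; Python's sys.byteorder reports the endianness of the machine running the script, which may not be the deployment target.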

10.2.1. Required Files by Language

The following table gives a manifest of the files in root/dicts and root/models. For each file, we list the language(s) that depend on it and whether it is needed on big-endian (BE) and little-endian (LE) machines.

File name                             BE  LE  Language(s)
root/dicts/ara/*-BE.bin               ✓       Arabic
root/dicts/ara/*-LE.bin                   ✓   Arabic
root/dicts/ces/*-BE.bin               ✓       Czech
root/dicts/ces/*-LE.bin                   ✓   Czech
root/dicts/csc/sc2tc_codepoint.txt    ✓   ✓   Chinese (Simplified and Traditional) (a)
root/dicts/csc/tc2sc_codepoint.txt    ✓   ✓   Chinese (Simplified and Traditional)
root/dicts/csc/*_BE.bin               ✓       Chinese (Simplified and Traditional)
root/dicts/csc/*_LE.bin                   ✓   Chinese (Simplified and Traditional)
root/dicts/dan/*-BE.bin               ✓       Danish
root/dicts/dan/*-LE.bin                   ✓   Danish
root/dicts/deu/*-BE.bin               ✓       German
root/dicts/deu/*-LE.bin                   ✓   German
root/dicts/ell/*-BE.bin               ✓       Greek
root/dicts/ell/*-LE.bin                   ✓   Greek
root/dicts/eng/*-BE.bin               ✓       English
root/dicts/eng/*-LE.bin                   ✓   English
root/dicts/fas/*-BE.bin               ✓       Persian (Western Farsi and Dari)
root/dicts/fas/*-LE.bin                   ✓   Persian (Western Farsi and Dari)
root/dicts/fra/*-BE.bin               ✓       French
root/dicts/fra/*-LE.bin                   ✓   French
root/dicts/hun/*-BE.bin               ✓       Hungarian
root/dicts/hun/*-LE.bin                   ✓   Hungarian
root/dicts/ita/*-BE.bin               ✓       Italian
root/dicts/ita/*-LE.bin                   ✓   Italian
root/dicts/jpn/jla/*_BE.bin           ✓       Japanese (b)
root/dicts/jpn/jla/*_LE.bin               ✓   Japanese
root/dicts/jpn/jla/*_BE_Reading.bin   ✓       Japanese
root/dicts/jpn/jla/*_LE_Reading.bin       ✓   Japanese
root/dicts/jpn/jla/JP_stop.utf8       ✓   ✓   Japanese
root/dicts/jpn/*-BE.bin               ✓       Japanese
root/dicts/jpn/*-LE.bin                   ✓   Japanese
root/dicts/nld/*-BE.bin               ✓       Dutch
root/dicts/nld/*-LE.bin                   ✓   Dutch
root/dicts/nno/*-BE.bin               ✓       Norwegian (Nynorsk)
root/dicts/nno/*-LE.bin                   ✓   Norwegian (Nynorsk)
root/dicts/nob/*-BE.bin               ✓       Norwegian (Bokmål)
root/dicts/nob/*-LE.bin                   ✓   Norwegian (Bokmål)
root/dicts/pol/*-BE.bin               ✓       Polish
root/dicts/pol/*-LE.bin                   ✓   Polish
root/dicts/por/*-BE.bin               ✓       Portuguese
root/dicts/por/*-LE.bin                   ✓   Portuguese
root/dicts/rus/*-BE.bin               ✓       Russian
root/dicts/rus/*-LE.bin                   ✓   Russian
root/dicts/spa/*-BE.bin               ✓       Spanish
root/dicts/spa/*-LE.bin                   ✓   Spanish
root/dicts/swe/*-BE.bin               ✓       Swedish
root/dicts/swe/*-LE.bin                   ✓   Swedish
root/dicts/tha/*-BE.bin               ✓       Thai
root/dicts/tha/*-LE.bin                   ✓   Thai
root/dicts/urd/*-BE.bin               ✓       Urdu
root/dicts/urd/*-LE.bin                   ✓   Urdu
root/dicts/zho/cla/*_BE.bin           ✓       Chinese (Simplified and Traditional) (c)
root/dicts/zho/cla/*_LE.bin               ✓   Chinese (Simplified and Traditional)
root/dicts/zho/cla/zh_stop.utf8       ✓   ✓   Chinese (Simplified and Traditional)
root/dicts/zho/*-BE.bin               ✓       Chinese (Simplified and Traditional)
root/dicts/zho/*-LE.bin                   ✓   Chinese (Simplified and Traditional)
root/models/ara-model-BE.bin          ✓       Arabic
root/models/ara-model-LE.bin              ✓   Arabic
root/models/dinflections-BE.bin       ✓       Hebrew
root/models/dinflections-LE.bin           ✓   Hebrew
root/models/dprefixes.data            ✓   ✓   Hebrew
root/models/eng-model-BE.bin          ✓       English
root/models/eng-model-LE.bin              ✓   English
root/models/gimatria.data             ✓   ✓   Hebrew
root/models/jpn-model-BE.bin          ✓       Japanese (d)
root/models/jpn-model-LE.bin              ✓   Japanese
root/models/jpn.model                 ✓   ✓   Japanese
root/models/kor-model-BE.bin          ✓       Korean
root/models/kor-model-LE.bin              ✓   Korean
root/models/kor.model                 ✓   ✓   Korean
root/models/tha.model                 ✓   ✓   Thai
root/models/zho.model                 ✓   ✓   Chinese (Simplified and Traditional) (e)

a. The CSC files are only necessary for Chinese Script Conversion [72].
b. The JLA files are only necessary if the alternativeTokenization option is set. For more information, see Tokenizers [9] and the Table of API Options [86].
c. The CLA files are only necessary if the alternativeTokenization option is set. For more information, see Tokenizers [9] and the Table of API Options [86].
d. The three Japanese model files (jpn-model-BE.bin, jpn-model-LE.bin, and jpn.model) are only necessary when alternativeTokenization is set to false.
e. zho.model is only necessary when alternativeTokenization is set to false.
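
The rules in the table and its footnotes can be combined into a single keep-or-remove decision per file. The following is a sketch under stated assumptions rather than a definitive implementation: the function is_needed and its parameters are hypothetical, paths are given relative to root, dictionary subdirectory names are taken to be the language codes shown in the table, "heb" is assumed as the code for Hebrew (no Hebrew directory appears in the table), and alternativeTokenization is treated as one flag shared by Chinese and Japanese.

```python
from pathlib import PurePosixPath

# Hebrew model files, whose names do not begin with a language code.
HEBREW_MODELS = {
    "dinflections-BE.bin",
    "dinflections-LE.bin",
    "dprefixes.data",
    "gimatria.data",
}


def is_needed(rel_path: str, languages: set, alternative_tokenization: bool,
              script_conversion: bool = False) -> bool:
    """Decide whether a file under dicts/ or models/ must be kept.

    rel_path is relative to root, e.g. "dicts/jpn/jla/JP_stop.utf8".
    languages holds the codes to support, e.g. {"eng", "jpn"}.
    """
    parts = PurePosixPath(rel_path).parts
    if len(parts) < 2:
        return True  # Not a file covered by the manifest; keep it.
    if parts[0] == "dicts":
        lang = parts[1]
        if lang == "csc":
            # Footnote a: only needed for Chinese Script Conversion.
            return script_conversion
        if lang not in languages:
            return False
        if len(parts) > 3 and parts[2] in ("jla", "cla"):
            # Footnotes b and c: only needed with alternativeTokenization set.
            return alternative_tokenization
        return True
    if parts[0] == "models":
        name = parts[1]
        if name in HEBREW_MODELS:
            return "heb" in languages
        lang = name[:3]
        if lang not in languages:
            return False
        if lang == "jpn" or name == "zho.model":
            # Footnotes d and e: only needed when alternativeTokenization is false.
            return not alternative_tokenization
        return True
    return True  # Outside dicts/ and models/; leave untouched.
```

For example, is_needed("models/jpn.model", {"jpn"}, True) returns False, because the Japanese model files are not needed once alternativeTokenization is set. Endianness is deliberately left out of this function, so the result still has to be intersected with the endianness rule described in section 10.2.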
