The LanguageModelBuild task builds a new language model from a set of text files.
Parameter | Description | Required |
---|---|---|
Type | The task name. Set to LanguageModelBuild. | Yes |
BaseDictionary | The base dictionary for the language model. | |
BuildLabel | The build label to use for the language model. | |
ContentDatabase | The IDOL Content component database to use to retrieve training text. | |
ContentHost | The host name or IP address of the IDOL Content component to retrieve training text from. | |
ContentPort | The ACI port of the IDOL Content component to retrieve training text from. | |
ContentTextTag | The IDOL fields to retrieve text data from. | |
DataList | The list of training text files. | Yes |
DataPath | The path to the directory containing the training text files listed in DataList. | Yes |
DiagFile | The file to write the diagnostic information to. | |
DiagLevel | The level of detail to include in the diagnostic information. | |
DoDctGen | Whether to generate a dictionary. | |
DoNorm | Whether to perform text normalization. | |
DoSmoothing | Whether to enable smoothing. | |
DoSegment | Whether to segment text. | |
DropList | A list of words to exclude from the vocabulary of the custom language model. | |
KeepList | A list of words that must appear in the vocabulary of the custom language model. | |
KeepTemp | Whether to keep the temporary text files for diagnostics. | |
Lang | The language pack to use as a foundation. | Yes |
Log | The name of the log file to write. | |
NewDictionary | The dictionary to generate. | Yes |
NewLanguageModel | The custom language model to generate. | Yes |
NewLMInfoFile | The Language Model Information file to generate. | |
VocabSize | The maximum size of the vocabulary to include in the custom language model. | |
http://localhost:13000/action=AddTask&Type=LanguageModelBuild&DataList=ListManager/Langmodel&DataPath=C:\LanguageModelFiles&Lang=ENUK-tel&NewLanguageModel=mymodel&NewDictionary=mymodel&DoSmoothing=False
This action uses port 13000 to instruct HPE IDOL Speech Server, which is located on the local machine, to use the training text specified in the Langmodel list and the ENUK-tel language pack to build a new language model and dictionary file, both named mymodel. This action also calculates a recommended interpolation weight at the end of the language model building process.
The interpolation weight is only a suggestion; you can choose to set other weights.
The new language models are placed in the custom language models folder, as specified in the HPE IDOL Speech Server configuration file.
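The example action can also be assembled programmatically. The following Python sketch is illustrative only: the helper name `build_add_task_url` is hypothetical, and it assumes the server accepts percent-encoded parameter values in the query string, which this page does not confirm. It checks that the parameters marked as required in the table are present before building the URL.

```python
from urllib.parse import urlencode

# Parameters marked "Yes" in the Required column of the table
# (Type is also required; the helper below sets it automatically).
REQUIRED = ("DataList", "DataPath", "Lang", "NewDictionary", "NewLanguageModel")


def build_add_task_url(host, port, **params):
    """Return an AddTask URL for a LanguageModelBuild task (hypothetical helper)."""
    missing = [name for name in REQUIRED if not params.get(name)]
    if missing:
        raise ValueError("missing required parameters: " + ", ".join(missing))
    # urlencode percent-encodes characters such as ':' '\' and '/'.
    # Assumption: the server decodes them in the usual way.
    query = urlencode({"Type": "LanguageModelBuild", **params})
    return f"http://{host}:{port}/action=AddTask&{query}"


url = build_add_task_url(
    "localhost",
    13000,
    DataList="ListManager/Langmodel",
    DataPath=r"C:\LanguageModelFiles",
    Lang="ENUK-tel",
    NewLanguageModel="mymodel",
    NewDictionary="mymodel",
    DoSmoothing="False",
)
print(url)
```

Optional parameters such as `VocabSize` or `DiagFile` can be passed as additional keyword arguments in the same way.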