After you prepare the adaptation data set, you can present it to HPE IDOL Speech Server to adapt the acoustic model.
If you have only small amounts of adaptation data (minutes rather than hours), HPE recommends that you run the AmTrain task in rapid adaptation mode. The standard acoustic adaptation process requires the training data to contain examples of everything to be updated, which can be difficult to obtain. Rapid adaptation mode applies transformations to the entire model. When more examples become available, HPE IDOL Speech Server refines the process to apply different transformations to individual base allophones.
To enable rapid adaptation mode, set the MLLRMaxMins configuration parameter in the amadaptadddata module. For more information about this configuration parameter, see the HPE IDOL Speech Server Reference.
The adaptation process can produce label files and a diagnostics file. Both types of files contain details of the word alignments generated during adaptation, but label files use a format that is compatible with some third-party applications.
To adapt an acoustic model
Create a list that contains the file names (excluding file extensions and paths) of all adaptation files (see Data Naming Scheme).
For more information about HPE IDOL Speech Server's list manager, see Create and Manage Lists.
Send an AddTask action to HPE IDOL Speech Server, and set the following parameters:
Type | The task name. Set to AmTrain.
Am | The original acoustic model to be adapted.
DataList | The list that specifies the adaptation files.
Out | The name of the adaptation accumulator (.acc) file to produce.
Pgf | The pronunciation generation file (.pgf file) included in the language pack resource.
PlhPath | The path to the directory that contains the audio feature files.
TxtPath | The path to the directory that contains the aligned transcription .ctm files.
To generate a diagnostics file, set the following parameters:
Diag | Whether to generate a diagnostics file.
DiagFile | The name of the diagnostics file to create.
To generate label files containing word alignment information, set the following parameter:
WriteOutLabs | Whether to create label files. Set to True.
You can set additional parameters. For details of the optional parameters, see the HPE IDOL Speech Server Reference.
For example:
http://localhost:13000/action=AddTask&Type=AmTrain&Am=C:\LP\ENUK\ver-ENUK-5.0-16k.am&Pgf=C:\LP\ENUK\ver-ENUK-5.0.pgf&DataList=ListManager/OptList&PlhPath=C:\data\PLH&TxtPath=C:\data\transcripts&Out=AmAcc.acc
This action uses port 13000 to instruct HPE IDOL Speech Server, which is located on the local machine, to produce the AmAcc accumulator file using the ver-ENUK-5.0-16k acoustic model, the ver-ENUK-5.0 pronunciation generation file, audio feature files stored in C:\data\PLH, and transcription files stored in C:\data\transcripts.
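The example action above can also be assembled programmatically. The following is a minimal Python sketch, not an official client: the function name build_amtrain_action and its argument names are hypothetical, and the value True for Diag is an assumption (the parameter is described only as a flag). The sketch builds the URL string and does not send it. Note that urlencode percent-encodes characters such as the colon and backslash in Windows paths, so the result looks different from the hand-written example while carrying the same values.

```python
from urllib.parse import urlencode


def build_amtrain_action(host, port, am, pgf, data_list, plh_path, txt_path,
                         out, diag_file=None, write_labels=False):
    """Build (but do not send) an AddTask URL for the AmTrain task.

    Hypothetical helper for illustration only.
    """
    params = {
        "Type": "AmTrain",
        "Am": am,                # original acoustic model to adapt
        "Pgf": pgf,              # pronunciation generation file
        "DataList": data_list,   # list that names the adaptation files
        "PlhPath": plh_path,     # directory of audio feature files
        "TxtPath": txt_path,     # directory of aligned .ctm transcripts
        "Out": out,              # accumulator (.acc) file to produce
    }
    if diag_file is not None:
        params["Diag"] = "True"  # assumed value; Diag is described as a flag
        params["DiagFile"] = diag_file
    if write_labels:
        params["WriteOutLabs"] = "True"
    # The action string follows the host and port directly, as in the example.
    return f"http://{host}:{port}/action=AddTask&{urlencode(params)}"


url = build_amtrain_action(
    "localhost", 13000,
    am=r"C:\LP\ENUK\ver-ENUK-5.0-16k.am",
    pgf=r"C:\LP\ENUK\ver-ENUK-5.0.pgf",
    data_list="ListManager/OptList",
    plh_path=r"C:\data\PLH",
    txt_path=r"C:\data\transcripts",
    out="AmAcc.acc",
)
print(url)
```

To request a diagnostics file or label files in the same action, pass diag_file or write_labels=True; the sketch then appends the Diag, DiagFile, and WriteOutLabs parameters described earlier.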
This action returns a token. You can use the token to:
A single AmTrain action can process the entire adaptation data set because the DataList parameter specifies a list that can specify the full set. However, if you send a single action, HPE IDOL Speech Server processes each file in series even though you might have several task managers configured in the server.
To run the task in parallel across several task managers, split the list into smaller lists and submit a separate AmTrain task action for each list. HPE IDOL Speech Server produces an accumulator file for each list.
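The splitting step can be sketched as follows. This is an illustrative Python fragment under stated assumptions: the helper split_list, the file names, and the list and accumulator names (OptList1, AmAcc1.acc, and so on) are all hypothetical. It only divides the file names into near-equal groups; creating the corresponding lists in the list manager and sending one AmTrain action per list are outside the sketch.

```python
def split_list(names, parts):
    """Split names into at most `parts` contiguous, near-equal chunks."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    size, extra = divmod(len(names), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks take one additional item each.
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty chunks when parts > len(names)
            chunks.append(names[start:end])
        start = end
    return chunks


# Hypothetical adaptation file names (no paths or extensions).
names = [f"file{i:03d}" for i in range(10)]
chunks = split_list(names, 3)

for n, chunk in enumerate(chunks, start=1):
    # One list and one accumulator file per chunk; names are illustrative.
    print(f"OptList{n} -> AmAcc{n}.acc: {len(chunk)} files")
```

A natural choice for the number of parts is the number of task managers configured in the server, so that each one receives a single AmTrain task.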
To further configure this stage of the adaptation process, set the parameters in the [amadaptadddata] module section of the HPE IDOL Speech Server tasks configuration file. For information about the available parameters, see the HPE IDOL Speech Server Reference.