The HPE IDOL Speech Server tasks configuration file (speechserver-tasks.cfg) contains the following sections.
[TaskTypes]
[Resources]
[MyTask]
[MyLanguage]
[ModuleName]
[MyFPDB]
For details of these sections and the parameters for each section, see the HPE IDOL Speech Server Reference. The following sections describe the general configuration sections.
The [TaskTypes] section lists the tasks that are configured in the HPE IDOL Speech Server. You must create a [MyTask] configuration section for each task type listed in the [TaskTypes] section.
[TaskTypes]
// Speech to text
0=WavToText
1=StreamToText
2=TelWavToText
3=StreamToTextMusicFilter
// Speaker cluster processing
8=ClusterSpeech
9=ClusterSpeechTel
10=ClusterSpeechToTextTel
// Transcript analysis
11=TranscriptAlign
12=TranscriptCheck
13=Scorer
// Language model building
14=LanguageModelBuild
15=TextNorm
// Speaker ID
16=SpkIdFeature
17=SpkIdTrain
18=SpkIdTrainWav
19=SpkIdTrainStream
20=SpkIdDevel
21=SpkIdDevelWav
22=SpkIdDevelStream
23=SpkIdDevelFinal
24=SpkIdEvalWav
25=SpkIdEvalStream
26=SpkIdSetAdd
27=SpkIdSetDelete
28=SpkIdSetInfo
The [MyTask] sections define configuration options for each HPE IDOL Speech Server audio processing task. You must create a [MyTask] section for each task you have listed in the [TaskTypes] section.
Each section contains details of the schema you use as well as any other parameters required for the task.
[WavToText]
0 = a,ts <- wav(MONO, input)
1 = f <- frontend(_, a)
2 = nf <- normalizer(_, f)
3 = w <- stt(_, nf)
4 = output <- wout(_, w,ts)
DefaultResults=Out

[StreamToText]
0 = a <- stream1(MONO, input)
1 = f <- frontend1(_, a)
2 = nf <- normalizer1(_, f)
3 = w <- stt1(_, nf)
4 = output <- wout1(_, w)
DefaultResults=Out

[TranscriptAlign]
0 = w <- ctm2(READ, input)
1 = w2 <- align2(ALIGN, w)
2 = output <- wout2(_, w2)
DefaultResults=Out
The [ModuleName] configuration sections contain settings for the modules. Create a configuration section for each module that you use in the [MyTask] configuration sections. Each configuration section must have the same name as the module referenced in the task schemas. If you use more than one configuration of a module, create a section for each configuration, including any numerical suffixes.
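As a rough illustration of the naming rule, the [StreamToText] schema on this page references the modules stream1, frontend1, normalizer1, stt1, and wout1, so the configuration file would need one section per name. The comments in the sketch below stand in for the real settings, which are not shown here; see the HPE IDOL Speech Server Reference for the parameters that each module accepts.

```ini
// One section for each module configuration named in the [StreamToText] schema.
// The section bodies are placeholders, not real parameter lists.
[stream1]
// settings for this configuration of the stream module

[frontend1]
// settings for this configuration of the frontend module

[normalizer1]
// settings for this configuration of the normalizer module

[stt1]
// settings for this configuration of the stt module

[wout1]
// settings for this configuration of the wout module
```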
You can set configuration parameters in the individual module configuration sections to variable values. You can use these variables to create action parameters, which allow you to specify the value of the configuration parameter when you create a task. You can also point all similar configuration parameters at a single configuration parameter where you set a standard value. For details, see Configure Variable Parameters.
[stream]
SampleFrequency=$stt.Lang.SampleFrequency
Mode=$stt.params.mode

[wav]
WavFile=$params.File
SampleFrequency=$stt.lang.SampleFrequency
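The following sketch traces how such references might resolve. It assumes that the [stt] section has a Lang parameter that names a language pack section, which the $stt.lang.SampleFrequency syntax suggests but this page does not state. The Lang=ENUK line is illustrative only, and the mapping of $params.File to a File action parameter is likewise an assumption.

```ini
[stt]
// Assumed: Lang names a language pack section that is listed in [Resources]
Lang=ENUK

[ENUK]
// The single place where the standard value is set
SampleFrequency=16000

[wav]
// Assumed to resolve through the [stt] Lang value to [ENUK] SampleFrequency
SampleFrequency=$stt.lang.SampleFrequency
// Assumed to take its value from a File action parameter given when you create the task
WavFile=$params.File
```

With this arrangement, changing the sample frequency in the language pack section would carry through to every module section that refers to it.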
The [Resources] section lists the resources that HPE IDOL Speech Server requires, including language packs and AFP databases. You must create a [MyLanguage] configuration section for each language pack, and a [MyFPDB] configuration section for each Audio Fingerprint database listed in the [Resources] section.
[Resources]
0=ENUK
1=ENUS
2=fpdb:AFP
3=fpdb:ADVERTS
4=FRFR
5=DEDE
6=ARMSA
7=fpdb:AFP
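A minimal sketch of how entries in [Resources] pair with their configuration sections, using values taken from the examples on this page. It assumes, as the [ADVERTS] example suggests, that the section for a database is named without the fpdb: prefix. The index numbers are illustrative.

```ini
[Resources]
0=ENUK
1=fpdb:ADVERTS

// Language pack section, named after resource 0
[ENUK]
PackDir=ENUK
Pack=ENUK-6.3
SampleFrequency=16000

// Audio Fingerprint database section, named after resource 1 without the fpdb: prefix
[ADVERTS]
PackDir = C:\databases
Pack = adverts
```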
The [MyLanguage] sections contain settings for language packs that you have defined in the [Resources] section. You must create a [MyLanguage] section for each language that you have listed in the [Resources] section.
[ENUK]
PackDir=ENUK
Pack=ENUK-6.3
SampleFrequency=16000
AmFile = T:\LP\ENUK\ver-ENUK-5.0-16k.am
CustomLM=$params.CustomLM
CustomDct=myDictionary.dct.sz
DnnFile = $params.DnnFile
ClassWordFile = $params.ClassWordFile
PronFile = $params.PronFile

[ENUS]
PackDir=ENUS
Pack=ENUS-6.3
SampleFrequency=16000
AmFile = T:\LP\ENUK\ver-ENUK-5.0-16k.am
CustomLM=$params.CustomLM
CustomDct=myDictionary.dct.sz
DnnFile = $params.DnnFile

[FRFR]
PackDir=FRFR
Pack=FRFR-6.3
SampleFrequency=16000
AmFile = T:\LP\ENUK\ver-ENUK-5.0-16k.am
CustomLM=$params.CustomLM
CustomDct=myDictionary.dct.sz
DnnFile = $params.DnnFile
The Audio Fingerprint database (fpdb) configuration sections contain settings for the databases used in HPE IDOL Speech Server.
You can set configuration parameters in these sections to variable values. You can use these values to create action parameters that allow you to specify the value of the configuration parameter when you create a task. For example, the following database configuration allows you to specify the database on the command line, using the PackDir parameter for the directory that contains the database and the Pack parameter for the base file name of the database:
[AFPDatabase]
PackDir = $params.packdir
Pack = $params.pack
FxxCacheSize=2
TtxCacheSize=200
Alternatively you can explicitly set these values in the configuration file, and specify a particular database:
[ADVERTS]
PackDir = C:\databases
Pack = adverts
FxxCacheSize=2
TtxCacheSize=200
You must list all Audio Fingerprint database resources in the [Resources] section before you use them. In this list, prefix the resource name with fpdb:.
The speaker identification base pack (sidbase) configuration sections contain details of the base pack that you want to use for speaker identification. This resource contains details of all the speaker identification base files. If you configure a base pack and set the SpkIdBasePack configuration parameter in the speaker identification modules, HPE IDOL Speech Server can automatically find the base files for the speaker identification tasks, and you do not have to specify the base files explicitly.
You must configure the directory and version number for the base pack. For example:
[SIDBASE]
PackDir = SpeakerIdPack
Pack = gen-1.8
In this case, the PackDir is relative to the SpeakerID global directory, which is configured in the SpeakerIDDir configuration parameter. If you have not configured a SpeakerID global directory, the directory is relative to the main server install directory.
You must list the speaker identification base pack resource in the [Resources] section before you use it. In this list, you must prefix the resource name with sidbase:.
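A minimal sketch of the listing, reusing the [SIDBASE] example on this page. The index number is illustrative, and the sketch assumes that the section is named without the sidbase: prefix, in the same way that the Audio Fingerprint database examples omit fpdb:.

```ini
[Resources]
// The sidbase: prefix marks the entry as a speaker identification base pack
0=sidbase:SIDBASE

[SIDBASE]
// Relative to the SpeakerID global directory if SpeakerIDDir is configured,
// otherwise relative to the main server install directory
PackDir = SpeakerIdPack
Pack = gen-1.8
```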