DataObfuscation

The DataObfuscation task processes training data (audio files and labels) so that any sensitive or private information is hidden. You can then use the obfuscated data for model training.

Parameters

| Parameter | Description | Required |
|---|---|---|
| Type | The task name. Set to DataObfuscation. | Yes |
| Am | The acoustic model to use for processing. | Yes |
| AudioAnalysisDir | The location of the audio analysis output directory. | Yes |
| BeamStep | The amount to increase the beam value by on a pass failure, before attempting another pass. | No |
| DataList | A list of the files to use for processing. | Yes |
| Diag | Whether to generate diagnostic information. | No |
| DiagFile | The file to write the diagnostic information to. | No |
| DnnFile | The DNN acoustic model file to use for processing. | Yes |
| MaxBeam | The maximum beam value at which to attempt the adaptation pass. | No |
| MaxOtdLen | The required maximum length of each obfuscated training data file, in seconds. | No |
| MinBeam | The minimum beam value at which to attempt the adaptation pass. | No |
| OtdPath | The directory to write the obfuscated training data files to. | Yes |
| OutLabExt | The label file extension. | No |
| OutLabPath | The directory to write label files to. By default, HPE IDOL Speech Server writes the files to the configured temp directory. | No |
| Pgf | The pronunciation generation (.pgf) file included in the language pack. | Yes |
| PlhExt | The file extension of the input audio feature files. | No |
| PlhPath | The path to the directory containing the acoustic feature (.plh) files specified in the DataList. | Yes |
| RandOtd | Whether to randomize obfuscated training data. | No |
| TxtExt | The file extension of the input transcription files. | No |
| TxtPath | The path to the directory containing the transcript (.ctm) files specified in the DataList. | Yes |
| WriteOutLabs | Whether to create label files. | No |
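
The Required column above can be checked on the client side before a task is submitted. The following is a minimal sketch in Python, not part of HPE IDOL Speech Server: the helper name `missing_required` is hypothetical, and the set of required names is copied from the parameter list above.

```python
# Parameters marked "Yes" in the Required column of the table above.
REQUIRED_PARAMETERS = {
    "Type", "Am", "AudioAnalysisDir", "DataList", "DnnFile",
    "OtdPath", "Pgf", "PlhPath", "TxtPath",
}


def missing_required(params):
    """Return the required DataObfuscation parameters absent from params.

    Hypothetical client-side helper; the server performs its own validation.
    """
    return sorted(REQUIRED_PARAMETERS - set(params))


if __name__ == "__main__":
    partial = {"Type": "DataObfuscation", "Am": "ver-ENUK-tel-6.2-8k.am"}
    print(missing_required(partial))
```

An empty list means that every required parameter is present. The check covers only the presence of parameter names, not whether the referenced files or directories exist.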

Example

http://localhost:13000/action=AddTask&Type=DataObfuscation&Am=ver-ENUK-tel-6.2-8k.am&Pgf=ver-ENUK-tel-6.2.pgf&DnnFile=ver-ENUK-tel-6.2-8k.dnn&DataList=ListManager/ObfuscList&PlhPath=T:\data\PLH&TxtPath=T:\data\ObfuscTranscripts&OtdPath=T:\ObfuscTraining&AudioAnalysisDir=T:\AudioAnalysis

This action uses port 13000 to instruct HPE IDOL Speech Server, which is located on the local machine, to produce obfuscated training data (.otd) files from the set of data files specified in the DataList parameter. It uses the ver-ENUK-tel-6.2-8k.am acoustic model file, the ver-ENUK-tel-6.2-8k.dnn DNN acoustic model file, and the ver-ENUK-tel-6.2.pgf pronunciation generation file. HPE IDOL Speech Server writes the obfuscated training data files to the T:\ObfuscTraining directory.
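
The example action can also be assembled programmatically from a dictionary of parameters. The sketch below is illustrative only: the function name `build_add_task_url` is hypothetical, and it uses Python's standard `urllib.parse.quote` with `:`, `/`, and `\` left unescaped so that the result matches the example URL. Depending on the HTTP client you use, you might need to percent-encode those characters instead.

```python
from urllib.parse import quote

# Characters left unescaped so Windows paths appear as in the example URL.
SAFE_CHARS = ":/\\"


def build_add_task_url(host, port, params):
    """Build an AddTask action URL in the form shown in the example.

    Hypothetical helper: parameter order follows the dictionary order.
    """
    query = "&".join(
        f"{name}={quote(str(value), safe=SAFE_CHARS)}"
        for name, value in params.items()
    )
    return f"http://{host}:{port}/action=AddTask&{query}"


if __name__ == "__main__":
    params = {
        "Type": "DataObfuscation",
        "Am": "ver-ENUK-tel-6.2-8k.am",
        "Pgf": "ver-ENUK-tel-6.2.pgf",
        "DnnFile": "ver-ENUK-tel-6.2-8k.dnn",
        "DataList": "ListManager/ObfuscList",
        "PlhPath": r"T:\data\PLH",
        "TxtPath": r"T:\data\ObfuscTranscripts",
        "OtdPath": r"T:\ObfuscTraining",
        "AudioAnalysisDir": r"T:\AudioAnalysis",
    }
    print(build_add_task_url("localhost", 13000, params))
```

Keeping the parameters in a dictionary makes it straightforward to add optional settings such as MaxOtdLen or RandOtd without editing a long URL by hand.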

