Identify the Language of Speech

To identify the language of speech

  1. Create a new configuration to send to HPE Media Server with the process action, or open an existing configuration that you want to modify.

  2. In the [Analysis] section, add a new analysis task by setting the AnalysisEngineN parameter. You can give the task any name, for example:

    [Analysis]
    AnalysisEngine0=SpeechLanguageId

  3. Create a new section to contain the settings for the task, and set the following parameters:

    Type
        The analysis engine to use. Set this parameter to LanguageID.

    Input
        (Optional) The audio track to process. If you do not specify an input track, HPE Media Server processes the first audio track produced by the ingest engine.

    LanguageIdServers
        The host name and ACI port of an IDOL Speech Server, separated by a colon (for example, speechserver:15000). To provide servers for failover, specify a comma-separated list. HPE Media Server connects to only one IDOL Speech Server at a time.

        TIP: You can specify a default IDOL Speech Server to use for all language identification tasks by setting the LanguageIdServers parameter in the [Resources] section of the HPE Media Server configuration file.

    LangList
        (Optional) The list of languages to consider when running language identification. If you know which languages are likely to be present in the media, HPE recommends setting this parameter, because restricting the possible languages can increase accuracy and improve performance.

    CumulativeMode
        (Optional, default false) A Boolean that specifies whether to run analysis in cumulative mode.

        If you expect the audio to contain only one language, or you want to identify the primary language that is spoken in the audio, set this parameter to true. HPE Media Server outputs a result to the result track after analyzing each audio segment, but every result is based on analysis of the current segment and all of the previous segments. This mode is not suitable for analyzing continuous streams.

        If you set this parameter to false, HPE Media Server runs analysis in segmented mode. The audio is divided into fixed-size segments and HPE Media Server returns a language identification result for each segment, without considering previous segments. You can use this mode to determine whether multiple languages are present in the audio, but it is not intended to identify the boundary points where the language changes.

    SegmentSize
        (Optional, default 15) The amount of audio to analyze as a single segment, in seconds.

    For example:

    [SpeechLanguageId]
    Type=LanguageID
    LanguageIdServers=speechserver:15000
    CumulativeMode=True
    SegmentSize=30

    For more information about the parameters that you can use to configure this task, refer to the HPE Media Server Reference.

  4. Save and close the configuration file. HPE recommends that you save your configuration files in the location specified by the ConfigDirectory parameter.
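
The example in step 3 uses cumulative mode. As a contrast, the following sketch shows how the same task might be configured for segmented mode with two IDOL Speech Servers listed for failover. The host names speechserver1 and speechserver2 are placeholders, not real servers. The task name SpeechLanguageId is the arbitrary name chosen in step 2, and SegmentSize is written out at its documented default of 15 seconds.

    [Analysis]
    AnalysisEngine0=SpeechLanguageId

    [SpeechLanguageId]
    Type=LanguageID
    // Placeholder host names. HPE Media Server connects to one server at a time.
    LanguageIdServers=speechserver1:15000,speechserver2:15000
    // Segmented mode: one result per segment, with no reference to earlier segments.
    CumulativeMode=False
    SegmentSize=15

If you prefer to set a default server as the tip in step 3 describes, the entry in the HPE Media Server configuration file might look like the following (the host name is again a placeholder):

    [Resources]
    LanguageIdServers=speechserver1:15000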

