Answer Server

Answer Server is an ACI server. For details of changes that affect all ACI servers, see ACI Server Framework.

24.2.0

New Features

  • The method for using LLMs to generate embeddings and to perform generative and extractive question answering has been improved.

    Rather than requiring you to create your own model files by using a script, Answer Server can now download and cache the models directly from Hugging Face. This change means that Answer Server supports a wider range of models, and no longer requires the Python script or external Python libraries.

    To use LLMs, you must now set the ModelName parameter in your LLM configuration to specify the model to use from Hugging Face. You can optionally set CacheDirectory to the location where Answer Server stores the cached model files.
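
    For example, a minimal LLM configuration section might now look like the following sketch (the model name and cache location are illustrative placeholders, not defaults):

    [LLMExtractiveQuestionAnswering-Small]
    Type=ExtractiveQuestionAnsweringLLM
    // Model to download and cache from Hugging Face (illustrative name)
    ModelName=example-org/example-extractive-qa-model
    // Optional location for the cached model files (illustrative path)
    CacheDirectory=./modelcache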

    For information about the models that you can use for different configurations, refer to the Answer Server Help.

    IMPORTANT: As part of this change, the ModelPath and TokenizerPath parameters have been removed, and are no longer supported.

  • You can now control the precision of the embeddings that Answer Server generates to run vector queries, by using the EmbeddingPrecision parameter in the configuration section for your embedding system. This parameter sets the number of decimal places to use in the embedding values.
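
    For example, the following sketch adds the parameter to an embedding system configuration (the section name, type, and model name follow the examples elsewhere in these notes and are illustrative):

    [EmbeddingsSystem]
    Type=Transformer
    ModelName=example-org/example-embedding-model
    // Keep four decimal places in each embedding value (illustrative value)
    EmbeddingPrecision=4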

  • You can now configure embedding models and generative LLMs to use a CUDA-compatible GPU device, by setting the new Device configuration parameter in the model configuration.
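
    For example, the following sketch runs the same kind of embedding model on a GPU (the device value shown is illustrative; refer to the Answer Server Help for the values that your version supports):

    [EmbeddingsSystem]
    Type=Transformer
    ModelName=example-org/example-embedding-model
    // Run the model on a CUDA-compatible GPU rather than the CPU (illustrative value)
    Device=cuda

    You can add the same parameter to a generative LLM configuration section in the same way.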

Resolved Issues

There were no resolved issues in this release.

24.1.0

New Features

  • You can now configure Passage Extractor systems to use an LLM to extract or generate answers. To configure these systems, you set the Type parameter to PassageExtractorLLM. As for a standard passage extractor, you must configure the location of an IDOL index to use to find answers, and classifier files that describe the different types of question.

    For an LLM passage extractor, you must also configure the location of the model and tokenizer files that the LLM uses to generate or extract answers.

    You can also use these models in a Lua script, for example so that you can access an LLM through an HTTP endpoint.

    For example:

    [passageextractorLLM]
    // Data store IDOL
    IdolHost=localhost
    IdolAciport=6002
    Type=PassageExtractorLLM
    // Classifier Files
    ClassifierFile=./passageextractor/classifiertraining/svm_en.dat
    LabelFile=./passageextractor/classifiertraining/labels_en.dat
    // Module to use
    ModuleID=LLMExtractiveQuestionAnswering-Small
    
    [LLMExtractiveQuestionAnswering-Small]
    Type=ExtractiveQuestionAnsweringLLM
    ModelPath=modelfiles/model.pt
    TokenizerPath=modelfiles/tokenizer.spiece.model

    For more information, refer to the Answer Server Help.

  • When you use a Passage Extractor LLM system, the Ask action returns a highlighted paragraph in the response metadata to show the passage that the answer was extracted from, so that you can verify automatically generated answers.

  • You can now configure a Passage Extractor or Passage Extractor LLM system to run vector queries against the IDOL Content component to identify candidate documents that might contain answers to an input question. You can use this option when you index vectors in your IDOL Content component and want to use vector search to retrieve answers.

    To use this option, you must set the AnswerCandidateEmbeddingsSettings parameter in your system configuration section to the name of a configuration section where you configure the Content vector field and the embeddings configuration that specifies how to generate the embeddings to send to Content. For example:

    [PassageExtractorSystem]
    idolhost=localhost
    idolaciport=6002
    type=passageextractor
    ...
    AnswerCandidateEmbeddingsSettings=VectorSettings
    
    [VectorSettings]
    EmbeddingsConfig=EmbeddingsSystem
    VectorField=VECTORA
    
    [EmbeddingsSystem]
    Type=Transformer
    ModelPath=path/to/model.pt
    TokenizerPath=path/to/tokenizer.spiece.model
    ModelMaxSequenceLength=128

    For more information, refer to the Answer Server Help.

Resolved Issues

There were no resolved issues in this release.

23.4.0

There were no new features or resolved issues in this release.

23.3.0

There were no new features or resolved issues in this release.

23.2.0

There were no new features or resolved issues in this release.