Query Manipulation Server

Query Manipulation Server is an ACI server. For details of changes that affect all ACI servers, see ACI Server Framework.

24.3.0

New Features

  • The following improvements have been made to the LLM configuration that you can use to generate embeddings or summaries:

    • You can now generate embeddings and summaries by using a Python script, in the same way as for Lua. To use this option, you set the Type parameter to Python in your embedding configuration, or GenerativePython in your generative model configuration. You must also set Script to the path to the Python script to use. You can also set RequirementsFile to the path to your requirements.txt file.

    • You can now use a specific revision of a model from Hugging Face, by setting the new ModelRevision parameter in your model configuration.

    • You can now use a private model from Hugging Face, by setting the new AccessToken parameter in your model configuration.

    • You can now use only an offline (cached) version of your model, rather than downloading the latest, by setting the new OfflineMode parameter in your model configuration.

    • You can now use an alternative algorithm to generate summaries, by setting the new GreedySearch parameter to False. By default, QMS uses the greedy search algorithm, which uses the token with the highest probability as the next token. When you set GreedySearch to False, it uses a multinomial sampling algorithm to choose a random token based on a probability distribution, which can give better results for long sentences.

    • You can now improve the quality of summaries by excluding final chunks that are too small. When QMS performs summarization, it breaks the input text up into chunks based on your configured ChunkSize. QMS then generates summaries for each chunk and combines the results. When the final chunk is small, it can result in a poor quality summary. You can use the new MinFinalSummaryChunkSize configuration parameter in your generative model configuration to set a minimum chunk size for this final chunk. QMS does not summarize final chunks that are smaller than this size.

    • You can now set a limit on the amount of data to generate embeddings for, by setting the new DataLimit parameter in your model configuration.

    For more information, refer to the QMS Help.
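The difference between the two decoding strategies described above can be sketched as follows. This is an illustration of the general technique, not QMS code; the next_token function and the example probabilities are hypothetical.

```python
import random

# Illustration of the two decoding strategies (not QMS code).
# probs maps each candidate token to its predicted probability.
def next_token(probs, greedy=True, rng=random):
    if greedy:
        # Greedy search: always pick the highest-probability token.
        return max(probs, key=probs.get)
    # Multinomial sampling: draw a token at random, weighted by its
    # probability, which introduces variety into long generations.
    tokens = list(probs)
    weights = list(probs.values())
    return rng.choices(tokens, weights=weights, k=1)[0]

probs = {"cat": 0.6, "dog": 0.3, "fox": 0.1}
print(next_token(probs))                # greedy -> cat
print(next_token(probs, greedy=False))  # sampled, varies between runs
```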

Resolved Issues

  • Using generative models with small input (less than 3 tokens for summarization) caused an error.

  • QMS did not handle the MaxDistance parameter correctly in the vector field configuration.

Notes

  • The QMS documentation listed the AllowedQueryParams configuration parameter, which was a deprecated alias for AllowedQueryParameters. The documentation now reflects the current name for the parameter. AllowedQueryParams is still available as an alias, but configuration validation marks this name as deprecated.

24.2.0

New Features

  • The method for using LLMs to generate embeddings and summaries has been improved.

    Rather than requiring you to create your own model files by using a script, QMS can now download and cache the models directly from Hugging Face. This change means that QMS supports a wider range of models, and no longer requires the Python script or external Python libraries.

    To use LLMs, you must now use the ModelName parameter in your LLM configuration to specify the model to use from Hugging Face. You can also optionally set CacheDirectory to the location that QMS must use to store the cached model files.

    For information about the models that you can use to generate embeddings or summaries, refer to the QMS Help.

    IMPORTANT: As part of this change, the ModelPath and TokenizerPath parameters have been removed, and are no longer supported.
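    As an illustration, a configuration using the new parameters might look like the following. The [MyEmbeddingModel] section name, the model name, and the directory path are examples only; refer to the QMS Help for supported models and the full parameter reference.

```
[MyEmbeddingModel]
ModelName=sentence-transformers/all-MiniLM-L6-v2
CacheDirectory=./modelcache
```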

  • You can now use QMS to convert a text query into a vector query. To use this option, you must configure the new [VectorFields] section in QMS to define the IDOL Content component fields that contain the vectors that you want to search. You can then use the new VectorConfig parameter on your queries to convert query text into a vector search. You can optionally also include the QueryType parameter to send both a conceptual text query and a vector query simultaneously. For example:

    action=Query&Text=The quick brown fox jumps over the lazy dog&VectorConfig=MyVectorA&QueryType=conceptual,vector

    QMS converts the query text to an embedding and sends a query of the following form to Content:

    (The quick brown fox jumps over the lazy dog) OR VECTOR{embeddingdata}:VECTORA

    You can also query multiple vector fields by using a comma-separated list. For example:

    action=Query&Text=cat&VectorConfig=MyVectorA,MyVectorB

    QMS generates a vector query clause of the following form:

    VECTOR{embeddingdataA}:VECTORA OR VECTOR{embeddingdataB}:VECTORB

    For more information, refer to the QMS Help.
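    To illustrate the shape of the configuration, a [VectorFields] section that defines the two configurations used in the examples above might look like the following. Only the section layout is implied by the examples; the parameters inside each named section are assumptions, so refer to the QMS Help for the actual parameter names.

```
[VectorFields]
0=MyVectorA
1=MyVectorB

[MyVectorA]
# Parameters that map this configuration to the Content VECTORA
# field and an embedding model; see the QMS Help for the names.

[MyVectorB]
# As above, for the VECTORB field.
```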

  • You can now control the precision of the embeddings that QMS generates, by using the EmbeddingPrecision parameter in the configuration section for your embedding system. This parameter sets the number of decimal places to use in the embedding values.
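    As an illustration of the effect (this is not QMS code), reducing a vector to three decimal places:

```python
# Illustration only: limit embedding values to a fixed number of
# decimal places, as the EmbeddingPrecision setting does in QMS.
def apply_precision(embedding, precision):
    return [round(value, precision) for value in embedding]

embedding = [0.123456, -0.987654, 0.5]
print(apply_precision(embedding, 3))  # -> [0.123, -0.988, 0.5]
```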

  • You can now configure embedding models and generative LLMs to use a CUDA-compatible GPU device, by setting the new Device configuration parameter in the model configuration.

Resolved Issues

There were no resolved issues in this release.

24.1.0

New Features

  • The new ModelSummarize action has been added. This action allows you to use a third-party vector model to summarize a document or set of documents. This option requires you to configure a generative model in the [Generative] configuration section. For more information, refer to the QMS Help.
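    By analogy with the ModelEncode action, a call might take the following form. The Text parameter here is an assumption for illustration only; refer to the QMS Help for the actual action parameters:

```
action=ModelSummarize&Text=my%20document%20text
```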

  • The vector embeddings that the vector generator returns now include offset information. Each embedding has the start and end byte offsets of the chunk of text that corresponds to the embedding, and the length of the chunk, in bytes.

    This change is always enabled for the SentenceTransformer module. For the Lua module, you can choose whether to return the offset information from the generateembeddings Lua function. To return offsets, the generateembeddings function must return two tables; the first contains the embeddings, and the second contains the offsets.

    The ModelEncode action returns the offset information as attributes of the vector in the response.

  • The new configuration parameters ModelSequenceOverlap and ModelMinimumFinalSequenceLength have been added to allow you to control how QMS splits the text used to generate embeddings when multiple embeddings are required. For more information, refer to the QMS Help.
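    The interaction of these parameters can be sketched as follows. This is an illustrative model, not QMS code: max_length, overlap, and min_final mirror the ModelMaxSequenceLength, ModelSequenceOverlap, and ModelMinimumFinalSequenceLength parameters, but the actual QMS behavior may differ.

```python
# Illustrative sketch (not QMS code) of overlapping sequence splitting.
# Each chunk holds up to max_length tokens and starts overlap tokens
# before the end of the previous chunk.
def split_sequences(tokens, max_length, overlap, min_final):
    step = max_length - overlap
    chunks = [tokens[i:i + max_length] for i in range(0, len(tokens), step)]
    # Drop a trailing chunk that is shorter than the configured minimum;
    # in this sketch its tokens are assumed covered by the overlap.
    if len(chunks) > 1 and len(chunks[-1]) < min_final:
        chunks.pop()
    return chunks

print(split_sequences(list(range(10)), max_length=4, overlap=1, min_final=2))
# -> [[0, 1, 2, 3], [3, 4, 5, 6], [6, 7, 8, 9]]
```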

Resolved Issues

  • When processing query texts with synonym rules (that is, setting ExpandQuery to True), QMS removed terms from the query text that were followed by a colon.

23.4.0

New Features

  • QMS can now provide a spell-checked version of the original query text, rather than the expanded query, when you send a query with the ExpandQuery and SpellCheck parameters set to True. Previously, QMS returned a corrected version of the expanded query.

    To use this feature, you must set the SpellCheckShowOriginal parameter to True in the Content component configuration.

  • You can now configure QMS to create embeddings to use in your IDOL Content component index for vector searches. The new [Embeddings] configuration section allows you to set the location of your model files. You then generate the embeddings by using the new ModelEncode action.

    [Embeddings]
    0=SentenceTransformer

    [SentenceTransformer]
    Type=Transformer
    ModelPath=C:\modelfiles\model.pt
    TokenizerPath=C:\modelfiles\tokenizer.spiece.model
    ModelMaxSequenceLength=128

    action=ModelEncode&Model=SentenceTransformer&Text=my%20text

    For more information, refer to the QMS Help.

Resolved Issues

There were no resolved issues in this release.

23.3.0

New in this Release

There were no new features in this release.

Resolved Issues

  • When using QMS through IDOL Admin, the ExpandNames parameter did not work, because the responseformat=json parameter was incorrectly passed from IDOL Admin through to the name variant library.

  • When IntentRankedQuery was activated, if Content did not return any results, QMS incorrectly returned a BADPARAMETER error.

23.2.0

New in this Release

  • QMS can now expand names in query text to include other variants of the same name, such as variants that use initials or titles, nicknames, phonetically similar names, and translations.

    NOTE: This feature requires the IDOL Eduction combined_names.ecr grammar and pii_postprocessing.lua script, which you must obtain from the IDOL Eduction Grammars package.

    To use the new expansions, you must configure the locations of the required Eduction grammar, and some data files that are included in your QMS package.

    [NameVariants]
    GrammarDirectory=grammars/pii
    DataDirectory=staticdata

    You can then send a query with the new ExpandNames parameter to expand any names in the original query with the matching variants. For example:

    action=Query&Text=John Smith&ExpandNames=True

    For more information, refer to the Query Manipulation Server Help.

Resolved Issues

  • When queries were sent through QMS, the <autn:predicted> tag (which indicates whether the reported totalresults value is estimated or exact) always returned the value false, regardless of whether prediction was used.