Release Notes

v1.3.1

Features and Enhancements

BYO Knowledge Tool

The BYO Knowledge tool allows enterprises to import private knowledge and use it as a dedicated, searchable knowledge source during question answering. This helps teams provide responses based on internal documents, operational knowledge, and organization-specific context.

Multi-Cluster Support

Multi-cluster support enables users to access information from multiple clusters by cluster name, expanding question-answering capabilities across cluster boundaries. This makes it easier to query and compare resources in different cluster environments.

Token Quota Limits

Token quota limits allow request frequency and token usage to be restricted by user. This helps administrators control costs, manage quotas, and prevent excessive consumption of model resources.
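
As an illustration, the per-user restriction described above can be sketched as a rolling-window check on both request count and token consumption. The window size, limits, and function names below are hypothetical, not the product's actual configuration:

```python
import time
from collections import defaultdict

# Hypothetical per-user quota: limits and window are illustrative values.
WINDOW_SECONDS = 3600
MAX_REQUESTS = 100
MAX_TOKENS = 50_000

_usage = defaultdict(list)  # user_id -> list of (timestamp, tokens_used)

def check_quota(user_id: str, tokens_requested: int) -> bool:
    """Return True if the user may consume tokens_requested more tokens now."""
    now = time.time()
    # Keep only entries still inside the rolling window.
    _usage[user_id] = [(t, n) for t, n in _usage[user_id]
                       if now - t < WINDOW_SECONDS]
    requests = len(_usage[user_id])
    tokens = sum(n for _, n in _usage[user_id])
    if requests >= MAX_REQUESTS or tokens + tokens_requested > MAX_TOKENS:
        return False
    _usage[user_id].append((now, tokens_requested))
    return True
```

A request that would push a user over either limit is rejected before any model call is made, which is what keeps cost bounded.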

History

History support enables users to review previous conversations and question-answering results. This makes it easier to trace context, continue earlier investigations, and troubleshoot issues based on past interactions.

Improvements

  • Optimized the RAG (LangChain) and reranking pipeline to significantly improve answer accuracy and relevance.
  • Upgraded the core AI framework to LangChain 1.0 to stay compatible with the latest features and optimizations.
  • Added routine system check prompts and performed comprehensive code polishing and unit test linting.
  • Separated databases for system knowledge base, user knowledge base, and chat history to improve data isolation and performance.
  • Redesigned the Smart Doc interaction page for a more intuitive and efficient user experience.
  • Upgraded the MCP server, adding support for OAuth authentication and writable tool configurations.
  • Enhanced file upload integration for a smoother knowledge ingestion process.
  • Added support for IDs in custom elements and resolved related data redundancy issues.
  • Implemented a Redis-based rate limiter to enhance system stability and manage API traffic.
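
The Redis-based rate limiter mentioned above commonly follows a fixed-window pattern: increment a per-user counter and set its expiry on the first hit. A minimal sketch, using an in-memory stand-in for the Redis client so it runs without a server (the actual keys, limits, and windows are not specified in these notes):

```python
import time

class FakeRedis:
    """Tiny in-memory stand-in for a Redis client, so this sketch runs
    without a server; a real deployment would use redis-py, which exposes
    the same incr/expire calls."""
    def __init__(self):
        self._data = {}  # key -> (value, expires_at)

    def incr(self, key):
        value, expires_at = self._data.get(key, (0, None))
        if expires_at is not None and time.time() >= expires_at:
            value, expires_at = 0, None  # window elapsed, counter resets
        value += 1
        self._data[key] = (value, expires_at)
        return value

    def expire(self, key, seconds):
        value, _ = self._data.get(key, (0, None))
        self._data[key] = (value, time.time() + seconds)

def allow_request(r, user_id, limit=5, window=60):
    """Fixed-window limiter: allow at most `limit` requests per `window` seconds."""
    key = f"ratelimit:{user_id}"
    count = r.incr(key)
    if count == 1:
        r.expire(key, window)  # start the window on the first request
    return count <= limit
```

Because the counter lives in Redis rather than in process memory, the limit holds across all API replicas, which is what makes this useful for managing traffic to a shared backend.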

Bug Fixes

  • Fixed an issue where model downloading could fail and improved environment variable configuration for embedding models.
  • Resolved data processing errors occurring during the merging and unpacking of update values.
  • Fixed a bug that caused redundant data prefixes in custom elements.
  • Resolved occasional service call failures to improve overall system reliability.

v1.2.1

NOTE: Agent mode is an experimental feature; use it with caution.

Bug Fixes

  • Fixed an issue where setting the knowledge database name might not take effect. The fix adds an option to set the database dump file name during installation and automatically uses the specified dump file to initialize the knowledge base.
  • Fixed an issue where MCP tools could create or delete K8s resources without human confirmation in Agent mode.
  • Fixed an issue where the server could get stuck when asked for disk space information in Agent mode.
  • Fixed an issue where default node taints were not handled when deploying on ACP 4.2 or above.
  • Fixed a deployment error in which the chart constraint kubeVersion: >=1.20.0 was reported as incompatible with Kubernetes v1.33.7-1.
  • Fixed an issue where the API keys for the LLM and rerank services appeared in plain text during deployment.

Improvements

  • Improved the prompt so that Hyperflux reports its identity correctly.
  • Removed unused configuration items from the installation page.

v1.2.0

Features and Enhancements

  • Use the RAG chain by default to answer user questions, improving answer accuracy.
  • Support importing a database dump to initialize the knowledge base, simplifying the setup process.
  • Experimental: Support enabling Agent mode, which leverages MCP tools to retrieve real-time cluster information.
  • Support connecting to a PGVector database deployed outside the Alauda Hyperflux installation.
  • Support the Cohere Reranker model to improve answer relevance.
  • Support setting RAG chain parameters such as total_search_k.
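
The role of a parameter like total_search_k can be illustrated with a toy two-stage retrieval: fetch a wide candidate set first, then let a reranker choose the final few. Only total_search_k appears in the notes; the other names and the function signature below are hypothetical, not the product's actual API:

```python
def retrieve(query, documents, score, rerank, total_search_k=20, final_k=4):
    """Toy two-stage retrieval: take the total_search_k best candidates by the
    cheap `score` function, then let `rerank` pick the final_k answers."""
    candidates = sorted(documents, key=lambda d: score(query, d),
                        reverse=True)[:total_search_k]
    return sorted(candidates, key=lambda d: rerank(query, d),
                  reverse=True)[:final_k]
```

Raising total_search_k gives the reranker more candidates to choose from (better recall) at the cost of more reranking work, which is the trade-off such a parameter lets operators tune.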

Known Issues

  • When the LLM returns an error, answer generation may fail. Returning to view the chat history resends the question to the LLM, causing duplicated conversations.