Knowledge
Introduction
The Knowledge module enhances agents by integrating information from external sources, such as product catalogs, research reports, user manuals, and documents provided by your users. Serenity* Star automatically analyzes and segments these documents, creates and stores embeddings, and uses vector search to retrieve relevant content to answer user queries.
You can add knowledge to your agent using two distinct sources: files and websites. Whether you are incorporating detailed documents or specific online content, Serenity* Star processes these sources to enrich your agent's ability to provide accurate and contextually relevant responses.
How to Use
In this example, we'll enhance an existing Assistant Agent with knowledge so that it can effectively answer questions about the 2024 Olympic Games. If you need guidance on creating an Assistant Agent, refer to this guide.
Add Knowledge
- Navigate to the Knowledge tab.
- Click the "Add Knowledge" button.
- Select your knowledge source. You can choose Websites or Files.
Using Websites as Knowledge Source
To use websites as a knowledge source, click the "Add Knowledge" button in the Knowledge tab and select the Websites option. If you need further guidance, follow the instructions described above.
From this section, you can add one or multiple URLs at once. After adding the URLs, click the confirm button.
In a few moments, the websites will appear in the grid and start processing. Just wait a few seconds for the processing to finish, and the new knowledge will be available for the agent.
File Downloading
When a website is processed, any downloadable files found will be automatically retrieved and added as new knowledge sources. This ensures that all relevant content, including PDFs, documents, and other downloadable resources, is incorporated into the agent's knowledge base, enhancing the quality of its responses.
Tips
- Use Accessible URLs: Make sure the website URL you provide is accessible from the internet, not restricted to private or intranet networks
- Select Focused Sites: Choose websites with focused content, such as blogs, articles, or company websites, rather than large, complex sites like public forums or social media platforms
- Verify Content Quality: Ensure the website contains high-quality, relevant information that aligns with the knowledge you want to extract
- Avoid Dynamic Content: Avoid sites with rapidly changing content, such as news sites or live feeds, to maintain consistency in the extracted knowledge
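As a rough illustration of the "Use Accessible URLs" tip, the sketch below pre-checks a URL before you add it. It only inspects the URL text and does not contact the site or resolve DNS. The function name `looks_publicly_reachable` is hypothetical and is not part of Serenity* Star.

```python
import ipaddress
from urllib.parse import urlparse


def looks_publicly_reachable(url: str) -> bool:
    """Heuristic check: True if the URL does not obviously point to a private network."""
    parsed = urlparse(url)
    # Only http(s) URLs with a host name are considered.
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    host = parsed.hostname
    # Common names that only resolve on a local machine or local network.
    if host == "localhost" or host.endswith(".local"):
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal, so it is a domain name; resolving it is out of scope here.
        return True
    # Reject loopback, private, and link-local addresses.
    return not (ip.is_private or ip.is_loopback or ip.is_link_local)


for candidate in ("https://example.com/blog", "http://192.168.1.10/docs", "http://localhost:8080"):
    print(candidate, looks_publicly_reachable(candidate))
```

A passing check does not guarantee the site can be processed; it only filters out addresses that are clearly internal.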
Using Files as Knowledge Source
To use files as a knowledge source, click the "Add Knowledge" button in the Knowledge tab and select the Files option. If you need further guidance, follow the instructions described above.
From this section, you can upload one or multiple files at once. After uploading the files, click the confirm button.
In a few moments, the files will appear in the grid and start processing. Just wait a few seconds for the processing to complete, and the new knowledge will be available for the agent.
Wait for Processing
After you have uploaded files or added URLs and saved the changes, Serenity* Star will automatically begin processing the data. The processing involves the following steps:
- 🔍 Extract content.
- 📂 Files: Scanning the content of the uploaded files.
- 🌐 Websites: Scraping the content from the provided URLs.
- ✂️ Segment the content into sections.
- ⚙️ Generate the embeddings.
You can track the status of each knowledge source from the "File Status" grid in the Knowledge tab. Once processing is complete, the status will change to Available, indicating that the knowledge is ready for the agent to use.
Sections
A "section" is a discrete part of content extracted from a knowledge source. Segmenting content into smaller sections is crucial for optimizing token usage, as it helps the agent stay within token limits while maintaining response accuracy. When a knowledge source is processed, Serenity* Star automatically divides it into sections based on content structure, such as paragraphs or logical groupings. This enables the agent to find and use the most relevant information, delivering precise, contextually accurate answers while making efficient use of resources.
Custom Segmentation
The Custom Segmentation feature allows you to define how content is divided into sections, offering greater flexibility in how knowledge is processed from file sources. By using a custom delimiter, you can control the segmentation of the content into more meaningful and contextually relevant chunks. To use this feature, insert the following delimiter in the content where you want to define the boundaries of each section:
%%%%%%%----////////\\\\----%%%%%%%
This delimiter indicates where Serenity* Star should break the content into sections. It’s especially useful for long or complex documents where the default segmentation may not be sufficient. Place the delimiter at logical points in the text, such as between topics or sections, to help the agent process the knowledge effectively. Custom segmentation ensures that the extracted knowledge is relevant and precise, improving the agent’s ability to deliver accurate information.
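If you assemble documents programmatically, a small helper can insert the delimiter for you before upload. The sketch below is a minimal example: the delimiter string is copied from this page, while the helper names and the choice to put the delimiter on its own line are assumptions. `preview_sections` only approximates the split locally and is not the platform's actual segmentation.

```python
# Delimiter copied verbatim from the documentation (raw string keeps the four backslashes).
DELIMITER = r"%%%%%%%----////////\\\\----%%%%%%%"


def join_with_delimiter(sections):
    """Join pre-written sections, placing the delimiter between them (hypothetical helper)."""
    cleaned = [s.strip() for s in sections if s.strip()]
    return f"\n{DELIMITER}\n".join(cleaned)


def preview_sections(document):
    """Split on the delimiter to preview the sections you expect to obtain."""
    return [part.strip() for part in document.split(DELIMITER) if part.strip()]


document = join_with_delimiter([
    "Opening ceremony: date, venue, and schedule.",
    "Medal table: totals by country.",
])
print(document)
print(preview_sections(document))
```

Writing the result of `join_with_delimiter` to a supported file type (for example `.txt` or `.md`) gives you a file with explicit section boundaries.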
Text Overlap
The Text Overlap feature allows you to define how much content is shared between adjacent sections, improving the accuracy and relevance of the information retrieved by the agent. This is achieved by including overlapping content at the boundaries of each section, ensuring that no important context is lost during segmentation.
Text overlap is divided into three parts:
- Pre-text: Content that appears before the section, providing context from the previous section.
- Text: The core content of the section, representing the primary information.
- Post-text: Content that appears after the section, offering context for the following section.
By including pre-text and post-text in each section, you ensure that each segment retains enough context to maintain the flow and coherence of the knowledge, which helps the agent deliver more accurate and contextually relevant answers.
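The three parts can be pictured with a small sketch. It assumes overlap is measured in characters and taken from the immediately neighboring chunk; the `Section` class, the `with_overlap` function, and the `overlap_chars` parameter are hypothetical names used for illustration, not the way Serenity* Star computes overlap internally.

```python
from dataclasses import dataclass


@dataclass
class Section:
    pre_text: str   # tail of the previous chunk
    text: str       # core content of this section
    post_text: str  # head of the next chunk


def with_overlap(chunks, overlap_chars=40):
    """Attach pre-text and post-text taken from the neighboring chunks."""
    sections = []
    for i, chunk in enumerate(chunks):
        if i > 0:
            prev = chunks[i - 1]
            # Slice from an explicit index so overlap_chars=0 yields an empty string.
            pre = prev[max(len(prev) - overlap_chars, 0):]
        else:
            pre = ""
        post = chunks[i + 1][:overlap_chars] if i + 1 < len(chunks) else ""
        sections.append(Section(pre, chunk, post))
    return sections


for section in with_overlap(["abcdef", "ghijkl", "mnopqr"], overlap_chars=2):
    print(section)
```

The first section has no pre-text and the last has no post-text, since there is no neighbor to borrow context from.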
Manage Sections
Once the knowledge source has been successfully processed and is in the Available state, the "Show Sections" button will become visible. You can use this feature to view, edit, or delete specific sections of your knowledge source to correct or remove any unnecessary information.
In the section editor, you can modify not only the core content of each section (text) but also the pre-text and post-text surrounding each section. This allows you to fine-tune the context and relevance of the knowledge, ensuring the agent retrieves the most accurate and contextually relevant information.
Click the "Show Sections" button on the row to access the knowledge editor.
In the editor, all sections of the knowledge source will be listed, allowing you to view, delete, or modify them.
You can visualize and edit the pre-text, text, and post-text overlaps between adjacent sections.
Any sections you modify, as well as the knowledge source they belong to, will be flagged to track changes.
Reprocess
If a website you've added as a knowledge source has been updated and you want these changes to be reflected in the agent’s responses, or if you’ve made modifications to the sections of a website or file and wish to restore the original content, you can reprocess the knowledge source.
You can reprocess a knowledge source in two ways:
- Immediate Reprocessing: For websites and modified files, you can reprocess the knowledge source right away.
- Scheduled Reprocessing: You can schedule periodic reprocessing for websites, ensuring that the knowledge stays up-to-date automatically without manual intervention.
The reprocess button will only be visible when the knowledge source meets one of the following conditions:
- It's a website and it's currently in the available state.
- It's a file and its sections have been modified.
- It's a website or file and it's currently in the error state.
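The visibility conditions above can be restated as a small predicate. This is only a reading of the rules on this page: the function name, the argument names, and the lowercase state strings are assumptions, not values exposed by the platform.

```python
def can_reprocess(kind: str, state: str, sections_modified: bool = False) -> bool:
    """Return True when the reprocess button would be visible, per the listed conditions."""
    if kind not in ("website", "file"):
        return False
    # A website or file in the error state can always be reprocessed.
    if state == "error":
        return True
    # A website in the available state can be reprocessed.
    if kind == "website":
        return state == "available"
    # A file can be reprocessed once its sections have been modified.
    return sections_modified


print(can_reprocess("website", "available"))                    # True
print(can_reprocess("file", "available"))                       # False
print(can_reprocess("file", "available", sections_modified=True))  # True
```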
To reprocess a knowledge source, click the dropdown button next to the file or website you wish to reprocess. From the dropdown menu, select the "Reprocess" option. This will prompt you to confirm the reprocessing.
Configure Reprocessing
You can configure the automatic reprocessing of websites to ensure that the agent's knowledge stays up-to-date. To do this:
- Select the website from the list of knowledge sources.
- Click the Configure reprocessing button next to the website.
- In the reprocess side panel, click the Schedule Reprocessing option.
- Configure the reprocessing:
  (1) Active: Toggle whether the scheduled reprocessing is active or inactive.
  (2) Base Agent Version: Choose the agent version that will serve as the basis for generating a new version with the reprocessed website content.
  (3) Publish: Decide whether the newly generated agent version should be published immediately.
  (4) Frequency: Set the frequency for reprocessing the website.
- Add the desired frequency for reprocessing (e.g., daily, weekly, or monthly).
- Save your scheduled reprocessing configuration.
This will schedule the reprocessing of the selected website at regular intervals, ensuring that the agent always has access to the most current content without manual intervention.
Versioning
A knowledge source is always linked to an agent version. Therefore, a draft version of the agent may be automatically created if one doesn't already exist. This occurs in the following scenarios:
- When a new knowledge source is added
- When a knowledge source is reprocessed
- When a section of a knowledge source is edited or deleted
If changes are not saved, they will remain only in the automatically generated draft version of the agent.
Delete
To delete a knowledge source, click the dropdown button next to the file or website you wish to delete. From the dropdown menu, select the "Delete" option. This action will prompt you to confirm the deletion.
Once you confirm, the deletion process begins. Serenity* Star first deletes the generated embeddings associated with the file or website, then deletes the knowledge source itself from the system. The confirmation prompt gives you a chance to review your decision before the knowledge source and its associated data are permanently removed.
Test your agent with the new knowledge
Use the Agent Designer's preview pane to interact with the agent and check that it is using the knowledge correctly.
Advanced Settings
Within the advanced settings section, you can modify the following parameters.
- Embedding Model: Model used to generate the vectors. You must have previously configured the API Key of the AI model vendor to use it.
- Relevance: A decimal value between 0 and 1 that defines how similar the results should be when the agent searches for relevant information.
- Limit Sections to Retrieve: An integer value between 1 and 20 that sets the maximum number of sections that can be obtained from the files.
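To see how Relevance and Limit Sections to Retrieve interact, consider the sketch below. It assumes cosine similarity as the scoring function and uses tiny hand-written vectors; this page does not state which metric Serenity* Star uses, and the `retrieve` function and its parameter names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, sections, relevance=0.5, limit=5):
    """Keep sections scoring at least `relevance`, best first, capped at `limit`."""
    if not 0 <= relevance <= 1:
        raise ValueError("relevance must be between 0 and 1")
    if not 1 <= limit <= 20:
        raise ValueError("limit must be between 1 and 20")
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in sections]
    kept = [pair for pair in scored if pair[0] >= relevance]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in kept[:limit]]


# Toy (text, embedding) pairs; real embeddings have many more dimensions.
sections = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [1.0, 1.0])]
print(retrieve([1.0, 0.0], sections, relevance=0.5, limit=2))
```

Raising the relevance value narrows the results to closer matches, while the limit caps how many sections are passed to the agent regardless of how many pass the threshold.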
Supported Files
FILE FORMAT | MIME TYPE |
---|---|
.txt | text/plain |
.pdf | application/pdf |
.doc | application/msword |
.docx | application/vnd.openxmlformats-officedocument.wordprocessingml.document |
.csv | text/csv |
.md | text/markdown |
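If you upload files from a script, you may want to filter out unsupported formats first. The sketch below mirrors the table above as a lookup; the `.pdf` extension is assumed for the `application/pdf` row, and the names `SUPPORTED_TYPES` and `mime_type_for` are hypothetical.

```python
from pathlib import Path

# Extension-to-MIME mapping taken from the Supported Files table.
SUPPORTED_TYPES = {
    ".txt": "text/plain",
    ".pdf": "application/pdf",  # extension assumed for the application/pdf row
    ".doc": "application/msword",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".csv": "text/csv",
    ".md": "text/markdown",
}


def mime_type_for(filename: str):
    """Return the MIME type for a supported file name, or None if unsupported."""
    return SUPPORTED_TYPES.get(Path(filename).suffix.lower())


for name in ("Report.PDF", "notes.md", "diagram.png"):
    print(name, mime_type_for(name))
```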