Knowledge Base-bot Principles

What is RAG?

The set of technologies behind knowledge bases is known in academia as RAG, short for Retrieval-Augmented Generation. It is a method that combines an information retrieval mechanism with an AI model.

RAG is also known colloquially as an external knowledge base. "External" means the knowledge is stored separately from the large AI model. If the knowledge were built into the model itself, that would be model fine-tuning, which is a different matter altogether.

 

The Principle of RAG

The principle of RAG is quite simple: instead of generating an answer directly, the model first consults a knowledge base and then generates a response. In layman's terms, it's the difference between an open-book exam (RAG) and a closed-book exam (direct generation).

From a procedural standpoint, most RAG systems can be divided into three steps:

  1. Creating the Knowledge Base
    1. Input data (upload files)
    2. Process data (slicing + vectorization)
    3. Store data (save to vector database)
  2. Querying the Knowledge Base
    1. Search data (vector retrieval)
    2. Process data (reranking)
    3. Output data (select highly relevant information)
  3. AI Generating Answers

In simple terms, RAG breaks a huge amount of data into many small chunks. When answering a question, the AI searches for highly relevant chunks and formulates its response from their content. The advantage of this approach is that the AI only needs to handle a small amount of data rather than the whole corpus, which improves speed and reduces cost.

However, the disadvantage is also obvious: information can be taken out of context. Because the AI receives only part of the data and cannot see the whole picture, this is an inherent weakness of RAG.
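The three steps can be sketched in a few lines of Python. This is a minimal illustration, not how 302.AI implements it: the bag-of-words `embed` function stands in for a real embedding model, a plain list stands in for the vector database, sentence splitting stands in for the slicing settings, and every function name here is hypothetical.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def build_knowledge_base(documents):
    """Step 1: slice each document into sentence chunks, vectorize, and store."""
    store = []  # assumption: a list plays the role of the vector database
    for doc in documents:
        for chunk in re.split(r"(?<=[.!?])\s+", doc.strip()):
            if chunk:
                store.append((chunk, embed(chunk)))
    return store

def query_knowledge_base(store, question, top_k=2):
    """Step 2: return the top_k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]

def build_prompt(question, chunks):
    """Step 3: hand only the retrieved chunks to the model."""
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

docs = [
    "RAG retrieves chunks before answering. Fine-tuning changes the model weights.",
    "A vector database stores one embedding per chunk.",
]
question = "What does a vector database store?"
store = build_knowledge_base(docs)
top_chunks = query_knowledge_base(store, question)
prompt = build_prompt(question, top_chunks)
print(prompt)
```

The model only ever sees the text in `prompt`, so it handles two chunks instead of the whole document set. That is where both the speed and cost savings and the risk of missing context come from.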

 

The Key to RAG

The essence of RAG is selective extraction: how the data is segmented and selected determines the quality of the answer. In the whole process, the AI model actually plays a minor role. The key lies in the first and second steps, which are data processing and data retrieval.

The key to data processing is how to slice the content. 302.AI provides a rich set of slicing settings, which are described in detail in this article.

The key to data retrieval is how to find highly relevant content. 302.AI adopts a dual retrieval mechanism: vector retrieval first performs a coarse search, a reranking algorithm then performs a fine search over those results, and only the final selection is passed to the large model, which greatly improves accuracy.
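To make the idea of slicing concrete, here is a minimal sketch of one common strategy: fixed-size chunks that overlap, so a sentence cut at a boundary still appears whole in a neighboring chunk. The function name and the `chunk_size` and `overlap` parameters are illustrative assumptions and do not correspond to 302.AI's actual settings.

```python
def slice_text(text, chunk_size=40, overlap=10):
    """Split text into character windows of chunk_size that overlap by `overlap`."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap  # how far the window moves each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks

sample = "abcdefghij" * 10  # 100 characters
pieces = slice_text(sample, chunk_size=40, overlap=10)
print(len(pieces), [len(p) for p in pieces])
```

Larger chunks keep more context together but make retrieval less precise, while smaller chunks do the opposite. The overlap is one way to soften the out-of-context problem at chunk boundaries.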
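The coarse-then-fine idea can be sketched as follows. This is an illustration under stated assumptions, not 302.AI's algorithm: the coarse stage uses bag-of-words cosine similarity in place of real vector retrieval, and the fine stage uses a toy word-order score in place of a reranking model (in practice often a cross-encoder). All names are hypothetical.

```python
import math
import re
from collections import Counter

def tokens(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def coarse_score(query, chunk):
    """Stage 1 stand-in: cosine similarity between bag-of-words vectors."""
    q, c = Counter(tokens(query)), Counter(tokens(chunk))
    dot = sum(q[t] * c[t] for t in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(sum(v * v for v in c.values()))
    return dot / norm if norm else 0.0

def fine_score(query, chunk):
    """Stage 2 stand-in: share of adjacent query word pairs that appear in the chunk."""
    q, c = tokens(query), tokens(chunk)
    query_pairs = set(zip(q, q[1:]))
    chunk_pairs = set(zip(c, c[1:]))
    return len(query_pairs & chunk_pairs) / len(query_pairs) if query_pairs else 0.0

def retrieve(query, chunks, coarse_k=3, final_k=1):
    # Coarse retrieval: cheap score over every chunk, keep a shortlist.
    candidates = sorted(chunks, key=lambda ch: coarse_score(query, ch), reverse=True)[:coarse_k]
    # Fine retrieval: more careful score over the shortlist only.
    reranked = sorted(candidates, key=lambda ch: fine_score(query, ch), reverse=True)
    return reranked[:final_k]

chunks = [
    "Password reset",
    "To reset your password, open the email link.",
    "Billing questions go to the finance team.",
    "Your invoice is sent by email.",
]
query = "reset your password"
coarse_best = max(chunks, key=lambda ch: coarse_score(query, ch))
final = retrieve(query, chunks)
print(coarse_best)
print(final)
```

In this example the coarse stage alone prefers the short title "Password reset", while the rerank step promotes the chunk that actually contains the instructions. The more expensive scoring only runs on the shortlist, which is why the two stages are combined rather than reranking everything.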

 

GraphRAG

GraphRAG is a newer RAG technique proposed by Microsoft. Its principle is to use AI to build a graph during the data input and processing stage, so that the AI understands the data and establishes semantic connections between pieces of data, greatly improving retrieval accuracy.

The essence of GraphRAG is to use AI to create new data from existing data. This new data takes the form of a knowledge graph, which not only helps to organize and store information but also lets the AI identify and reason about relationships between pieces of data during retrieval. This improves retrieval accuracy, so the system performs better on complex queries. Additionally, the graph can help reveal hidden patterns and trends in the data, giving users deeper insights and decision support.

GraphRAG is not without drawbacks. First, it can be quite expensive: AI processing is required during data input, which inevitably incurs AI costs. Second, it is slower than traditional RAG because retrieving from a graph is much more complex.

302.AI now exclusively offers GraphRAG knowledge base integration and API access. The code is developed on top of Nano-GraphRAG, which makes it more lightweight.
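A small sketch can show why a graph helps with questions that span several facts. It is a simplified illustration only and does not reflect the Nano-GraphRAG or Microsoft GraphRAG APIs: the triples are hard-coded here, whereas in GraphRAG an AI model would extract them from the source text, and the entity matching is a naive substring check.

```python
from collections import defaultdict

# Hypothetical triples; an AI model would extract these during data input.
triples = [
    ("Alice", "manages", "Project Atlas"),
    ("Project Atlas", "uses", "PostgreSQL"),
    ("Bob", "maintains", "PostgreSQL"),
]

def build_graph(triples):
    """Index every triple under both of its entities."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((subject, relation, obj))
        graph[obj].append((subject, relation, obj))
    return graph

def retrieve_facts(graph, question, hops=2):
    """Collect facts within `hops` edges of any entity named in the question."""
    # Naive entity linking: an entity counts as mentioned if its name is a substring.
    frontier = {e for e in graph if e.lower() in question.lower()}
    seen_entities, facts = set(frontier), []
    for _ in range(hops):
        next_frontier = set()
        for entity in frontier:
            for fact in graph[entity]:
                if fact not in facts:
                    facts.append(fact)
                for neighbor in (fact[0], fact[2]):
                    if neighbor not in seen_entities:
                        seen_entities.add(neighbor)
                        next_frontier.add(neighbor)
        frontier = next_frontier
    return facts

graph = build_graph(triples)
facts = retrieve_facts(graph, "Which database does Alice depend on?")
for subject, relation, obj in facts:
    print(subject, relation, obj)
```

The question mentions only Alice, yet following two edges reaches PostgreSQL through Project Atlas. Chunk-based retrieval could miss this link if the two facts sit in different chunks and the second chunk shares no wording with the question.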

 

Summary

The principle of a knowledge base is that the AI retrieves first and answers second. Therefore, retrieving highly relevant fragments is the most crucial part of a knowledge base. 302.AI provides two modes, traditional RAG and GraphRAG, which can be chosen according to specific needs.

Last modified: 2024-09-12