Debugging RAG systems
In my previous post - https://shchegrikovich.substack.com/p/zoo-of-rags - I described several architectures for building RAG systems. In this post, I want to cover the problems that arise when operating RAG systems.
The debugging process consists of four steps: a problem occurs in the system, a hypothesis is formulated, the hypothesis is tested, and the problem is fixed. The most important step is formulating the hypothesis. To create a good hypothesis, we need to understand what can go wrong in the system.
An excellent overview of RAG problems can be found in https://towardsdatascience.com/12-rag-pain-points-and-proposed-solutions-43709939a28c - 12 RAG Pain Points and Proposed Solutions. All RAG problems can be split into three buckets: Indexing, Searching, and Prompting. These buckets correspond to the three main stages of the RAG pipeline. To identify these problems, a logging and monitoring infrastructure needs to be built.
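Even before adopting any dedicated tooling, structured logging that tags each pipeline stage makes it possible to attribute a failure to one of the three buckets. Below is a minimal sketch; the stage names, field names, and the log_stage helper are illustrative, not part of any specific library:

```python
import json
import logging
import time

logging.basicConfig(level=logging.INFO, format="%(message)s")
logger = logging.getLogger("rag")

def log_stage(stage, request_id, **fields):
    """Emit one structured log line per pipeline stage (indexing, searching, prompting)."""
    record = {"stage": stage, "request_id": request_id, "ts": time.time(), **fields}
    logger.info(json.dumps(record))

# Hypothetical usage inside a RAG pipeline:
log_stage("searching", "req-42", query="refund policy", top_k=5, doc_ids=["d1", "d7"])
log_stage("prompting", "req-42", prompt_tokens=1850, context_window=4096)
```

With one such line per stage and a shared request_id, a bad answer can be traced back to the stage where things first went wrong.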
Indexing problems are related to extracting and chunking information. First, we must ensure we've gathered all the information the application requires. But even when we have all the information, it often comes in different formats: parsing complex PDFs, images, and tables can be problematic. Two tools help with observability at this stage: https://github.com/gabrielchua/RAGxplorer and https://huggingface.co/spaces/m-ric/chunk_visualizer. RAGxplorer can answer why specific chunks were returned, whereas chunk_visualizer helps with experiments on chunk size and chunking strategy.
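A chunk-size experiment can start very simply: split the same document at several sizes and compare the resulting chunk counts and lengths before committing to an index. A minimal sketch with a naive sliding-window splitter; the sizes, overlap, and sample text are illustrative:

```python
def chunk_text(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Naive sliding-window splitter over whitespace-separated words."""
    words = text.split()
    step = chunk_size - overlap
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, max(len(words) - overlap, 1), step)]

document = "Retrieval Augmented Generation combines search with generation. " * 200  # stand-in for parsed PDF text
for size in (128, 256, 512):
    chunks = chunk_text(document, chunk_size=size, overlap=32)
    avg_words = sum(len(c.split()) for c in chunks) / len(chunks)
    print(f"chunk_size={size}: {len(chunks)} chunks, avg {avg_words:.0f} words")
```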
Searching issues occur during retrieval and ranking. For the user's request, we need to retrieve the most relevant documents from the index, taking the intent of the request into account. The system can not only miss documents in the index but also fail to rank the retrieved ones appropriately.
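Both failure modes can be measured with a small gold set of queries mapped to the document IDs that should be retrieved: recall@k catches missed documents, while MRR catches bad ranking. A sketch, assuming you can build such a gold set; the search stub is a placeholder for your actual retriever:

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of relevant documents that appear in the top-k results."""
    return len(set(retrieved[:k]) & relevant) / len(relevant)

def mrr(retrieved: list[str], relevant: set[str]) -> float:
    """Reciprocal rank of the first relevant hit; 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0

def search(query: str, top_k: int) -> list[str]:
    """Placeholder retriever; replace with your vector store or BM25 call."""
    return ["doc-faq-2", "doc-pricing-1", "doc-billing-7"][:top_k]

# Hypothetical gold set: query -> IDs of the documents that should be found.
gold = {"how do I cancel my plan?": {"doc-billing-7", "doc-faq-2"}}
for query, relevant in gold.items():
    retrieved = search(query, top_k=10)
    print(query, recall_at_k(retrieved, relevant, k=5), mrr(retrieved, relevant))
```

Low recall@k points to indexing or retrieval gaps; high recall with low MRR points to a ranking problem.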
In the Prompting category, there are problems with grounding LLMs and constructing prompts. Retrieved documents or chunks need to be correctly added to the prompt, and due to the context window limit, we cannot include all of them. There are also problems related to the proper formatting of the output and the completeness of the answer.
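One common way to respect the context window is to pack the highest-ranked chunks greedily until a token budget is spent. A sketch under two stated assumptions: the chunks arrive ranked best-first, and the rough 4-characters-per-token estimate is close enough for budgeting (the prompt template is illustrative):

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic: about 4 characters per token for English text."""
    return len(text) // 4 + 1

def build_prompt(question: str, chunks: list[str], budget: int = 3000) -> str:
    """Greedily pack the highest-ranked chunks until the token budget is spent."""
    header = "Answer using only the context below. Say 'I don't know' if it is not covered.\n\n"
    used = estimate_tokens(header) + estimate_tokens(question) + 50  # reserve for template
    kept = []
    for chunk in chunks:  # assumed to be ranked best-first
        cost = estimate_tokens(chunk)
        if used + cost > budget:
            break
        kept.append(chunk)
        used += cost
    context = "\n---\n".join(kept)
    return f"{header}Context:\n{context}\n\nQuestion: {question}\nAnswer:"

print(build_prompt("What is the refund window?",
                   ["Refunds are accepted within 30 days of purchase."]))
```

Logging how many chunks were kept versus dropped per request is itself a useful Prompting-stage signal.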
To identify problems in the Searching and Prompting categories, we can use https://github.com/Helicone/helicone - an open-source solution for logging, caching, cost tracking, and retries. In the simplest case, we can use the ELK stack for logging and monitoring.
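Helicone's OpenAI integration works as a proxy: you point the client's base URL at Helicone and pass your Helicone key in a header, after which every request and response is logged. A minimal sketch with the openai Python client; check Helicone's docs for the current endpoint and header names, and the model name here is just an example:

```python
import os
from openai import OpenAI

# Route OpenAI calls through Helicone's proxy so each request/response is logged.
client = OpenAI(
    base_url="https://oai.helicone.ai/v1",
    api_key=os.environ["OPENAI_API_KEY"],
    default_headers={"Helicone-Auth": f"Bearer {os.environ['HELICONE_API_KEY']}"},
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarize the retrieved context..."}],
)
print(response.choices[0].message.content)
```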
Resources:
https://towardsdatascience.com/12-rag-pain-points-and-proposed-solutions-43709939a28c - 12 RAG Pain Points and Proposed Solutions
https://arxiv.org/abs/2401.05856 - Seven Failure Points When Engineering a Retrieval Augmented Generation System
Video - LlamaIndex Sessions: Practical Tips and Tricks for Productionizing RAG (feat. Sisil @ Jasper)
https://github.com/Helicone/helicone - Open-source observability platform for LLMs