Fixing RAG with Reasoning-Augmented Generation: Methods and Practice

Published February 18, 2025 by alex

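Why RAG Is Flawed, and How ReAG Fixes It

Retrieval-augmented generation (RAG) promised smarter AI, but its flaws are holding us back. Here is why reasoning-augmented generation (ReAG) is the upgrade we need.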

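The Problem with Traditional RAG

Traditional RAG systems are like a librarian with a poor memory:

Semantic search is not smart enough: it retrieves documents by surface similarity (for example, matching "air pollution" with "car emissions") but misses contextually relevant content (for example, a study titled "Urban Lung Disease Trends").
Infrastructure nightmare: chunking, embedding, and vector databases add complexity. Every step carries risk, such as stale indexes or mismatched splits.
Static knowledge: updating indexed documents is slow, which is a deal-breaker in fields such as medicine or finance where data changes daily.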
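
To make the "mismatched splits" risk concrete, here is a minimal sketch of naive fixed-size chunking. The `chunk_text` helper, the chunk size, and the sample sentence are illustrative assumptions and do not come from any particular RAG library.

```python
def chunk_text(text: str, size: int) -> list[str]:
    """Split text into consecutive fixed-size character chunks (no overlap)."""
    return [text[i:i + size] for i in range(0, len(text), size)]

# Hypothetical sample sentence, used only for illustration.
text = "Rising traffic density correlates with urban lung disease trends."
chunks = chunk_text(text, size=40)

phrase = "urban lung disease"
# The phrase exists in the full text, yet the chunk boundary cuts through it,
# so a lookup that only sees one chunk at a time cannot match it.
in_full_text = phrase in text
in_any_chunk = any(phrase in c for c in chunks)
print(chunks)
print(in_full_text, in_any_chunk)
```

Real pipelines usually soften this with overlapping chunks or sentence-aware splitters, which is part of the extra machinery the list above describes.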

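Enter ReAG: Let the Model Reason, Not Just Retrieve

ReAG skips the RAG pipeline entirely. Instead of preprocessing documents into searchable chunks, it feeds the raw material (text files, spreadsheets, URLs) directly to the language model. The large language model (LLM) then:

1. Reads the whole document: no chunking or embedding, so the full context is preserved.

2. Asks two questions:

"Is this document useful?" (relevance check)
"Which specific parts matter?" (content extraction)

3. Synthesizes an answer: it combines insights the way a human researcher would, connecting the dots even when keywords do not match.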
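
The three steps above can be sketched as a small loop. This is an illustration under stated assumptions, not the article's implementation: `Verdict`, `reag_collect`, and `toy_judge` are hypothetical names, and the judge is a toy rule standing in for a real LLM call.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Verdict:
    is_irrelevant: bool   # answer to "Is this document useful?"
    content: str          # answer to "Which specific parts matter?"

# Stand-in type for an LLM call: (question, full document text) -> Verdict.
Judge = Callable[[str, str], Verdict]

def reag_collect(question: str, documents: list[str], judge: Judge) -> list[str]:
    """Run the judge over each full document and keep the extracted passages."""
    kept = []
    for doc in documents:              # the whole document is passed, no chunking
        verdict = judge(question, doc)
        if not verdict.is_irrelevant:
            kept.append(verdict.content)
    return kept

def toy_judge(question: str, doc: str) -> Verdict:
    """Toy rule standing in for model reasoning; a real judge would call an LLM."""
    relevant = "sea ice" in doc.lower()
    return Verdict(is_irrelevant=not relevant, content=doc if relevant else "")

docs = ["Thermal dynamics of sea ice and seasonal melt.",
        "Quarterly bank earnings report."]
passages = reag_collect("Why are polar bear numbers declining?", docs, toy_judge)
print(passages)
```

The kept passages would then be handed to a second model call that writes the final answer.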

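How ReAG Works: A Technical Walkthrough

Raw document ingestion:

No preprocessing: documents are ingested as-is (Markdown, PDF, URL).

Parallel LLM analysis:

Each document goes through the relevance check and content extraction at the same time.

Dynamic synthesis:

Irrelevant documents are filtered out; the validated content is used to generate the answer.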
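
One way to picture the parallel analysis and filtering stages is a thread pool that fans out one call per document. The sketch below assumes each call returns a dict with `is_irrelevant` and `content` keys; `analyze_in_parallel` and `fake_analyze` are hypothetical, and the fake analyzer only imitates an LLM reply.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

def analyze_in_parallel(question: str,
                        documents: list[str],
                        analyze: Callable[[str, str], dict],
                        max_workers: int = 4) -> list[str]:
    """Evaluate every document concurrently, then keep only relevant extracts."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with documents.
        results = list(pool.map(lambda d: analyze(question, d), documents))
    return [r["content"] for r in results if not r["is_irrelevant"]]

def fake_analyze(question: str, doc: str) -> dict:
    """Hypothetical stand-in for one LLM call that returns both answers at once."""
    hit = "fibromyalgia" in doc.lower()
    return {"is_irrelevant": not hit, "content": doc if hit else ""}

pages = ["Fibromyalgia causes widespread pain.",
         "Index of advertisers.",
         "Fibromyalgia treatment options."]
context = analyze_in_parallel("What is fibromyalgia?", pages, fake_analyze)
print(context)
```

Threads are a reasonable fit here because a hosted LLM call is dominated by network wait rather than local computation.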

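Why ReAG Wins: Strengths and Trade-offs

Strengths

Handles dynamic data: real-time news, live market feeds, or evolving research? ReAG processes updates on the fly, with no re-embedding.
Solves complex queries: a question like "How did post-2008 regulation affect community banks?" requires piecing together disparate sources. ReAG infers indirect connections better than RAG.
Multimodal mastery: it analyzes charts, tables, and text together with no extra preprocessing, provided the underlying model accepts those inputs.

Trade-offs

Higher cost: processing 100 documents with ReAG means 100 LLM calls, whereas RAG's vector search is cheap.
Slower at scale: for datasets with millions of documents, a hybrid approach (RAG + ReAG) may work better.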
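
A hybrid setup could look roughly like this: a cheap scoring pass shortlists a few documents, and only those receive an LLM call. Everything here is an assumption for illustration. `keyword_overlap` is a crude stand-in for vector search, and `counting_judge` is a fake relevance check that records how often it runs.

```python
from typing import Callable

def keyword_overlap(question: str, doc: str) -> int:
    """Cheap stand-in for a vector-search score: count of shared lowercase words."""
    return len(set(question.lower().split()) & set(doc.lower().split()))

def hybrid_reag(question: str, documents: list[str],
                judge: Callable[[str, str], bool], top_k: int) -> list[str]:
    """Shortlist top_k documents cheaply, then spend one judge call on each."""
    ranked = sorted(documents, key=lambda d: keyword_overlap(question, d), reverse=True)
    shortlist = ranked[:top_k]
    return [d for d in shortlist if judge(question, d)]

call_log: list[str] = []

def counting_judge(question: str, doc: str) -> bool:
    """Hypothetical LLM relevance check that also records each call it receives."""
    call_log.append(doc)
    return "banks" in doc.lower()

# 100 documents in total: 98 fillers plus 2 that are actually on topic.
corpus = [f"filler document {i}" for i in range(98)]
corpus += ["community banks after 2008 regulation", "regulation of banks and lending"]

answers = hybrid_reag("how did 2008 regulation affect community banks",
                      corpus, counting_judge, top_k=5)
print(len(corpus), len(call_log), len(answers))
```

With `top_k=5` the judge runs 5 times instead of 100, at the price of inheriting whatever the cheap first pass misses.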

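The Tech Stack Behind ReAG

Component breakdown

1. Groq + Llama-3.3-70B-Versatile

Role: relevance assessment (first-stage filtering)

Why it stands out:

Very fast inference on Groq's LPU architecture (500+ tokens per second)
70 billion parameters allow nuanced scoring of document relevance, even for indirect queries.
Large context window of 128K tokens

Example: flagging a climate report titled "Thermal Dynamics of Sea Ice" as relevant to "polar bear population decline" despite no keyword overlap.

2. Ollama + DeepSeek-R1:14B

Role: response synthesis (second-stage reasoning)

Why it stands out:

A lightweight, cost-efficient 14-billion-parameter model, fine-tuned for extraction and summarization.
Runs locally through Ollama, which keeps data private and reduces cloud costs.
Large context window of 128K tokens

Example: extracting "ice-free foraging windows have shrunk by 22% since 2010" from the flagged document.

3. LangChain

Role: orchestration and workflow automation

Key functions:

Parallelizes the Groq (relevance) and Ollama (synthesis) tasks.
Manages document routing, error handling, and output aggregation.

Why this stack works

Cost efficiency: the heavy lifting is offloaded to Groq's hardware-optimized API, while Ollama handles the lighter tasks locally.
Scalability: Groq's LPUs handle thousands of concurrent document evaluations.
Flexibility: models can be swapped without rewriting the pipeline (for example, replacing the model served by Ollama with Mistral).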

Code Implementation of ReAG

Install the required dependencies.

!pip install langchain langchain_groq langchain_ollama langchain_community pymupdf pypdf

Download the data

!mkdir ./data
!mkdir ./chunk_caches
!wget "https://www.binasss.sa.cr/int23/8.pdf" -O "./data/fibromyalgia.pdf"

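Set up the large language models (LLMs)

from langchain_groq import ChatGroq
from langchain_ollama import ChatOllama
import os
# Supply your own Groq API key; never publish or commit a real key.
os.environ["GROQ_API_KEY"] = "<YOUR_GROQ_API_KEY>"
# Groq-hosted model used for the relevance check
llm_relevancy = ChatGroq(
    model="llama-3.3-70b-versatile",
    temperature=0,)
# Local Ollama model used for answer synthesis
llm = ChatOllama(model="deepseek-r1:14b",
                 temperature=0.6,
                 num_predict=3000,  # ChatOllama's limit on generated tokens
                )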

Define the system prompt

REAG_SYSTEM_PROMPT = """
# Role and Objective
You are an intelligent knowledge retrieval assistant. Your task is to analyze provided documents or URLs to extract the most relevant information for user queries.
# Instructions
1. Analyze the user's query carefully to identify key concepts and requirements.
2. Search through the provided sources for relevant information and output the relevant parts in the 'content' field.
3. If you cannot find the necessary information in the documents, return 'is_irrelevant: true', otherwise return 'is_irrelevant: false'.
# Constraints
- Do not make assumptions beyond available data
- Clearly indicate if relevant information is not found
- Maintain objectivity in source selection
"""

Define the RAG prompt

rag_prompt = """You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: {question} 
Context: {context} 
Answer:
"""

Define the response schema

from pydantic import BaseModel,Field
from typing import List
from langchain_core.output_parsers import JsonOutputParser
class ResponseSchema(BaseModel):
    content: str = Field(...,description="The page content of the document that is relevant or sufficient to answer the question asked")
    reasoning: str = Field(...,description="The reasoning for selecting The page content with respect to the question asked")
    is_irrelevant: bool = Field(...,description="Specify 'True' if the content in the document is not sufficient or relevant to answer the question asked otherwise specify 'False' if the context or page content is relevant to answer the question asked")

class RelevancySchemaMessage(BaseModel):
    source: ResponseSchema
relevancy_parser = JsonOutputParser(pydantic_object=RelevancySchemaMessage)
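
For intuition about what the parsing step has to cope with, the sketch below pulls a JSON object out of a model reply that may be wrapped in a Markdown json fence, using only the standard library. It is an illustration and not how `JsonOutputParser` is implemented; `parse_llm_json` and the sample reply are hypothetical.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable

def parse_llm_json(reply: str) -> dict:
    """Extract the relevance JSON from a raw model reply and check its fields."""
    pattern = re.escape(FENCE) + r"(?:json)?\s*(.*?)" + re.escape(FENCE)
    match = re.search(pattern, reply, flags=re.DOTALL)
    payload = match.group(1) if match else reply   # fall back to the raw reply
    data = json.loads(payload)
    missing = {"content", "reasoning", "is_irrelevant"} - set(data)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return data

# Hypothetical reply in the shape the relevance prompt asks for.
sample_reply = (
    "Here is my assessment:\n"
    + FENCE + "json\n"
    + '{"content": "Fibromyalgia is a chronic pain condition.", '
    + '"reasoning": "Defines the condition.", "is_irrelevant": false}\n'
    + FENCE
)
parsed = parse_llm_json(sample_reply)
print(parsed["is_irrelevant"], parsed["content"])
```

Checking for the three expected keys up front turns a silently incomplete reply into an explicit error.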

Load and process the input document

from langchain_community.document_loaders import PyMuPDFLoader
file_path = "./data/fibromyalgia.pdf"
loader = PyMuPDFLoader(file_path)
#
docs = loader.load()
print(len(docs))
print(docs[0].metadata)

8
{'producer': 'Acrobat Distiller 6.0 for Windows',
 'creator': 'Elsevier',
 'creationdate': '2023-01-20T09:25:19-06:00',
 'source': './data/fibromyalgia.pdf',
 'file_path': './data/fibromyalgia.pdf',
 'total_pages': 8,
 'format': 'PDF 1.7',
 'title': 'Fibromyalgia: Diagnosis and Management',
 'author': 'Bradford T. Winslow MD',
 'subject': 'American Family Physician, 107 (2023) 137-144',
 'keywords': '',
 'moddate': '2023-02-27T15:02:12+05:30',
 'trapped': '',
 'modDate': "D:20230227150212+05'30'",
 'creationDate': "D:20230120092519-06'00'",
 'page': 0}

Helper function to format a document

from langchain_core.documents import Document
def format_doc(doc: Document) -> str:
    return f"Document_Title: {doc.metadata['title']}\nPage: {doc.metadata['page']}\nContent: {doc.page_content}"

Helper function to extract relevant context

### Helper function to extract relevant context
from langchain_core.prompts import PromptTemplate
def extract_relevant_context(question,documents):
    result = []
    for doc in documents:
        formatted_documents = format_doc(doc)
        system = f"{REAG_SYSTEM_PROMPT}\n\n# Available source\n\n{formatted_documents}"
        prompt = f"""Determine if the 'Available source' content supplied is sufficient and relevant to ANSWER the QUESTION asked.
        QUESTION: {question}
        #INSTRUCTIONS TO FOLLOW
        1. Analyze the context provided thoroughly to check its relevancy to help formulate a response for the QUESTION asked.
        2. STRICTLY PROVIDE THE RESPONSE IN A JSON STRUCTURE AS DESCRIBED BELOW:
            ```json
               {{"content":<<The page content of the document that is relevant or sufficient to answer the question asked>>,
                 "reasoning":<<The reasoning for selecting The page content with respect to the question asked>>,
                 "is_irrelevant":<<Specify 'True' if the content in the document is not sufficient or relevant.Specify 'False' if the page content is sufficient to answer the QUESTION>>
                 }}
            ```
         """
        messages =[ {"role": "system", "content": system},
                       {"role": "user", "content": prompt},
                    ]
        response = llm_relevancy.invoke(messages)    
        print(response.content)
        formatted_response = relevancy_parser.parse(response.content)
        result.append(formatted_response)
    final_context = []
    for items in result:
        # Accept both boolean and string forms of the flag (False, 'false', 'False')
        if str(items['is_irrelevant']).strip().lower() == 'false':
            final_context.append(items['content'])
    return final_context
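
The loop above stops at the first reply that fails to parse. As a hedged alternative, the sketch below skips malformed replies and normalizes the flag in one place. `to_bool` and `collect_relevant` are hypothetical helpers, the replies are made-up strings, and treating a malformed reply as irrelevant is an assumption rather than the article's behavior.

```python
import json
from typing import Callable

def to_bool(flag) -> bool:
    """Normalize is_irrelevant values such as True, 'true', or 'False' to a bool."""
    if isinstance(flag, bool):
        return flag
    return str(flag).strip().lower() == "true"

def collect_relevant(replies: list[str],
                     parse: Callable[[str], dict] = json.loads) -> list[str]:
    """Keep content from relevant replies; skip replies that fail to parse."""
    kept = []
    for raw in replies:
        try:
            item = parse(raw)
            irrelevant = to_bool(item["is_irrelevant"])
            content = item["content"]
        except (ValueError, KeyError, TypeError):
            continue  # assumption: a malformed reply is treated as irrelevant
        if not irrelevant:
            kept.append(content)
    return kept

# Made-up replies covering the string flag, invalid JSON, and boolean flags.
replies = [
    '{"content": "Pain lasting three months.", "is_irrelevant": "False"}',
    'not json at all',
    '{"content": "", "is_irrelevant": true}',
    '{"content": "Fatigue and sleep problems.", "is_irrelevant": false}',
]
relevant = collect_relevant(replies)
print(relevant)
```

Logging the skipped replies instead of silently dropping them would make it easier to spot a prompt that the model frequently fails to follow.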

Call the function to retrieve relevant context

question = "What is Fibromyalgia?"
final_context = extract_relevant_context(question,docs)
print(len(final_context))

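Helper function to generate a response

import re
def generate_response(question,final_context):
    prompt = PromptTemplate(template=rag_prompt,
                                     input_variables=["question","context"],)
    chain  = prompt | llm
    # Join the extracted passages into one string rather than passing a Python list
    response = chain.invoke({"question":question,"context":"\n\n".join(final_context)})
    # DeepSeek-R1 emits its reasoning inside <think>...</think>; strip it and keep the full answer
    answer = re.sub(r"<think>.*?</think>", "", response.content, flags=re.DOTALL).strip()
    print(answer)
    return answer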

Generate the response

final_response = generate_response(question,final_context)
final_response

#################### Response #################################
'Fibromyalgia is a chronic condition characterized by widespread musculoskeletal pain, fatigue, disrupted sleep, and cognitive difficulties like "fibrofog." It is often associated with heightened sensitivity to pain due to altered nervous system processing. Diagnosis considers symptoms such as long-term pain, fatigue, and sleep issues without underlying inflammation or injury.'

Question 2

question =  "What are the causes of Fibromyalgia?"
final_context = extract_relevant_context(question,docs)
final_response = generate_response(question,final_context)
##################################Response ############################
Fibromyalgia likely results from disordered central pain processing leading to heightened sensitivity (hyperalgesia and allodynia). Possible causes include dysfunction of the hypothalamic-pituitary-adrenal axis, inflammation, glial activation, small fiber neuropathy, infections like Epstein-Barr virus or Lyme disease, and a genetic component. Other conditions, such as infections or medication side effects, may also contribute to similar symptoms.

Question 3

question =  "Do people suffering from rheumatologic conditions may have fibromyalgia?"
final_context = extract_relevant_context(question,docs)
final_response = generate_response(question,final_context)
############################Response################################
Yes, people with rheumatologic conditions, such as rheumatoid arthritis or psoriatic arthritis, may also have fibromyalgia. This is because they share overlapping symptoms, making diagnosis challenging.

Question 4

question =  "Mention the nonpharmacologic treatment for fibromyalgia?"
final_context = extract_relevant_context(question,docs)
final_response = generate_response(question,final_context)
############################RESPONSE#########################
Nonpharmacologic treatments for fibromyalgia include patient education, exercise, and cognitive behavior therapy (CBT).

Question 5

question =  "According to 2016 American College of Rheumatology Fibromyalgia what is the Diagnostic Criteria for Fibromyalgia?"
final_context = extract_relevant_context(question,docs)
final_response = generate_response(question,final_context)
###############################RESPONSE#############################
The 2016 American College of Rheumatology diagnostic criteria for fibromyalgia require generalized pain in at least four of five body regions for at least three months. Additionally, patients must meet either a Widespread Pain Index (WPI) score of ≥7 with a Symptom Severity Scale (SSS) score of ≥5 or a WPI score of ≥4 with an SSS score of ≥9. Other disorders that could explain the symptoms must be ruled out.

Question 6

question =  "What is the starting dosage of Amitriptyline?"
final_context = extract_relevant_context(question,docs)
final_response = generate_response(question,final_context)
#########################RESPONSE###########################
The starting dosage of Amitriptyline for adults is usually between 25 to 50 mg per day, often beginning with a lower dose of 5 to 10 mg at night to minimize side effects before gradually increasing.

Question 7

question = "What has been mentioned about AAPT 2019 Diagnostic Criteria for Fibromyalgia"
final_context = extract_relevant_context(question,docs)
final_response = generate_response(question,final_context)
#########################RESPONSE####################################
The AAPT 2019 criteria for fibromyalgia include multisite pain in at least six of nine specified areas, moderate to severe sleep problems or fatigue, and symptoms lasting three months or more.

Question 8

question =  "What are the medications and doses for Fibromyalgia?"
final_context = extract_relevant_context(question,docs)
print(final_context)
final_response = generate_response(question,final_context)
#######################Response##################################
['Duloxetine, milnacipran, pregabalin, and amitriptyline are potentially effective medications for fibromyalgia. Nonsteroidal anti-inflammatory drugs and opioids have not demonstrated benefits for fibromyalgia and have significant limitations.',
 'Amitriptyline, cyclobenzaprine, duloxetine (Cymbalta), milnacipran (Savella), and pregabalin (Lyrica) are effective for pain in fibromyalgia.43,46-48,50,52,54',
 'Amitriptyline (tricyclic antidepressant) - 5 to 10 mg at night, 20 to 30 mg at night. Cyclobenzaprine (muscle relaxant; tricyclic derivative) - 5 to 10 mg at night, 10 to 40 mg daily in 1 to 3 divided doses. Duloxetine (Cymbalta; serotonin-norepinephrine reuptake inhibitor) - 20 to 30 mg every morning, 60 mg every morning. Milnacipran (Savella; serotonin-norepinephrine reuptake inhibitor) - 12.5 mg every morning, 50 mg twice daily. Pregabalin (Lyrica; gabapentinoid) - 25 to 50 mg at bedtime, 150 to 450 mg at bedtime.',
 'Fibromyalgia is often treated with medications such as pregabalin (Lyrica) and duloxetine (Cymbalta). Pregabalin can be started at a dose of 75 mg twice daily, with a maximum dose of 450 mg/day. Duloxetine can be initiated at a dose of 30 mg once daily, with a target dose of 60 mg/day.',
 'Fibromyalgia is often treated with medications such as pregabalin (Lyrica) and duloxetine (Cymbalta). Pregabalin can be started at a dose of 75 mg twice daily, with a maximum dose of 450 mg/day. Duloxetine can be initiated at a dose of 30 mg once daily, with a target dose of 60 mg/day.']
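
The last two entries of the printed context are identical, so the same passage reaches the synthesis prompt twice. A minimal way to drop exact repeats before generating the answer might look like this; `dedupe_passages` is a hypothetical helper that is not part of the original code, and the sample strings are placeholders.

```python
def dedupe_passages(passages: list[str]) -> list[str]:
    """Drop repeated passages (ignoring whitespace differences), keeping first-seen order."""
    seen = set()
    unique = []
    for p in passages:
        key = " ".join(p.split())   # collapse runs of whitespace before comparing
        if key not in seen:
            seen.add(key)
            unique.append(p)
    return unique

# Placeholder passages; the third differs from the first only by extra spaces.
sample = ["alpha passage", "beta passage", "alpha   passage", "beta passage"]
deduped = dedupe_passages(sample)
print(deduped)
```

This only catches verbatim repeats; near-duplicates with different wording would still pass through.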

Final response

print(final_response)
#############################Response############################
The medications commonly used to treat fibromyalgia include:
1. **Amitriptyline**: A tricyclic antidepressant typically taken at night in doses ranging from 5 to 30 mg.
2. **Cyclobenzaprine**: A muscle relaxant and tricyclic derivative, usually administered in doses up to 40 mg daily in divided doses.
3. **Duloxetine (Cymbalta)**: A serotonin-norepinephrine reuptake inhibitor taken in the morning, starting at 20-30 mg and increasing to 60 mg if needed.
4. **Milnacipran (Savella)**: Another serotonin-norepinephrine reuptake inhibitor, starting at 12.5 mg in the morning and potentially increased to 50 mg twice daily.
5. **Pregabalin (Lyrica)**: A gabapentinoid taken at bedtime, beginning with 75 mg twice daily and up to a maximum of 450 mg/day.
These medications are effective for managing pain associated with fibromyalgia. It's important to note that dosages should be adjusted under medical supervision, starting low and increasing as necessary. Additionally, NSAIDs and opioids are not recommended for treating fibromyalgia due to limited effectiveness and potential side effects.

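Conclusion

ReAG is not about replacing RAG; it is about rethinking how AI interacts with knowledge. By treating retrieval as a reasoning task, ReAG mirrors the way humans do research: comprehensive, nuanced, and context-driven.

Source: https://medium.com/@nayakpplaban/fixing-rag-with-reasoning-augmented-generation-919939045789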
