Retrieval-Augmented Generation with Gemini Function Calling and MongoDB Atlas

Published June 3, 2024 by alex

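When working with LLMs, it is common to use retrieval-augmented generation (RAG) so that the model can answer questions using domain-specific knowledge. Typically the data lives in an operational database such as MongoDB Atlas, which provides a single, unified, fully managed platform with integrated vector search.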

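In a typical RAG scenario, the user query is converted to an embedding and searched against a vector index, returning a set of semantically similar results. Function calling lets us go a step further: the model can delegate data-processing tasks, and we get the most value out of a unified platform like MongoDB Atlas, including: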

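Handling user queries in a more conversational and flexible way.
Optimizing API usage and cost.
Using the operational data and metadata stored alongside the vectors in MongoDB for filtering and enrichment.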

We will use the Vertex AI Gemini API and embeddings API together with MongoDB Atlas, which lets us build and experiment quickly without managing any underlying infrastructure or deployments.

In this article we use Python and the MongoDB sample_mflix.embedded_movies dataset (from Hugging Face). We replaced the plot_embedding field with an embedding field generated using the Vertex AI text embeddings API.

More conversational user queries

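Without function calling, we can search for movies by plot using the embedding field. As the figure below shows, the user has to supply the desired plot in the query.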

In a production system this approach can be restrictive and inflexible for users, who might try to:

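Specify query parameters other than the plot, for example their favorite genre (the genres field) or actor (the cast field).
Provide unexpected or irrelevant queries, such as "delicious vegetarian recipes".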

In the latter case, naive vector retrieval on an unexpected user query returns equally unexpected results, as the figure below shows.
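To see why this happens, here is a minimal, self-contained sketch of naive top-k retrieval. The three-dimensional vectors and movie titles are made up for illustration (real embeddings have hundreds of dimensions), and the helper names are hypothetical. The point is only that nearest-neighbor search always returns the k closest items, however poor the match.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, docs, k=2):
    """Return the k (title, score) pairs most similar to query_vec."""
    scored = [
        (title, cosine_similarity(query_vec, vec))
        for title, vec in docs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy "plot embeddings" (hypothetical values, not real model output)
toy_docs = {
    'Alien invasion movie': [0.9, 0.1, 0.0],
    'Courtroom drama': [0.1, 0.9, 0.0],
    'Heist comedy': [0.2, 0.7, 0.1],
}

# Toy embedding for an off-topic query such as a recipe request
off_topic_query = [0.0, 0.1, 0.9]

results = top_k(off_topic_query, toy_docs)
for title, score in results:
    print(f'{title}: {score:.2f}')
```

Two movies come back even though both similarity scores are low. A score threshold could help, but letting the model decide whether a search is appropriate at all, through function calling, handles this case more gracefully.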

Optimizing API usage and cost

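In many cases, embedding the entire user query is neither useful nor necessary. For example, a user might reply to a chatbot in a full sentence, such as "I want recommendations for movies about alien visitors, starring Will Smith". We only need to embed two words of that query ("alien visitors") to match against the plot embeddings in the vector index.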

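The user query might not specify a plot at all. For a request like "movie recommendations starring Will Smith", matching on the cast field is enough and no embedding is needed. Handling such queries appropriately not only improves the user experience but also reduces cost by optimizing API usage.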

Leveraging the unified MongoDB Atlas platform

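With vector search in MongoDB Atlas, operational data, metadata, and vectors are all stored in one place, so the data can be accessed simply, efficiently, and consistently to enrich our vector queries. For example, if the user query specifies a genre or an actor, we can add a pre-filter to the vector search.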
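For a pre-filter to work, the fields used in the filter need to be indexed alongside the vector field. The following is a sketch of what such an Atlas Vector Search index definition might look like, written as a Python dict. The field paths embedding, genres, and cast follow the code later in this article. The numDimensions value of 768 and the cosine similarity setting are assumptions that the article does not state, so adjust them to your embedding model.

```python
import json

# Hypothetical Atlas Vector Search index definition (not from the article).
vector_index_definition = {
    'fields': [
        {
            'type': 'vector',
            'path': 'embedding',
            'numDimensions': 768,    # assumption: depends on the embedding model
            'similarity': 'cosine',  # assumption: other metrics are available
        },
        # Fields that may appear in the $vectorSearch 'filter' expression
        {'type': 'filter', 'path': 'genres'},
        {'type': 'filter', 'path': 'cast'},
    ]
}

# Collect the filterable paths to check them against the query code
filter_paths = [
    field['path']
    for field in vector_index_definition['fields']
    if field['type'] == 'filter'
]

print(json.dumps(vector_index_definition, indent=2))
print('Filterable fields:', filter_paths)
```

Keeping the definition next to the query code makes it easier to notice when a filter references a field that the index does not cover.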

Some other advantages of using MongoDB Atlas for vector search include:

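No need to synchronize data between our operational database and a separate vector store.
Vector search output can include rich context from other fields and is always fresh.
Atlas Triggers can be used to update embeddings automatically when new documents are inserted or existing ones are updated.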

Function calling with Gemini

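With function calling, we can provide domain knowledge (such as an understanding of the document structure) and give Gemini access to the operational data stored in MongoDB, which is updated in real time. First, we provide a function declaration that describes our function, and supply it to the model as a tool: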

from vertexai.generative_models import (
    FunctionDeclaration,
    GenerationConfig,
    GenerativeModel,
    Tool,
)
# Function declaration for model to query data in MongoDB with user prompt
search_for_movies_func = FunctionDeclaration(
    name='search_for_movies',
    description='Search for movies by plot, actor and genre.',
    parameters={
        'type': 'object',
        'properties': {
            'plot': {
                'type': 'string',
                'description': 'The plot or location of the movie.'
            },
            'actor': {
                'type': 'string',
                'description': ('The name of an actor in the movie,'
                                ' with word capitalization.')
            },
            'genre': {
                'enum': ['Action', 'Comedy', 'Short', 'Adventure', 
                         'Drama', 'Romance', 'Crime', 'Sci-Fi', 
                         'Musical', 'Family', 'War', 'History', 
                         'Film-Noir', 'Mystery', 'Thriller', 'Biography', 
                         'Fantasy', 'Western', 'Animation', 'Horror', 
                         'Sport', 'Music', 'Documentary'],
                'type': 'string',
                'description': 'The genre of the movie.'
            }
        }
    }
)
# Define a tool that includes the above function declaration
search_for_movies_tool = Tool(
    function_declarations=[search_for_movies_func],
)
# Initialize Gemini model and provide it with the tool
model = GenerativeModel(
    model_name='gemini-1.0-pro-001',
    generation_config=GenerationConfig(
        temperature=0,
        max_output_tokens=2048
    ),
    tools=[search_for_movies_tool],
)

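There are some best practices to keep in mind when providing function declarations to the model. You can see several of them in the declaration above: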

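Describe the function and its parameters clearly and in detail.
Use a strongly typed genre parameter (an enum) based on the genres present in our dataset.
Use a low temperature (0) in the model's generation config.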

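After calling the function, we need to return the function output to the model together with the original user prompt. To make the user prompt easy to reuse, we can define a Content object to store it: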

from vertexai.generative_models import (
    Content,
    Part,
)
user_prompt_content = Content(
    role='user',
    parts=[
        Part.from_text('I want recommendations for movies about alien'
                       ' visitors, starring Will Smith'),
    ],
)
# Send the user prompt to the model
response = model.generate_content(user_prompt_content)

Creating the function

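The function declaration above describes our function to the model, but it does not yet implement any data processing or querying. To write the actual function, we need to understand how the model proposes function calls.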

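If you are just getting started, Vertex AI Studio is especially useful, because we only need to select a model, set the generation config, and provide the function declaration. Note that a function declaration provided in Vertex AI Studio must be valid JSON.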
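One way to get valid JSON is to keep the declaration as a plain Python dict and serialize it with the standard json module. The sketch below mirrors the search_for_movies declaration shown earlier, with the genre list shortened for brevity. The exact envelope Vertex AI Studio expects around the declaration is not specified in this article, so treat the top-level shape as an assumption.

```python
import json

# Same content as the FunctionDeclaration above, as a plain dict.
# The genre enum is shortened here; use the full list in practice.
declaration = {
    'name': 'search_for_movies',
    'description': 'Search for movies by plot, actor and genre.',
    'parameters': {
        'type': 'object',
        'properties': {
            'plot': {
                'type': 'string',
                'description': 'The plot or location of the movie.',
            },
            'actor': {
                'type': 'string',
                'description': ('The name of an actor in the movie,'
                                ' with word capitalization.'),
            },
            'genre': {
                'type': 'string',
                'enum': ['Action', 'Comedy', 'Drama'],
                'description': 'The genre of the movie.',
            },
        },
    },
}

# json.dumps emits double-quoted keys and strings, as JSON requires
declaration_json = json.dumps(declaration, indent=2)

# Parse the text back to confirm it is valid JSON
round_trip = json.loads(declaration_json)

print(declaration_json)
```

Serializing from a dict avoids the usual hand-editing mistakes, such as single quotes or trailing commas, that make a pasted declaration invalid.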

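Using Vertex AI Studio, we can test a user prompt for each of the scenarios described earlier and see how the model responds when given the function declaration: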

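Irrelevant query:

Prompt: delicious vegetarian recipes
Response: I'm sorry, I cannot fulfill this request. The available tools lack the desired functionality.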

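Conversational query specifying plot and actor:

Prompt: I want recommendations for movies about alien visitors, starring Will Smith
Response: search_for_movies({"actor": "Will Smith", "plot": "alien visitors"})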

Query specifying only actor and genre, with no plot:

Prompt: Sigourney Weaver action movies
Response: search_for_movies({"actor": "Sigourney Weaver", "genre": "Action"})

Given the responses above, we want our application and function to handle each kind of response:

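If no function call is proposed, return the model's text response (response.text).
If only plot is specified, perform a vector search using the MongoDB Atlas vector index.
If plot is specified together with genre, actor, or both, add a pre-filter stage to the vector search.
If plot is not specified, perform a simple match against the collection.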
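The four cases above can be sketched as a small routing function. This is an illustration only: choose_strategy and the strategy labels are hypothetical names, and the function takes a plain dict of proposed arguments (or None when the model proposed no function call) rather than a real model response.

```python
def choose_strategy(args):
    """Map proposed function-call arguments to a retrieval strategy.

    args is a dict of proposed arguments, or None when the model
    did not propose a function call.
    """
    if args is None:
        return 'model_text'  # fall back to response.text
    plot = args.get('plot')
    has_filter = bool(args.get('genre') or args.get('actor'))
    if plot and has_filter:
        return 'vector_search_with_prefilter'
    if plot:
        return 'vector_search'
    if has_filter:
        return 'simple_match'
    return 'no_criteria'  # function call proposed, but with no usable arguments


# The three prompts tested in Vertex AI Studio above, plus a plot-only query
examples = [
    None,
    {'actor': 'Will Smith', 'plot': 'alien visitors'},
    {'actor': 'Sigourney Weaver', 'genre': 'Action'},
    {'plot': 'alien visitors'},
]
for proposed in examples:
    print(proposed, '->', choose_strategy(proposed))
```

Only the two vector search strategies require an embedding API call, which is where the cost saving for plot-free queries comes from.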

We can use PyMongo to connect to the MongoDB cluster and run whatever queries are needed. For example:

import pymongo
from vertexai.language_models import TextEmbeddingInput, TextEmbeddingModel
def get_mongo_client(mongo_uri):
    """Establish connection to the MongoDB Atlas cluster."""
    try:
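        client = pymongo.MongoClient(mongo_uri)
        # MongoClient connects lazily; ping to confirm the cluster is reachable
        client.admin.command('ping')
        print('Connection to MongoDB successful.')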
        return client
    except pymongo.errors.ConnectionFailure as e:
        print(f'Connection failed: {e}')
        return None
# Connect to MongoDB (mongo_uri must hold your Atlas connection string)
mongo_client = get_mongo_client(mongo_uri)
db = mongo_client['movies']
collection = db['movie_collection']
# Extract the function call arguments from the model proposal.
# movie_query stays empty if the model did not propose a function call.
movie_query = {}
if 'function_call' in response.candidates[0].content.parts[0].to_dict():
    function_call = response.candidates[0].content.parts[0].function_call
    for key in ('plot', 'genre', 'actor'):
        if key in function_call.args:
            movie_query[key] = function_call.args[key]
# Helper functions for search_for_movies
def get_embedding(
    text: str,
    task: str,
) -> list[float]:
    """Get the embedding for the given text and task."""
    if not text.strip():
        print('Attempted to get embedding for empty text.')
        return None
    model = TextEmbeddingModel.from_pretrained(
        'textembedding-gecko@003')
    embeddings = model.get_embeddings(
        [TextEmbeddingInput(text, task)])
    return embeddings[0].values
def mongo_exp(genre, actor):
    """Build the filter expression for genre and/or actor."""
    if genre and actor:
        return {
            '$and': [
                {'cast': actor},
                {'genres': genre},
            ]
        }
    elif genre:
        return {'genres': genre}
    elif actor:
        return {'cast': actor}
def mongo_vector_search(plot, genre='', actor=''):
    """Perform vector search using MongoDB Atlas."""
    # Generate embedding for the user query
    query_embedding = get_embedding(plot, 'RETRIEVAL_QUERY')
    if query_embedding is None:
        print('Invalid or empty query, or embedding generation failed.')
        return None
    # Define the MongoDB vector search pipeline.
    pipeline = [
        {
            '$vectorSearch': {
                'index': 'vector_index',
                'queryVector': query_embedding,
                'path': 'embedding',
                'numCandidates': 150, # Use 150 nearest neighbors
                'limit': 5, # Return top 5 matches
            }
        },
        {
            '$project': {
                '_id': 0,
                'plot': '$fullplot',
                'cast': 1,
                'title': 1,
                'genres': 1,
            }
        },
    ]
    # Add a prefilter if either or both genre and actor are specified
    if genre or actor:
        pipeline[0]['$vectorSearch']['filter'] = mongo_exp(genre, actor)
    # Execute the search and return results as list
    return list(collection.aggregate(pipeline))
def mongo_match(genre='', actor=''):
    """Perform simple match when plot is not specified."""
    # Define the MongoDB match pipeline.
    pipeline = [
        {
            '$match': mongo_exp(genre, actor)
        },
        {
            '$sample': {'size': 5}
        },
        {
            '$project': {
                '_id': 0,
                'plot': '$fullplot',
                'cast': 1,
                'title': 1,
                'genres': 1,
            }
        },
    ]
    # Execute the search and return results as list
    return list(collection.aggregate(pipeline))
def search_for_movies(plot='', genre='', actor=''):
    """Search for movies using provided plot, genre and actor."""
    if not (plot or genre or actor):
        print('No search criteria specified.')
        return None
    elif plot:
        return mongo_vector_search(plot, genre, actor)
    else:
        return mongo_match(genre, actor)

Putting it all together

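When the model proposes a function call, we also need to provide the function output back to the model. In our case, the output is the list of movies returned by the query against the MongoDB collection.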

# List of query results from MongoDB
movie_list = search_for_movies(**movie_query)
# Return the results to Gemini together with the user prompt and 
# function call proposal
response = model.generate_content(
    [
        user_prompt_content,
        response.candidates[0].content,
        Content(
            parts=[
                Part.from_function_response(
                    name='search_for_movies',
                    response={
                        "content": movie_list,
                    },
                ),
            ],
        )
    ]
)

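Now we can test various scenarios and see how relevant the model's responses are for each query. With the tool provided, Gemini can combine vector search and simple matching in MongoDB Atlas to give relevant answers to a wide range of queries efficiently.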

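We can also see that when given a query that specifies no parameters (for example, a plain request for movie recommendations), Gemini responds by asking for more details.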

Source: https://medium.com/google-cloud/adding-context-to-rag-with-gemini-function-calling-d6e78e705524
