Create an Agent with OpenAI Function Calling Capabilities

Published by alex on May 7, 2024

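What AI Can Do

AI brings major improvements in the following ways:

Providing a conversational experience that makes interactions feel more natural and user-friendly.
Consolidating application features into a single entry point. Traditional applications need separate pages and forms for different features. An AI model can instead interpret user input, select and execute the necessary functions, and even work through complex requests step by step.

However, one major challenge remains: handling the unstructured text data that AI models return.

Traditionally, extracting structured data (such as JSON) from model output required complex prompt engineering or regular expressions (RegEx). Both are error-prone and inconsistent, because model output is unpredictable and user input varies.

To address this, OpenAI introduced two features: JSON mode and function calling. This article looks closely at function calling and shows, with TypeScript code examples, how it simplifies extracting structured data from model output.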

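The Current Challenges

Structured data handling: Previously, developers relied on RegEx or complex prompt engineering to parse text data, a process that was complicated and error-prone. Function calling simplifies this by letting the model work with user-defined functions and produce structured output such as JSON without those cumbersome techniques.
Consistency and predictability: Function calling lets you define custom functions for precise information extraction, which makes the model's output more consistent and predictable across different inputs. That matters for applications that depend on reliable data extraction for text summarization, document analysis, and integration with external APIs or databases.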

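How It Works

According to OpenAI's documentation, the basic sequence of steps for function calling is:

Call the model with the user query and a set of functions defined in the functions (tools) parameter.
The model can choose to call one or more functions. If it does, the content will be a stringified JSON object adhering to your custom schema (note: the model may hallucinate parameters).
Parse the string into JSON in your code, and call your function with the provided arguments if they exist.
Call the model again by appending the function response as a new message, and let the model summarize the results back to the user.

Four key concepts that may be confusing at first:

Tools: The term Functions has been deprecated and replaced by Tools. Currently, Tools supports only the function type, so the change is essentially one of naming and syntax.
Tool descriptions: When we say we pass "tools" to the model, think of it as giving the model a list or menu of what it can do, not the actual functions. It is like telling the model, "Here are the actions you can choose to perform," without handing it the means to perform them directly.
Function call returns: When the model suggests a function call, it is essentially saying, "I think we should use this tool, and here is what we need for it." It names the function and specifies any required information (arguments). At this point it is only a suggestion. The actual work is carried out in your application.
Using responses to inform the next step: Once the function has actually been executed in your application and you have the results, you feed those results back to the model as part of a new prompt. This helps the model understand what happened and guides its next action or response.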
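The four steps above can be sketched without any network access. This is a minimal illustration under stated assumptions: fakeModel is a hypothetical stand-in for the real model, the ChatMsg shapes are simplified local types rather than the SDK's, and localFunctions holds mocked data.

```typescript
// Simplified local message shapes (assumption: not the SDK's real types).
type ToolCall = { id: string; function: { name: string; arguments: string } };
type ChatMsg =
  | { role: "system" | "user"; content: string }
  | { role: "assistant"; content: string | null; tool_calls?: ToolCall[] }
  | { role: "tool"; tool_call_id: string; content: string };

// Hypothetical stand-in for the model: it suggests get_farms first,
// then summarizes once it sees a tool message in the history.
function fakeModel(messages: ChatMsg[]): ChatMsg {
  let toolResult: string | null = null;
  for (const m of messages) {
    if (m.role === "tool") toolResult = m.content;
  }
  if (toolResult !== null) {
    return { role: "assistant", content: `Here is what I found: ${toolResult}` };
  }
  return {
    role: "assistant",
    content: null,
    tool_calls: [
      { id: "call_1", function: { name: "get_farms", arguments: '{"location":"Melbourne"}' } },
    ],
  };
}

// The real work happens in the application, not in the model (mocked data).
const localFunctions: Record<string, (args: { location: string }) => string> = {
  get_farms: ({ location }) => JSON.stringify({ location, farms: ["Farm 1"] }),
};

function runRoundTrip(userInput: string): ChatMsg[] {
  const messages: ChatMsg[] = [{ role: "user", content: userInput }];
  // Step 1: call the model with the user query (tool descriptions omitted in this stub).
  const first = fakeModel(messages);
  messages.push(first);
  // Steps 2 and 3: if the model suggested calls, parse the arguments and run the functions.
  if (first.role === "assistant" && first.tool_calls) {
    for (const call of first.tool_calls) {
      const fn = localFunctions[call.function.name];
      if (!fn) continue; // the model may name a function we do not have
      const args = JSON.parse(call.function.arguments);
      messages.push({ role: "tool", tool_call_id: call.id, content: fn(args) });
    }
    // Step 4: call the model again so it can summarize the result.
    messages.push(fakeModel(messages));
  }
  return messages;
}

const transcript = runRoundTrip("Any farms near Melbourne?");
console.log(transcript.map((m) => m.role).join(" -> "));
// user -> assistant -> tool -> assistant
```

The point of the sketch is the message order: the assistant message that carries tool_calls is kept in the history, each tool result is appended with the matching tool_call_id, and only then is the model called again.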

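A Step-by-Step Guide to Building an Agent

Business case: developing a farm trip assistant agent

We want to build a farm trip assistant agent that improves the experience of planning a farm visit. This digital assistant will provide comprehensive support by:

Identifying the best farm destinations based on the user's location.
Providing details on the activities available at each farm.
Facilitating the booking of selected activities.
Offering a straightforward process for filing a complaint if necessary.

Application architecture:

This flowchart shows the architecture of the application:

Prerequisites:

OpenAI API key: available from the OpenAI platform.

Step 1: Prepare to call the model

To start a conversation, begin with a system message and a user prompt for the task:

Create a messages array to keep track of the conversation history.
Include a system message in the messages array to establish the assistant's role and context.
Welcome the user with a greeting and prompt them to specify their task.
Add the user prompt to the messages array.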

const messages: ChatCompletionMessageParam[] = [];
console.log(StaticPrompts.welcome);
messages.push(SystemPrompts.context);
const userPrompt = await createUserMessage();
messages.push(userPrompt);

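As a personal preference, all the prompts are stored in objects for easy access and modification. The following code snippets show every prompt used in the application. Feel free to adopt or adapt this approach to your own needs.

StaticPrompts: static messages used throughout the conversation.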

export const StaticPrompts = {
  welcome:
    "Welcome to the farm assistant! What can I help you with today? You can ask me what I can do.",
  fallback: "I'm sorry, I don't understand.",
  end: "I hope I was able to help you. Goodbye!",
} as const;

UserPrompts: user messages generated from user input.

import OpenAI from "openai";
type ChatCompletionUserMessageParam = OpenAI.ChatCompletionUserMessageParam;
type UserPromptKey = "task";
type UserPromptValue = (userInput?: string) => ChatCompletionUserMessageParam;
export const UserPrompts: Record<UserPromptKey, UserPromptValue> = {
  task: (userInput) => ({
    role: "user",
    content: userInput || "What can you do?",
  }),
};

SystemPrompts: system messages generated from the system context.

import OpenAI from "openai";
type ChatCompletionSystemMessageParam = OpenAI.ChatCompletionSystemMessageParam;
type SystemPromptKey = "context";
export const SystemPrompts: Record<
  SystemPromptKey,
  ChatCompletionSystemMessageParam
> = {
  context: {
    role: "system",
    content:
      "You are a farm visit assistant. You are upbeat and friendly. You introduce yourself when first saying `Howdy!`. If you decide to call a function, you should retrieve the required fields for the function from the user. Your answer should be as precise as possible. If you have not yet retrieved the required fields of the function completely, you do not answer the question and inform the user you do not have enough information.",
  },
};

FunctionPrompts: function messages, which are essentially the return values of the functions.

import OpenAI from "openai";
type ChatCompletionToolMessageParam = OpenAI.ChatCompletionToolMessageParam;
type FunctionPromptKey = "function_response";
type FunctionPromptValue = (
  args: Omit<ChatCompletionToolMessageParam, "role">
) => ChatCompletionToolMessageParam;
export const FunctionPrompts: Record<FunctionPromptKey, FunctionPromptValue> = {
  function_response: (options) => ({
    role: "tool",
    ...options,
  }),
};

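Step 2: Define the tools

As mentioned earlier, tools are essentially descriptions of the functions the model can call. In this case, we define four tools to meet the requirements of the farm trip assistant agent:

get_farms: retrieves a list of farm destinations based on the user's location.
get_activities_per_farm: provides details on the activities available at a specific farm.
book_activity: facilitates the booking of a selected activity.
file_complaint: offers a straightforward process for filing a complaint.

The following code snippet demonstrates how these tools are defined: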

import OpenAI from "openai";
import {
  ConvertTypeNameStringLiteralToType,
  JsonAcceptable,
} from "../utils/type-utils.js";
type ChatCompletionTool = OpenAI.ChatCompletionTool;
type FunctionDefinition = OpenAI.FunctionDefinition;
// An enum to define the names of the functions. This will be shared between the function descriptions and the actual functions
export enum DescribedFunctionName {
  FileComplaint = "file_complaint",
  getFarms = "get_farms",
  getActivitiesPerFarm = "get_activities_per_farm",
  bookActivity = "book_activity",
}
// This is a utility type to narrow down the `parameters` type in the `FunctionDefinition`.
// It pairs with the keyword `satisfies` to ensure that the properties of parameters are correctly defined.
// This is a workaround as the default type of `parameters` in `FunctionDefinition` is `type FunctionParameters = Record<string, unknown>` which is overly broad.
type FunctionParametersNarrowed<
  T extends Record<string, PropBase<JsonAcceptable>>
> = {
  type: JsonAcceptable; // basically all the types that JSON can accept
  properties: T;
  required: (keyof T)[];
};
// This is a base type for each property of the parameters
type PropBase<T extends JsonAcceptable = "string"> = {
  type: T;
  description: string;
};
// This utility type transforms parameter property string literals into usable types for function parameters.
// Example: { email: { type: "string" } } -> { email: string }
export type ConvertedFunctionParamProps<
  Props extends Record<string, PropBase<JsonAcceptable>>
> = {
  [K in keyof Props]: ConvertTypeNameStringLiteralToType<Props[K]["type"]>;
};
// Define the parameters for each function
export type FileComplaintProps = {
  name: PropBase;
  email: PropBase;
  text: PropBase;
};
export type GetFarmsProps = {
  location: PropBase;
};
export type GetActivitiesPerFarmProps = {
  farm_name: PropBase;
};
export type BookActivityProps = {
  farm_name: PropBase;
  activity_name: PropBase;
  datetime: PropBase;
  name: PropBase;
  email: PropBase;
  number_of_people: PropBase<"number">;
};
// Define the function descriptions
const FunctionDescriptions: Record<
  DescribedFunctionName,
  FunctionDefinition
> = {
  [DescribedFunctionName.FileComplaint]: {
    name: DescribedFunctionName.FileComplaint,
    description: "File a complaint as a customer",
    parameters: {
      type: "object",
      properties: {
        name: {
          type: "string",
          description: "The name of the user, e.g. John Doe",
        },
        email: {
          type: "string",
          description: "The email address of the user, e.g. john@doe.com",
        },
        text: {
          type: "string",
          description: "Description of issue",
        },
      },
      required: ["name", "email", "text"],
    } satisfies FunctionParametersNarrowed<FileComplaintProps>,
  },
  [DescribedFunctionName.getFarms]: {
    name: DescribedFunctionName.getFarms,
    description: "Get the information of farms based on the location",
    parameters: {
      type: "object",
      properties: {
        location: {
          type: "string",
          description: "The location of the farm, e.g. Melbourne VIC",
        },
      },
      required: ["location"],
    } satisfies FunctionParametersNarrowed<GetFarmsProps>,
  },
  [DescribedFunctionName.getActivitiesPerFarm]: {
    name: DescribedFunctionName.getActivitiesPerFarm,
    description: "Get the activities available on a farm",
    parameters: {
      type: "object",
      properties: {
        farm_name: {
          type: "string",
          description: "The name of the farm, e.g. Collingwood Children's Farm",
        },
      },
      required: ["farm_name"],
    } satisfies FunctionParametersNarrowed<GetActivitiesPerFarmProps>,
  },
  [DescribedFunctionName.bookActivity]: {
    name: DescribedFunctionName.bookActivity,
    description: "Book an activity on a farm",
    parameters: {
      type: "object",
      properties: {
        farm_name: {
          type: "string",
          description: "The name of the farm, e.g. Collingwood Children's Farm",
        },
        activity_name: {
          type: "string",
          description: "The name of the activity, e.g. Goat Feeding",
        },
        datetime: {
          type: "string",
          description: "The date and time of the activity",
        },
        name: {
          type: "string",
          description: "The name of the user",
        },
        email: {
          type: "string",
          description: "The email address of the user",
        },
        number_of_people: {
          type: "number",
          description: "The number of people attending the activity",
        },
      },
      required: [
        "farm_name",
        "activity_name",
        "datetime",
        "name",
        "email",
        "number_of_people",
      ],
    } satisfies FunctionParametersNarrowed<BookActivityProps>,
  },
};
// Format the function descriptions into tools and export them
export const tools = Object.values(
  FunctionDescriptions
).map<ChatCompletionTool>((description) => ({
  type: "function",
  function: description,
}));
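The snippet above imports JsonAcceptable and ConvertTypeNameStringLiteralToType from a utility file that the article does not show. The following is only a guess at what those types could look like, written so that ConvertedFunctionParamProps behaves as the comments describe. BookingProps is a hypothetical example type.

```typescript
// Assumption: one possible shape for the unseen "../utils/type-utils.js" exports.
type JsonAcceptable = "string" | "number" | "boolean" | "object" | "array" | "null";

// Maps a JSON Schema type name to the matching TypeScript type.
type ConvertTypeNameStringLiteralToType<T extends JsonAcceptable> =
  T extends "string" ? string
  : T extends "number" ? number
  : T extends "boolean" ? boolean
  : T extends "array" ? unknown[]
  : T extends "null" ? null
  : Record<string, unknown>;

// Same definitions as in the article, repeated so this sketch stands alone.
type PropBase<T extends JsonAcceptable = "string"> = {
  type: T;
  description: string;
};

type ConvertedFunctionParamProps<
  Props extends Record<string, PropBase<JsonAcceptable>>
> = {
  [K in keyof Props]: ConvertTypeNameStringLiteralToType<Props[K]["type"]>;
};

// Hypothetical example: two of the book_activity parameters.
type BookingProps = {
  farm_name: PropBase;
  number_of_people: PropBase<"number">;
};

// Resolves to { farm_name: string; number_of_people: number }.
const booking: ConvertedFunctionParamProps<BookingProps> = {
  farm_name: "Collingwood Children's Farm",
  number_of_people: 4,
};

console.log(typeof booking.number_of_people); // number
```

With types like these, passing number_of_people as a string to a function typed with ConvertedFunctionParamProps<BookActivityProps> fails at compile time, which is the benefit the article's utility types aim for.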

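Understanding function descriptions

A function description requires the following keys:

name: identifies the function.
description: summarizes what the function does.
parameters: defines the function's parameters, including their types, descriptions, and whether they are required.
type: specifies the parameter type, typically an object.
properties: details each parameter, including its type and description.
required: lists the parameters that are essential for the function to operate.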
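Because the model may hallucinate parameters, the same description can double as a checklist before a function is executed. The sketch below is an illustration, not part of the original application: validateArguments is a hypothetical helper, and ParamSchema is a simplified local type that only handles string and number properties.

```typescript
// Simplified schema shape (assumption: mirrors the `parameters` objects, strings and numbers only).
type ParamSchema = {
  type: "object";
  properties: Record<string, { type: "string" | "number"; description: string }>;
  required: string[];
};

const bookingSchema: ParamSchema = {
  type: "object",
  properties: {
    farm_name: { type: "string", description: "The name of the farm" },
    number_of_people: { type: "number", description: "The number of people attending" },
  },
  required: ["farm_name", "number_of_people"],
};

// Hypothetical helper: returns a list of problems. An empty list means the arguments look usable.
function validateArguments(schema: ParamSchema, rawArguments: string): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawArguments);
  } catch {
    return ["arguments are not valid JSON"];
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return ["arguments are not a JSON object"];
  }
  const args = parsed as Record<string, unknown>;
  const problems: string[] = [];
  // Every key listed in `required` must be present.
  for (const key of schema.required) {
    if (!(key in args)) problems.push(`missing required parameter: ${key}`);
  }
  // Every supplied key must be declared in `properties` with a matching type.
  for (const key of Object.keys(args)) {
    const prop = schema.properties[key];
    if (!prop) problems.push(`unexpected parameter: ${key}`);
    else if (typeof args[key] !== prop.type) problems.push(`${key} should be a ${prop.type}`);
  }
  return problems;
}

const argumentProblems = validateArguments(
  bookingSchema,
  '{"farm_name":"Farm 1","number_of_people":"four","vip":true}'
);
console.log(argumentProblems);
// [ 'number_of_people should be a number', 'unexpected parameter: vip' ]
```

A production application would more likely use a full JSON Schema validator. The hand-written check is only meant to show which parts of the description carry which guarantees.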

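Adding a new function

To introduce a new function:

Extend DescribedFunctionName with a new enum member, such as DoNewThings.
Define a Props type for the parameters, e.g., DoNewThingsProps.
Insert a new entry in the FunctionDescriptions object.
Implement the new function in the functions directory, naming it after the enum value.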
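As a rough illustration of those four steps, here is a self-contained sketch that adds a made-up get_opening_hours tool. The tool name, its parameter, and the returned data are all hypothetical, and the types are trimmed-down local copies so the example runs on its own.

```typescript
// Step 1: extend the enum with a new member (existing members other than getFarms omitted).
enum DescribedFunctionName {
  getFarms = "get_farms",
  getOpeningHours = "get_opening_hours", // hypothetical new function
}

// Trimmed-down local copies of the article's types.
type PropBase = { type: "string"; description: string };
type FunctionDefinition = {
  name: string;
  description: string;
  parameters: { type: "object"; properties: Record<string, PropBase>; required: string[] };
};

// Step 2: define a Props type for the parameters.
type GetOpeningHoursProps = { farm_name: PropBase };

const openingHoursProps: GetOpeningHoursProps = {
  farm_name: { type: "string", description: "The name of the farm" },
};

// Step 3: insert a new entry in the descriptions object
// (Partial because the get_farms entry is left out of this sketch).
const FunctionDescriptions: Partial<Record<DescribedFunctionName, FunctionDefinition>> = {
  [DescribedFunctionName.getOpeningHours]: {
    name: DescribedFunctionName.getOpeningHours,
    description: "Get the opening hours of a farm",
    parameters: {
      type: "object",
      properties: openingHoursProps,
      required: ["farm_name"],
    },
  },
};

// Step 4: implement the function, named after the enum value (mocked data, not a real lookup).
function get_opening_hours(args: { farm_name: string }): string {
  return JSON.stringify({ farm_name: args.farm_name, hours: "9am-5pm" });
}

const AvailableFunctions = {
  [DescribedFunctionName.getOpeningHours]: get_opening_hours,
};

console.log(Object.keys(FunctionDescriptions)); // [ 'get_opening_hours' ]
console.log(AvailableFunctions[DescribedFunctionName.getOpeningHours]({ farm_name: "Farm 1" }));
// {"farm_name":"Farm 1","hours":"9am-5pm"}
```

Keeping the enum value, the description name, and the implementation name identical is what lets the application look up the right function from the name the model returns.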

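Step 3: Call the model with the messages and the tools

With the messages and tools set up, we are ready to call the model with them.

Note that, as of March 2024, function calling is supported only by the gpt-3.5-turbo-0125 and gpt-4-turbo-preview models.

Code implementation: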

export const startChat = async (messages: ChatCompletionMessageParam[]) => {
  const response = await openai.chat.completions.create({
    model: "gpt-3.5-turbo",
    top_p: 0.95,
    temperature: 0.5,
    max_tokens: 1024,
    messages, // The messages array we created earlier
    tools, // The function descriptions we defined earlier
    tool_choice: "auto", // The model will decide whether to call a function and which function to call
  });
  const { message } = response.choices[0] ?? {};
  if (!message) {
    throw new Error("Error: No response from the API.");
  }
  messages.push(message);
  return processMessage(message);
};

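Tool choice options

The tool_choice option controls how the model approaches function calls:

Specific function: To force a particular function, set tool_choice to an object with type "function" and the function's name. For example, tool_choice: { type: "function", function: { name: "get_farms" } } tells the model to call the get_farms function regardless of context. Even a simple user prompt such as "Hi." will trigger this function call.
No function: To have the model generate a response without any function calls, use tool_choice: "none". This option makes the model rely solely on the input messages to generate its response.
Automatic selection: The default setting, tool_choice: "auto", lets the model decide on its own, based on the conversation context, whether to call a function and which one. This flexibility allows dynamic decision-making about function calls.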
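The three forms can be written down as a small union type. In the sketch below, ToolChoice is a simplified local type and chooseToolChoice is a hypothetical helper. Neither comes from the original application.

```typescript
// Local type mirroring the three forms described above (assumption: simplified, not the SDK type).
type ToolChoice =
  | "auto"
  | "none"
  | { type: "function"; function: { name: string } };

// Hypothetical helper: pick a tool_choice value for the next request.
function chooseToolChoice(options: {
  allowTools: boolean;
  forceFunction?: string;
}): ToolChoice {
  if (!options.allowTools) {
    return "none"; // text-only reply, no function calls
  }
  if (options.forceFunction) {
    // Forces this function even for a prompt like "Hi."
    return { type: "function", function: { name: options.forceFunction } };
  }
  return "auto"; // let the model decide
}

console.log(JSON.stringify(chooseToolChoice({ allowTools: true, forceFunction: "get_farms" })));
// {"type":"function","function":{"name":"get_farms"}}
console.log(chooseToolChoice({ allowTools: true })); // auto
console.log(chooseToolChoice({ allowTools: false })); // none
```

One plausible use is to return "none" for the final summarizing call when the application wants a plain-text answer and no further function requests.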

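Step 4: Handle the model response

The model's responses fall into two main categories, plus a possible error case that calls for a fallback message:

1. Function call request: The model indicates that it wants to call one or more functions. This is where function calling shows its real potential. The model selects which function to execute based on the context and the user query. For example, if the user asks for farm recommendations, the model may suggest calling the get_farms function.

It does not stop there. The model also analyzes the user input to determine whether it contains the information (arguments) that the function call needs. If not, the model prompts the user for the missing details.

Once all the required information (arguments) has been gathered, the model returns a JSON object detailing the function name and the arguments. This structured response converts easily into a JavaScript object in the application, so we can invoke the specified function directly.

In addition, the model can choose to call multiple functions, either simultaneously or in sequence, each requiring its own details. Managing this correctly in the application is essential for smooth operation.

Example of a model response: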

{
  "role": "assistant",
  "content": null,
  "tool_calls": [
    {
      "id": "call_JWoPQYmdxNXdNu1wQ1iDqz2z",
      "type": "function",
      "function": {
        "name": "get_farms", // The function name to be called
        "arguments": "{\"location\":\"Melbourne\"}" // The arguments required for the function
      }
    }
    ... // multiple function calls can be present
  ]
}

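2. Plain text reply: The model provides a direct text reply. This is the standard output we are used to from AI models, a direct answer to the user's query. For these replies, simply return the text content.

Example of a model response: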

{
  "role": "assistant",
  "content": "I can help you with that. What is your location?"
}

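The key difference is the presence of the tool_calls key for function calls. If tool_calls is present, the model is requesting that a function be executed. Otherwise, it is providing a direct text response.

To handle these responses, consider the following approach based on the response type: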

type ChatCompletionMessageWithToolCalls = RequiredAll<
  Omit<ChatCompletionMessage, "function_call">
>;
// If the message contains tool_calls, it extracts the function arguments. Otherwise, it returns the content of the message.
export function processMessage(message: ChatCompletionMessage) {
  if (isMessageHasToolCalls(message)) {
    return extractFunctionArguments(message);
  } else {
    return message.content;
  }
}
// Check if the message has `tool calls`
function isMessageHasToolCalls(
  message: ChatCompletionMessage
): message is ChatCompletionMessageWithToolCalls {
  return isDefined(message.tool_calls) && message.tool_calls.length !== 0;
}
// Extract function name and arguments from the message
function extractFunctionArguments(message: ChatCompletionMessageWithToolCalls) {
  return message.tool_calls.map((toolCall) => {
    if (!isDefined(toolCall.function)) {
      throw new Error("No function found in the tool call");
    }
    try {
      return {
        tool_call_id: toolCall.id,
        function_name: toolCall.function.name,
        arguments: JSON.parse(toolCall.function.arguments),
      };
    } catch (error) {
      throw new Error("Invalid JSON in function arguments");
    }
  });
}
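The code in this step uses RequiredAll, isDefined, and isNonEmptyString without showing their definitions. The following sketch is an assumption about what such helpers might look like, not the author's actual implementation. The Draft type is a hypothetical example.

```typescript
// Assumed definitions: the article does not show these helpers.

// Makes every property required and strips null and undefined from its type.
type RequiredAll<T> = { [K in keyof T]-?: NonNullable<T[K]> };

// Type guard: true for anything except null and undefined.
function isDefined<T>(value: T | null | undefined): value is T {
  return value !== null && value !== undefined;
}

// Type guard: true for strings that contain something other than whitespace.
function isNonEmptyString(value: unknown): value is string {
  return typeof value === "string" && value.trim().length > 0;
}

// Hypothetical message-like type with optional and nullable fields.
type Draft = { content?: string | null; tool_calls?: string[] };

// With RequiredAll, both fields must be present and non-null.
const completeDraft: RequiredAll<Draft> = { content: "hi", tool_calls: [] };

console.log(isDefined(completeDraft.tool_calls)); // true
console.log(isDefined(null)); // false
console.log(isNonEmptyString("Howdy!")); // true
console.log(isNonEmptyString("   ")); // false
```

Note that an empty tool_calls array is still "defined", which is why isMessageHasToolCalls also checks the array length.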

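The arguments extracted from the function calls are then used to execute the actual functions in the application, while the text content helps carry the conversation forward.

Below is an if-else block illustrating how this process unfolds: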

const result = await startChat(messages);
if (!result) {
  // Fallback message if response is empty (e.g., network error)
  console.log(StaticPrompts.fallback);
} else if (isNonEmptyString(result)) {
  // If the response is a string, log it and prompt the user for the next message
  console.log(`Assistant: ${result}`);
  const userPrompt = await createUserMessage();
  messages.push(userPrompt);
} else {
  // If the response contains function calls, execute the functions and call the model again with the updated messages
  for (const item of result) {
    const { tool_call_id, function_name, arguments: function_arguments } = item;
    // Execute the function and get the function return
    const functionReturn = await AvailableFunctions[
      function_name as keyof typeof AvailableFunctions
    ](function_arguments);
    // Add the function output back to the messages with a role of "tool", the id of the tool call, and the function return as the content
    messages.push(
      FunctionPrompts.function_response({
        tool_call_id,
        content: functionReturn,
      })
    );
  }
}

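Step 5: Execute the function and call the model again

When the model requests a function call, we execute that function in our application and then update the model with the new messages. This keeps the model informed about the function's result so it can give the user a relevant reply.

Maintaining the correct order of function executions is crucial, especially when the model chooses to run several functions in sequence to complete a task. Using a for loop rather than Promise.all preserves the execution order, which is essential for a successful workflow. However, if the functions are independent and can run in parallel, consider custom optimizations to improve performance.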
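To make the difference concrete, here is a small sketch that compares the two strategies. Timers simulate slow functions. runTool and the delayMs field are hypothetical stand-ins, not part of the original application.

```typescript
type ToolCallRequest = { tool_call_id: string; function_name: string; delayMs: number };

// Hypothetical async tool: records when it finishes, then resolves.
function runTool(call: ToolCallRequest, log: string[]): Promise<string> {
  return new Promise((resolve) => {
    setTimeout(() => {
      log.push(call.function_name);
      resolve(`${call.function_name} done`);
    }, call.delayMs);
  });
}

// Sequential: each call starts only after the previous one has finished.
async function runSequentially(calls: ToolCallRequest[], log: string[]): Promise<string[]> {
  const results: string[] = [];
  for (const call of calls) {
    results.push(await runTool(call, log));
  }
  return results;
}

// Parallel: all calls start at once. Promise.all still returns results in input order,
// but the functions themselves may finish in any order.
function runInParallel(calls: ToolCallRequest[], log: string[]): Promise<string[]> {
  return Promise.all(calls.map((call) => runTool(call, log)));
}

const toolCalls: ToolCallRequest[] = [
  { tool_call_id: "call_1", function_name: "get_farms", delayMs: 20 },
  { tool_call_id: "call_2", function_name: "get_activities_per_farm", delayMs: 5 },
];

async function compareStrategies() {
  const sequentialLog: string[] = [];
  const parallelLog: string[] = [];
  await runSequentially(toolCalls, sequentialLog);
  await runInParallel(toolCalls, parallelLog);
  return { sequentialLog, parallelLog };
}

compareStrategies().then(({ sequentialLog, parallelLog }) => {
  console.log(sequentialLog); // [ 'get_farms', 'get_activities_per_farm' ]
  console.log(parallelLog); // [ 'get_activities_per_farm', 'get_farms' ]
});
```

If a later function depends on a side effect of an earlier one, such as booking after checking availability, the sequential version is the safer default. The parallel version only fits calls that are genuinely independent.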

Here is how to execute the functions:

for (const item of result) {
  const { tool_call_id, function_name, arguments: function_arguments } = item;
  console.log(
    `Calling function "${function_name}" with ${JSON.stringify(
      function_arguments
    )}`
  );
  // Available functions are stored in an object for easy access
  const functionReturn = await AvailableFunctions[
    function_name as keyof typeof AvailableFunctions
  ](function_arguments);
}

And here is how to update the messages array with the function response:

for (const item of result) {
  const { tool_call_id, function_name, arguments: function_arguments } = item;
  console.log(
    `Calling function "${function_name}" with ${JSON.stringify(
      function_arguments
    )}`
  );
  const functionReturn = await AvailableFunctions[
    function_name as keyof typeof AvailableFunctions
  ](function_arguments);
  // Add the function output back to the messages with a role of "tool", the id of the tool call, and the function return as the content
  messages.push(
    FunctionPrompts.function_response({
      tool_call_id,
      content: functionReturn,
    })
  );
}
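The lookup above casts function_name to a key of AvailableFunctions, so a name the model invents would fail at runtime with an unhelpful error. One possible guard, sketched here under assumptions, checks the name first and reports failures back as tool content so the model can react. callToolSafely and the one-entry registry are hypothetical.

```typescript
type ToolHandler = (args: Record<string, unknown>) => Promise<string>;

// Hypothetical registry standing in for the article's AvailableFunctions object (mocked data).
const AvailableFunctions: Record<string, ToolHandler> = {
  get_farms: async (args) => JSON.stringify({ location: args.location, farms: [] }),
};

// Looks up the handler by name. Unknown names and thrown errors are returned
// as JSON strings, so they can still be pushed as the content of a tool message.
async function callToolSafely(
  functionName: string,
  args: Record<string, unknown>
): Promise<string> {
  const handler = Object.prototype.hasOwnProperty.call(AvailableFunctions, functionName)
    ? AvailableFunctions[functionName]
    : undefined;
  if (!handler) {
    return JSON.stringify({ error: `Unknown function: ${functionName}` });
  }
  try {
    return await handler(args);
  } catch (error) {
    const reason = error instanceof Error ? error.message : String(error);
    return JSON.stringify({ error: reason });
  }
}

callToolSafely("delete_farm", {}).then((content) => console.log(content));
// {"error":"Unknown function: delete_farm"}
callToolSafely("get_farms", { location: "Melbourne" }).then((content) => console.log(content));
// {"location":"Melbourne","farms":[]}
```

Whether to surface the error to the model or stop the conversation is a design choice. Returning it as tool content keeps the message sequence valid, since each tool call still gets a matching tool message.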

Example of a function that can be called:

// Mocking getting farms based on location from a database
export async function get_farms(
  args: ConvertedFunctionParamProps<GetFarmsProps>
): Promise<string> {
  const { location } = args;
  return JSON.stringify({
    location,
    farms: [
      {
        name: "Farm 1",
        location: "Location 1",
        rating: 4.5,
        products: ["product 1", "product 2"],
        activities: ["activity 1", "activity 2"],
      },
      ...
    ],
  });
}

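Example of a tool message with the function response (the content is shown expanded for readability; get_farms returns it as a JSON string):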

{
  "role": "tool",
  "tool_call_id": "call_JWoPQYmdxNXdNu1wQ1iDqz2z",
  "content": {
    // Function return value
    "location": "Melbourne",
    "farms": [
      {
        "name": "Farm 1",
        "location": "Location 1",
        "rating": 4.5,
        "products": [
          "product 1",
          "product 2"
        ],
        "activities": [
          "activity 1",
          "activity 2"
        ]
      },
      ...
    ]
  }
}

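Step 6: Summarize the results back to the user

After running the functions and updating the messages array, we call the model again with the updated messages to brief the user on the results. This involves calling the startChat function repeatedly in a loop.

To avoid an endless loop, it is crucial to watch for user input that signals the end of the conversation, such as "Goodbye" or "End", so that the loop terminates properly.

Code implementation: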

const CHAT_END_SIGNALS = [
  "end",
  "goodbye",
  ...
];
export function isChatEnding(
  message: ChatCompletionMessageParam | undefined | null
) {
  // If the message is not defined, throw an error
  if (!isDefined(message)) {
    throw new Error("Cannot find the message!");
  }
  // Check if the message is from the user
  if (!isUserMessage(message)) {
    return false;
  }
  const { content } = message;
  return CHAT_END_SIGNALS.some((signal) => {
    if (typeof content === "string") {
      return includeSignal(content, signal);
    } else {
      // content has a typeof ChatCompletionContentPart, which can be either ChatCompletionContentPartText or ChatCompletionContentPartImage
      // If user attaches an image to the current message first, we assume they are not ending the chat
      const contentPart = content.at(0);
      if (contentPart?.type !== "text") {
        return false;
      } else {
        return includeSignal(contentPart.text, signal);
      }
    }
  });
}
function isUserMessage(
  message: ChatCompletionMessageParam
): message is ChatCompletionUserMessageParam {
  return message.role === "user";
}
function includeSignal(content: string, signal: string) {
  return content.toLowerCase().includes(signal);
}
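One thing to watch with substring matching is that a message such as "Can you recommend a farm?" contains "end" and would end the chat. The sketch below shows a stricter whole-word check inside a scripted loop. includesSignalAsWord and runScriptedChat are hypothetical, the scripted inputs stand in for createUserMessage, and the placeholder reply stands in for startChat.

```typescript
const END_SIGNALS = ["end", "goodbye"];

// Substring check, equivalent in effect to includeSignal above.
function includesSignalAsSubstring(content: string, signal: string): boolean {
  return content.toLowerCase().indexOf(signal) !== -1;
}

// Hypothetical stricter variant: the signal must appear as a whole word.
// Limitation: only handles single-word signals made of the letters a-z.
function includesSignalAsWord(content: string, signal: string): boolean {
  const words = content.toLowerCase().split(/[^a-z]+/);
  return words.indexOf(signal) !== -1;
}

// Scripted loop: stops as soon as a user input contains an end signal.
function runScriptedChat(inputs: string[]): string[] {
  const chatLog: string[] = [];
  for (const input of inputs) {
    chatLog.push(`User: ${input}`);
    if (END_SIGNALS.some((signal) => includesSignalAsWord(input, signal))) {
      chatLog.push("Assistant: I hope I was able to help you. Goodbye!");
      break; // leave the loop once the user signals the end
    }
    chatLog.push("Assistant: (model reply)"); // placeholder for a real model call
  }
  return chatLog;
}

console.log(includesSignalAsSubstring("Can you recommend a farm?", "end")); // true
console.log(includesSignalAsWord("Can you recommend a farm?", "end")); // false
console.log(runScriptedChat(["Can you recommend a farm?", "Goodbye!", "ignored"]).length); // 4
```

Whole-word matching still treats a sentence like "When does the tour end?" as an end signal, so an explicit command or a confirmation prompt may be a more robust choice.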

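Conclusion

OpenAI's function calling is a significant step forward for AI applications: it lets models request the execution of custom functions in response to user queries. This feature simplifies obtaining structured data from model output, improves user interaction, and enables more complex exchanges.

Source: https://medium.com/towards-data-science/create-an-agent-with-openai-function-calling-capabilities-ad52122c3d12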
