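Abstract
This report presents a workaround for the Gemini API's current inability to directly process web content from a URL. The approach uses Google Apps Script to retrieve the content of the specified URLs and pass it to the API for summarization. Until the API limitation is resolved, this offers a practical way to generate summaries from web-based content.
Introduction
Although the Gemini API offers powerful text generation, it currently cannot directly access and process web content from a URL. When prompted to summarize an article at a specific URL, for example "Summarize the article at the following URL. https://###", the API typically returns an error indicating that it could not retrieve the required information. This limitation stems from the API's current design, which may not include the ability to make web requests and parse HTML content.
This limitation is expected to be addressed in a future update. In the meantime, this report proposes a workaround that uses Google Apps Script. Because Apps Script can call web APIs and process HTML data, it can retrieve the content of a given URL and pass it to the Gemini API for summarization. This works around the API's current limitation and makes it possible to generate summaries from web-based content.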
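Workaround steps
Usage
1. Create a Google Apps Script project
Please create a Google Apps Script project. Either a container-bound script or a standalone script can be used.
2. Create an API key
Please visit https://ai.google.dev/gemini-api/docs/api-key and create an API key. Then enable the Generative Language API in the API console. The sample script uses this API key.
Alternatively, if you link a Google Cloud Platform project to the Google Apps Script project, you can use an access token instead.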
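The two authentication options in step 2 differ only in how the credential is attached to the request. The following is a minimal sketch of that difference for a direct REST call, not the way the GeminiWithFiles library handles it internally. The function name buildGeminiRequest, the model name, and the v1beta path are assumptions chosen for illustration.

```javascript
// Sketch: build the endpoint URL and headers for a Gemini API call using
// either an API key or an OAuth access token.
// Assumptions: buildGeminiRequest is a hypothetical helper; the model name
// and the v1beta endpoint path are illustrative and may need adjusting.
function buildGeminiRequest({ apiKey, accessToken, model = "gemini-1.5-flash" }) {
  const base = `https://generativelanguage.googleapis.com/v1beta/models/${model}:generateContent`;
  const headers = { "Content-Type": "application/json" };
  if (accessToken) {
    // The access token goes into the Authorization header.
    // In Apps Script, ScriptApp.getOAuthToken() can supply this value.
    headers.Authorization = `Bearer ${accessToken}`;
    return { url: base, headers };
  }
  if (apiKey) {
    // The API key goes into the "key" query parameter.
    return { url: `${base}?key=${encodeURIComponent(apiKey)}`, headers };
  }
  throw new Error("Set either apiKey or accessToken.");
}
```

When an access token is used, the linked Cloud project presumably also needs the Generative Language API enabled and suitable OAuth scopes in the script manifest.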
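3. Install the Google Apps Script library
This script uses the Google Apps Script library GeminiWithFiles, so please install it.
4. Script
Please copy and paste the following script into the script editor of the Google Apps Script project you created.
Please set your API key and URLs in the function main.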
/**
 * ### Description
 * Convert HTML of the inputted URL to PDF blob.
 *
 * @param {Object} object Object for running this method.
 * @param {String} object.url URL you want to use.
 * @param {Boolean} object.convertByGoogleDoc When true, the HTML is converted to PDF by way of a Google Document. Most cases do not need this. The default value is false.
 *
 * @return {Blob} PDF blob converted from HTML of the URL is returned.
 */
function convertHTMLToPDFBlob_(object) {
  const { url, convertByGoogleDoc = false } = object;
  console.log(`--- Get HTML from "${url}".`);
  const res = UrlFetchApp.fetch(url, { muteHttpExceptions: true });
  let text = res.getContentText();
  if (res.getResponseCode() != 200) {
    throw new Error(text);
  }
  console.log(`--- Convert image data.`);
  // Convert the source URL of img tag to the data URL.
  // matchAll returns an iterator, so spread it into an array before forEach.
  [...text.matchAll(/<img.*?>/g)].forEach(e => {
    const t = e[0].match(/src\=["'](http.*?)["']/);
    if (t) {
      const imageUrl = t[1];
      const r = UrlFetchApp.fetch(imageUrl.trim(), { muteHttpExceptions: true });
      if (r.getResponseCode() == 200) {
        const blob = r.getBlob();
        const dataUrl = `data:${blob.getContentType()};base64,${Utilities.base64Encode(blob.getBytes())}`;
        text = text.replace(imageUrl, dataUrl);
      }
    }
  });
  // For medium: images are in <picture> elements with a srcSet attribute.
  if (url.includes("medium.com")) {
    [...text.matchAll(/<picture>.*?<\/picture>/g)].forEach(e => {
      const t = e[0].match(/srcSet\=["'](http.*?)["']/);
      if (t) {
        // srcSet lists "URL width" entries. Use the first URL.
        const imageUrl = t[1].split(" ")[0].trim();
        const r = UrlFetchApp.fetch(imageUrl, { muteHttpExceptions: true });
        if (r.getResponseCode() == 200) {
          const blob = r.getBlob();
          const dataUrl = `data:${blob.getContentType()};base64,${Utilities.base64Encode(blob.getBytes())}`;
          // Replace the whole <picture> element with a closed <img> tag.
          text = text.replace(e[0], `<img src="${dataUrl}">`);
        }
      }
    });
  }
  let pdfBlob;
  if (convertByGoogleDoc) {
    // This branch requires the Drive advanced service (Drive API v3).
    console.log(`--- Convert HTML to PDF blob with Google Docs.`);
    const doc = Drive.Files.create({ name: "temp", mimeType: MimeType.GOOGLE_DOCS }, Utilities.newBlob(text, MimeType.HTML));
    pdfBlob = DriveApp.getFileById(doc.id).getBlob().setName(url);
    Drive.Files.remove(doc.id);
  } else {
    console.log(`--- Convert HTML to PDF blob.`);
    pdfBlob = Utilities.newBlob(text, MimeType.HTML).getAs(MimeType.PDF).setName(url);
  }
  console.log(`--- Completely converted HTML to PDF blob.`);
  return pdfBlob;
}

// Please run this function.
function main() {
  const apiKey = "###"; // Please set your API key.

  // Please set your URLs you want to use.
  // These are the samples.
  const urls = [
    "https://tanaikech.github.io/2024/06/15/unlock-smart-invoice-management-gemini-gmail-and-google-apps-script-integration/",
    "https://tanaikech.github.io/2024/08/08/a-novel-approach-to-learning-combining-gemini-with-google-apps-script-for-automated-qa/",
  ];

  // Prompt
  const jsonSchema = {
    description: "Summarize the articles of the following PDF files within 100 words, respectively. Return the result as an array.",
    type: "array",
    items: {
      type: "object",
      properties: {
        url: { type: "string", description: "Filename" },
        summary: { type: "string", description: "Summary of PDF." }
      },
      required: ["summary"],
      additionalProperties: false,
    }
  };
  const q = `Follow JSON schema.<jsonSchema>${JSON.stringify(jsonSchema)}</jsonSchema>`;

  // Upload PDF
  const blobs = urls.map(url => convertHTMLToPDFBlob_({ url, convertByGoogleDoc: false }));

  // Generate content with Gemini API.
  const g = GeminiWithFiles.geminiWithFiles({ apiKey, response_mime_type: "application/json" });
  const fileList = g.setBlobs(blobs).uploadFiles();
  const res = g.withUploadedFilesByGenerateContent(fileList).generateContent({ q });
  console.log(res);
}
Running main produces the following result:
[
{
"url": "https://tanaikech.github.io/2024/06/15/unlock-smart-invoice-management-gemini-gmail-and-google-apps-script-integration/",
"summary": "This article describes an invoice processing application built with Google Apps Script that leverages Gemini, a large language model, to automate the parsing of invoices received as email attachments and streamlines the processing of invoices. It details how the application retrieves emails from Gmail, uses the Gemini API to parse the extracted invoices, and leverages time-driven triggers for automatic execution."
},
{
"url": "https://tanaikech.github.io/2024/08/08/a-novel-approach-to-learning-combining-gemini-with-google-apps-script-for-automated-qa/",
"summary": "This article proposes a novel learning method using Gemini to automate Q&A generation, addressing the challenges of manual Q&A creation. By integrating with Google tools, this approach aims to enhance learning efficiency, accessibility, and personalization while reducing costs. It presents a groundbreaking learning approach that integrates Gemini with widely used Google tools: Forms, Spreadsheets, and Apps Script as an application implemented on Google Spreadsheet."
}
]
As shown, a summary of the HTML content at each URL is generated correctly.
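Because the JSON schema in main marks only summary as required, an item in the result may arrive without a url. The following is a minimal sketch of one way to match the returned summaries back to the requested URLs. The helper name indexSummariesByUrl is hypothetical, and the sketch assumes the result is either a JSON string or an already parsed array shaped like the sample output.

```javascript
// Sketch: normalize the summarization result and match it to the input URLs.
// Assumption: `res` is either a JSON string or an already parsed array of
// { url, summary } objects, as in the sample output.
function indexSummariesByUrl(res, urls) {
  const items = typeof res === "string" ? JSON.parse(res) : res;
  if (!Array.isArray(items)) {
    throw new Error("Expected an array of { url, summary } objects.");
  }
  // Keep only items that carry a url, since the schema does not require it.
  const byUrl = new Map(items.filter(e => e && e.url).map(e => [e.url, e.summary]));
  // For each requested URL, return its summary, or null when it is missing.
  return urls.map(url => ({ url, summary: byUrl.has(url) ? byUrl.get(url) : null }));
}
```

In main, this could be called as indexSummariesByUrl(res, urls) right after generateContent, so that a missing summary shows up as null instead of being silently dropped.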
To use a single URL, set const urls = ["###URL###"];
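The image handling in convertHTMLToPDFBlob_ replaces remote image URLs with data URLs so that the images survive the PDF conversion. The following is a minimal sketch of that step as a standalone function that can be tried outside Apps Script. It is a variation for illustration, not the script's exact logic: the names inlineImages and fetchImage are hypothetical, and the image fetcher is passed in as a callback.

```javascript
// Sketch: replace remote <img src="..."> URLs in an HTML string with data URLs.
// Assumption: `fetchImage` is a hypothetical callback that returns
// { contentType, base64 } for a URL, or null when the image cannot be fetched.
function inlineImages(html, fetchImage) {
  return html.replace(/<img\b[^>]*>/g, tag => {
    // Only absolute http(s) sources are inlined. Other tags are left as-is.
    const src = tag.match(/\ssrc=["'](https?:[^"']+)["']/);
    if (!src) return tag;
    const image = fetchImage(src[1].trim());
    if (!image) return tag;
    const dataUrl = `data:${image.contentType};base64,${image.base64}`;
    // A replacer function avoids special handling of "$" in the replacement.
    return tag.replace(src[1], () => dataUrl);
  });
}
```

In Apps Script, fetchImage could wrap UrlFetchApp.fetch and Utilities.base64Encode in the same way the script above does, returning null for any response code other than 200.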