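The above is an example of one of the copilot features you will build. %graph tells our copilot that I want to ask a question about a graph. It takes a cell reference, such as --in16, that points to the cell containing the graph to analyze, followed by a prompt stating what you want to know about the graph, and it prints the model's answer. It runs in Anaconda Jupyter Notebooks, VS Code Notebooks, Jupyter Lab, or any other local notebook environment.
Setting the Stage
To build the Copilot's features, the first step is to initialize the Gemini multimodal model. For that, you need to install a few libraries: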
# Install necessary libraries
pip install -q -U google-generativeai grpcio grpcio-tools
Now we need to import the library used to make Gemini LLM API calls and configure it with your API key.
# Import the Google Generative AI library
import google.generativeai as genai
# Configure the library with your API key
genai.configure(api_key="Your-API-key")
# Initialize the GenerativeModel with 'gemini-pro' for chat and code
text_model = genai.GenerativeModel('gemini-pro')
# Initialize the GenerativeModel with 'gemini-pro-vision' for graphs
image_model = genai.GenerativeModel('gemini-pro-vision')
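We have loaded two models: gemini-pro, the text model for generating code and for code-related chat, and gemini-pro-vision, which handles the Copilot's image-related features. Next, we import the libraries used to build the Copilot's features.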
# Regular expression for pattern matching
import re
# IPython for working with IPython environment
import IPython
# OS for interacting with the operating system
import os
# JSON for working with JSON data
import json
# Base64 for encoding and decoding base64 data
import base64
# Image class from IPython.display for displaying images
from IPython.display import Image
# register_line_magic for registering custom magic commands
from IPython.core.magic import register_line_magic
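Let's start by coding one of the Copilot's simplest features: chat. We start here because it makes the later code easier to follow as we build more complex features.
Simple Chat Feature
You are coding in a notebook and realize you need to ask ChatGPT something. To avoid switching to a browser tab to chat, we will create a chat feature that lets you chat right next to your code cells. Our "chat" feature takes one input, the prompt, and the Gemini text model provides the answer in response.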
# Registering a Jupyter Notebook magic command named 'chat'
@register_line_magic
def chat(contents):
    # Generating a response using the 'generate_content' method of the 'text_model' object
    # The method takes a formatted string containing the provided 'contents'
    response = text_model.generate_content(f'''
    Answer the question in a short quick readable paragraph, dont provide answer in any format or code
    {contents}
    ''').text
    # Printing the generated response to the output
    print(response)
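Two lines in the chat function matter. The first is the @register_line_magic decorator, which lets us call the function as %chat instead of chat(). This makes it feel more like an AI assistant, although it is not required. The second is the prompt template. It was chosen because Gemini tends to produce chat replies in markdown format in most cases, so Gemini has to be told not to use markup or code formatting in its reply. You can update the prompt template to suit your needs.
You can use the "chat" feature in any code cell. Pass %chat [your_question], and it will print the reply.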
# Running Chat Feature
%chat What are some useful libraries for coding neural networks in Python
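Chat with Code Feature
This feature lets you chat with your code inside the notebook, so you don't have to go to ChatGPT separately, paste your code, and ask your question there. The "Chat with Code" feature needs two things: your prompt and the code you want to ask about.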
# Define a line magic named 'chatn' that takes 'contents' as a parameter
@register_line_magic
def chatn(contents):
    try:
        # Use regular expression to find all occurrences of '--in' followed by digits in 'contents'
        numbers = [int(match.group().replace('--in', '')) for match in re.finditer(r'--in\d+', contents)]
        # Remove the found pattern '--in\d+' from 'contents'
        contents_filter = re.sub(r'--in\d+', '', contents)
        # Check if there are any references (numbers) found
        if numbers:
            # Retrieve the source of every referenced cell using the IPython 'In' variable
            current_cell_contents = [In[number] for number in numbers]
            # Combine the contents into a single string with line breaks
            combined_content = '\n'.join(current_cell_contents)
            # Execute the text_model to generate response
            response = text_model.generate_content(f'''
            {combined_content}
            Answer the question in a short readable paragraph, don't provide the answer in any format or code
            {contents_filter}
            ''').text
            # Print the generated response
            print(response)
        else:
            # Print an error message if no references are found
            print('Please provide a correct codeblock reference.')
    except Exception as e:
        # Print an error message if an exception occurs
        print('Please provide a correct codeblock reference.')
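Let's walk through the chatn function. The try-except block prevents an error when a cell reference is missing or invalid. The first thing we do is use a regex to extract every --in pattern as a cell reference and strip those patterns from the prompt so they are not passed to the Gemini API. I used the --in format for cell references because it is easy to remember. In[number] fetches the code of each cell number mentioned in the prompt; the code is combined and passed to the model together with the cleaned prompt. You can pass any number of cell references, in any order.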
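The reference-parsing step can be tried on its own, without calling Gemini. The following is a minimal sketch; parse_cell_refs is a hypothetical helper name that does not appear in the original code, and it also collapses leftover whitespace, which the original function does not do.

```python
import re

# Hypothetical helper (not part of the original copilot code): split a
# magic-command line into referenced cell numbers and the remaining prompt.
CELL_REF = re.compile(r'--in(\d+)')

def parse_cell_refs(line):
    # Collect every number that follows '--in', in the order given
    numbers = [int(n) for n in CELL_REF.findall(line)]
    # Drop the references and collapse the whitespace they leave behind
    prompt = ' '.join(CELL_REF.sub('', line).split())
    return numbers, prompt

numbers, prompt = parse_cell_refs('--in17 --in11 I sum element wise but it is not working')
print(numbers)  # [17, 11]
print(prompt)   # I sum element wise but it is not working
```

Keeping the parsing in a small pure function like this makes it easy to check the pattern before wiring it into a magic command.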
To use the "Chat with Code" feature, pass %chatn [cell_references] [your_question], and it will print the reply.
# Running Chat with Code Feature
%chatn --in17 --in11 I sum element wise but it is not working
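You may think this is a very simple problem, but the feature also works on more complex code.
Generate Code Feature
Code generation is one of the most important features, and you will probably use it all the time. We will build two versions: one that generates code from a prompt alone, and one that generates relational code. The simple "Generate Code" feature takes a single input, your prompt, and generates the code in the next cell.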
# Register a custom line magic command
@register_line_magic
def code(contents):
    # Get the IPython shell instance
    from IPython.core.getipython import get_ipython
    shell = get_ipython()
    # Generate code content using a text model
    response = text_model.generate_content(f'''
    write Python code that does the following and don't answer anything else
    {contents}
    ''').text
    # Remove Markdown fence lines (with or without a language tag)
    # without touching the word 'python' inside the generated code
    response = re.sub(r'^[ \t]*```\w*[ \t]*$', '', response, flags=re.MULTILINE)
    # Trim leading and trailing blank lines
    response = response.strip('\n')
    # Prepare payload for setting the next input
    payload = dict(
        source='set_next_input',
        text=response,
        replace=False,
    )
    # Write the payload to the IPython shell
    shell.payload_manager.write_payload(payload, single=False)
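In our code function, get_ipython returns the running IPython shell, which is what lets us place the generated code in a new cell next to the cell containing the prompt. Cleanup is necessary because the generated Python code comes with extra characters (Markdown code-fence markers) that need to be removed. The payload takes the Gemini model's response and creates a new cell containing it.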
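To see what the cleanup step has to deal with, here is a small standalone sketch. strip_code_fences is a hypothetical helper, and the sample response is made up for illustration; it removes only whole fence lines, so the word "python" inside the generated code is left alone.

```python
import re

def strip_code_fences(response):
    # Keep every line except Markdown fence lines: three backticks,
    # optionally followed by a language tag such as 'python'
    kept = [line for line in response.splitlines()
            if not re.fullmatch(r'\s*```\w*\s*', line)]
    return '\n'.join(kept).strip('\n')

# Made-up example of a model reply wrapped in a fenced block
raw = "```python\nimport pandas as pd\nprint('python rocks')\n```"
print(strip_code_fences(raw))
```

Filtering fence lines rather than replacing the substring 'python' everywhere avoids damaging code that legitimately contains that word.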
To use the "Generate Code" feature, pass %code [your_prompt], and it will generate the requested code in the next cell.
# Running Generate Code Feature
%code load my data.csv and take random sample of 100 rows
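Generate Relational Code Feature
The relational code feature matters because most of the time you are writing code on top of other code. Fortunately, it uses the same cell-reference mechanism as the chatn function. The "Relational Code" feature needs two things: your prompt and the code you want to relate it to.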
# Define a line magic named 'coden' that takes 'contents' as a parameter
@register_line_magic
def coden(contents):
    try:
        # Get the IPython shell instance
        from IPython.core.getipython import get_ipython
        shell = get_ipython()
        # Use regular expression to find all occurrences of '--in' followed by digits in 'contents'
        numbers = [int(match.group().replace('--in', '')) for match in re.finditer(r'--in\d+', contents)]
        # Remove the found pattern '--in\d+' from 'contents'
        contents_filter = re.sub(r'--in\d+', '', contents)
        # Check if there are any references (numbers) found
        if numbers:
            # Retrieve the source of every referenced cell using the IPython 'In' variable
            current_cell_contents = [In[number] for number in numbers]
            # Combine the contents into a single string with line breaks
            combined_content = '\n'.join(current_cell_contents)
            # Execute the text_model to generate code
            response = text_model.generate_content(f'''{combined_content}
            {contents_filter}
            please write Python code and don't answer anything else, don't provide output of the code
            ''').text
            # Remove Markdown fence lines (with or without a language tag)
            # without touching the word 'python' inside the generated code
            response = re.sub(r'^[ \t]*```\w*[ \t]*$', '', response, flags=re.MULTILINE)
            # Trim leading and trailing blank lines
            response = response.strip('\n')
            # Prepare payload for setting the next input
            payload = dict(
                source='set_next_input',
                text=response,
                replace=False,
            )
            # Write the payload to the IPython shell
            shell.payload_manager.write_payload(payload, single=False)
        else:
            # Print an error message if no references are found
            print('Please provide a correct codeblock reference.')
    except Exception as e:
        # Print an error message if an exception occurs
        print('Please provide a correct codeblock reference.')
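The payload and text-cleanup code are taken from the code function, while the rest comes from the chatn function. To use the "Relational Code" feature, pass %coden [cell_references] [your_prompt], and it will create the requested code in the next cell. You can pass as many cell references as you need.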
# Running Relational Code Feature
%coden --in83 --in76 multiply y with each x item
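Chat with Graph Feature
This feature is more involved, so let's build it step by step. First, you have to obtain, programmatically, the file name of the notebook you are coding in.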
# Import the IPython module
import IPython
# Import the os module for interacting with the operating system
import os
# Extract the local variables from the IPython environment
file_path = IPython.extract_module_locals()[1]['__vsc_ipynb_file__']
# Extract the base name (file name) from the file path
file_name = os.path.basename(file_path)
# Return the file name
print(file_name)
############### OUTPUT ###############
myfile.ipynb
############### OUTPUT ###############
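This works only in VS Code, not in Jupyter Lab or Anaconda notebooks. If you are not using VS Code, you can skip this step, because the final code lets you pass the file name manually in the prompt. Next, we need to load the notebook as JSON.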
# Import the json module for working with JSON data
import json
import base64
from IPython.display import Image
# Open the notebook file in read mode
with open(file_name, "r") as f:
    # Load the content of the notebook file as JSON
    notebook_json = json.load(f)
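With the notebook file loaded, we can loop through its data and pick out the output of the specific cell that contains the graph. Suppose our graph is in cell number 65.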
# Import the base64 module for encoding and decoding base64 data
import base64
# Import the Image class from the IPython.display module for displaying images in an IPython environment
from IPython.display import Image
####### Cell Number #######
cell_number = 65
# Find the cell in the notebook JSON with execution count equal to 65
element = next(cell for cell in notebook_json['cells'] if 'execution_count' in cell and cell['execution_count'] == cell_number)
# Extract the base64-encoded PNG image data from the cell's first output
image_data = element['outputs'][0]['data']['image/png']
# Decode the base64-encoded image data into raw bytes
image_base64 = base64.b64decode(image_data)
# Save the decoded bytes in the local directory; the data is PNG, so use a .png extension
with open('img_code.png', 'wb') as f:
    f.write(image_base64)
# Load the saved image using the Image class from IPython.display
image = Image(filename='img_code.png')
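The lookup above assumes the graph is the first output of the cell. As a hedged alternative, the sketch below scans every output of the matching cell and returns None when nothing is found. find_png_output is a hypothetical helper, and the notebook dictionary is a made-up stand-in for a loaded .ipynb file, not real plot data.

```python
import base64

def find_png_output(notebook_json, cell_number):
    # Return the decoded PNG bytes of the first image output of the cell
    # whose execution_count equals cell_number, or None if there is none
    for cell in notebook_json.get('cells', []):
        if cell.get('execution_count') != cell_number:
            continue
        for output in cell.get('outputs', []):
            png = output.get('data', {}).get('image/png')
            if png:
                return base64.b64decode(png)
    return None

# Tiny stand-in for a loaded notebook (made-up data, not a real plot)
fake_png = base64.b64encode(b'\x89PNG fake bytes').decode('ascii')
notebook = {'cells': [
    {'cell_type': 'markdown', 'source': ['# notes']},
    {'cell_type': 'code', 'execution_count': 65,
     'outputs': [{'output_type': 'stream', 'text': ['hello\n']},
                 {'output_type': 'display_data', 'data': {'image/png': fake_png}}]},
]}
print(find_png_output(notebook, 65) is not None)  # True
print(find_png_output(notebook, 3))               # None
```

Returning None instead of raising lets the calling magic print a friendly message when the referenced cell has no graph.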
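The Gemini image model here only accepts images stored locally, so you must save the extracted graph and load it with the Image class. The graph feature takes two inputs: your prompt and a reference to the cell containing the graph.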
# Try to get the current notebook filename using IPython
try:
    file_name = IPython.extract_module_locals()[1]['__vsc_ipynb_file__']
    # Extract the base name (file name) from the file path
    file_name = os.path.basename(file_name)
except Exception:
    # If the lookup fails (for example outside VS Code), fall back to no file name
    file_name = None
# Register a custom magic command for the Jupyter notebook
@register_line_magic
def graph(contents):
    # Search for the pattern --in<number>
    pattern = re.compile(r'--in\d+')
    # Find the first occurrence of the pattern in the contents
    match = pattern.search(contents)
    # Remove the pattern from the contents
    contents_filter = pattern.sub('', contents)
    # Define a new pattern for --filename=<word>, with an optional .ipynb suffix
    pattern_f = re.compile(r'--filename=(\w+)(?:\.ipynb)?')
    # Find the first occurrence of the new pattern in the contents
    match_f = pattern_f.search(contents)
    # Remove the new pattern from the filtered contents
    contents_filter = pattern_f.sub('', contents_filter)
    # If the --in<number> pattern is found
    if match:
        # Get the global variable file_name
        global file_name
        # Check if file_name was detected automatically (VS Code)
        if file_name:
            notebookName = file_name
            with open(notebookName, "r") as f:
                # Load the notebook JSON data
                notebook_json = json.load(f)
        elif match_f:
            # Extract the filename from the --filename=<word> pattern
            notebookName = match_f.group(1) + '.ipynb'
            with open(notebookName, "r") as f:
                # Load the notebook JSON data
                notebook_json = json.load(f)
        else:
            # If neither file_name nor --filename=<word> is provided, print an error message
            print('Please provide a correct file name using --filename=<filename>, e.g., --filename=mycode')
            return
        # Extract the number from the --in<number> pattern
        number = int(match.group().replace('--in', ''))
        # Find the cell with the specified execution_count in the notebook JSON data
        element = next(cell for cell in notebook_json['cells'] if 'execution_count' in cell and cell['execution_count'] == number)
        # Extract image data from the cell's first output
        image_data = element['outputs'][0]['data']['image/png']
        # Decode base64 image data
        image_base64 = base64.b64decode(image_data)
        # Save the image in the local directory as img_code.png (the data is PNG)
        with open('img_code.png', 'wb') as f:
            f.write(image_base64)
        # Load the image using the Image class
        image = Image(filename='img_code.png')
        # Extract information using the image model
        response = image_model.generate_content([contents_filter, image])
        print(response.text)
    else:
        # If --in<number> pattern is not found, print an error message
        print('Please provide a correct code block reference.')
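Chatting with a graph requires the image_model. The file name is extracted with a text pattern in the same way as the --in cell references. To use the "Chat with Graph" feature, pass %graph [single_cell_reference] [your_prompt] [filename], and it will print the response.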
# Running Chat with Graph Feature
%graph --in143 how many outliers are there
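The argument handling of the graph magic can be sketched separately from the model call. parse_graph_args is a hypothetical helper, and it assumes notebook names made of letters, digits, underscores, or hyphens, with or without the .ipynb suffix.

```python
import re

def parse_graph_args(line):
    # Split a %graph line into (cell_number, notebook_name, prompt)
    cell = re.search(r'--in(\d+)', line)
    name = re.search(r'--filename=([\w\-]+)(?:\.ipynb)?', line)
    # Remove both options, then collapse the leftover whitespace
    prompt = re.sub(r'--in\d+|--filename=\S+', '', line)
    return (int(cell.group(1)) if cell else None,
            name.group(1) + '.ipynb' if name else None,
            ' '.join(prompt.split()))

print(parse_graph_args('--in143 --filename=mycode how many outliers are there'))
# (143, 'mycode.ipynb', 'how many outliers are there')
```

Accepting the name with or without the suffix means the user does not have to remember which form the magic expects.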
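Chat with Files Feature
Small projects often depend on several Python files. This feature is useful when you want to chat with your .py files from the notebook instead of inspecting their code one by one. The "Chat with Files" feature needs two things: your prompt and the name of the folder containing the .py files.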
# Register a custom magic command for IPython
@register_line_magic
def chatf(contents):
    try:
        # Parse the folder name from the provided argument
        folder_match = re.search(r'--folder_name=(\S+)', contents)
        if not folder_match:
            # Print an error message if folder name is not provided in the correct format
            print("Please provide a valid folder name using the format '--folder_name=<folder_name>'.")
            return
        # Extract the folder name from the regex match
        folder_name = folder_match.group(1)
        # Get a list of Python files in the specified folder
        python_files = [file for file in os.listdir(folder_name) if file.endswith('.py')]
        # Check if any Python files were found
        if not python_files:
            print(f"No Python files found in the folder '{folder_name}'.")
            return
        # Initialize an empty string to store combined content
        combined_content = ""
        # Iterate through each Python file in the folder
        for file_name in python_files:
            with open(os.path.join(folder_name, file_name), 'r') as file:
                # Read the content of the file
                file_content = file.read()
            # Format the combined content with file name and its code
            combined_content += f"\nfile: {file_name}\n{file_content}\n{'_'*15}\n"
        # Remove the pattern of folder from the input contents
        contents_filter = re.sub(r'--folder_name=\S+', '', contents)
        # Generate content using a model and display the response
        response = text_model.generate_content(f'''
        {combined_content}
        Answer the question in a short readable paragraph, don't provide the answer in any format or code
        {contents_filter}
        ''').text
        print(response)
    except Exception as e:
        # Print an error message if an exception occurs
        print(f'An error occurred: {str(e)}')
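The chatf function takes a folder reference, much like the way we supply cell references. It then combines all the file names with their contents, and the rest of the code is the same as in the chatn function. To use the "Chat with Files" feature, pass %chatf [single_folder_reference] [your_prompt], and it will print the response.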
# Running chat with files Feature
%chatf --folder_name=myfolder How to clean and format data
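The file-gathering step of chatf can be checked without calling the model. The sketch below uses a temporary folder with made-up files; combine_python_files is a hypothetical helper, and it sorts the file names so the combined text is deterministic, which the original function does not do.

```python
import os
import tempfile

def combine_python_files(folder_name):
    # Join the name and source of every .py file in the folder into one string
    parts = []
    for file_name in sorted(os.listdir(folder_name)):
        if not file_name.endswith('.py'):
            continue
        with open(os.path.join(folder_name, file_name), 'r', encoding='utf-8') as f:
            parts.append(f"\nfile: {file_name}\n{f.read()}\n{'_' * 15}\n")
    return ''.join(parts)

# Build a throwaway folder with two Python files and one file to ignore
with tempfile.TemporaryDirectory() as folder:
    samples = [('clean.py', 'def clean(df): ...'),
               ('notes.txt', 'ignore me'),
               ('load.py', 'import pandas')]
    for name, body in samples:
        with open(os.path.join(folder, name), 'w', encoding='utf-8') as f:
            f.write(body)
    combined = combine_python_files(folder)

print(combined)
```

Printing the combined text first is a quick way to see exactly what context the model would receive before spending an API call.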
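Compiling the Features
You don't want to retype every feature function for each new project, which would be time-consuming. Instead, you can put all the features in a single .py file. I named mine my_copilot.py; after that, you can simply import the module and use any of its features.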
# Importing all features of your copilot
from my_copilot import *
# using generate code feature
%code load my data.csv file using pandas