Model:
philschmid/t5-11b-sharded
This is a modified version of t5-11b that uses a custom handler.py, as an example of how to use t5-11b with inference-endpoints on a single NVIDIA T4.
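To illustrate what a custom handler.py generally looks like, here is a minimal sketch of the handler interface: a class that is constructed once and then called with each request body. It is not the handler shipped with this repository. The `generate_fn` argument is a hypothetical hook added so the sketch runs without loading the model, and the `[{"generated_text": ...}]` response shape is an assumption.

```python
from typing import Any, Callable, Dict, List


class EndpointHandler:
    """Sketch of a custom handler: built once at startup, called once per request."""

    def __init__(self, path: str = "", generate_fn: Callable[..., List[str]] = None):
        # `path` would be the directory holding the model weights.
        # `generate_fn` is a hypothetical stand-in for the real model: a real
        # handler would load the tokenizer and model from `path` here instead.
        if generate_fn is None:
            raise ValueError("this sketch does not load t5-11b; pass a generate_fn")
        self.path = path
        self.generate_fn = generate_fn

    def __call__(self, data: Dict[str, Any]) -> List[Dict[str, str]]:
        # The request body carries the text under "inputs" and optional
        # generation settings (for example max_length) under "parameters".
        text = data["inputs"]
        parameters = data.get("parameters") or {}
        outputs = self.generate_fn(text, **parameters)
        # Assumed response shape: one dict per generated sequence.
        return [{"generated_text": out} for out in outputs]


def fake_generate(text: str, max_length: int = 20) -> List[str]:
    """Hypothetical stub that truncates the input instead of running a model."""
    return [text[:max_length]]


handler = EndpointHandler(path="", generate_fn=fake_generate)
result = handler({"inputs": "translate English to German: Hello", "parameters": {"max_length": 9}})
print(result)
```

Keeping the model load in `__init__` and the per-request work in `__call__` means the expensive setup happens only once, when the endpoint starts.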
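Hugging Face Inference Endpoints can be used with an HTTP client in any language. We will use Python and the requests library to send our requests (make sure you have it installed: pip install requests).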
import json
import requests as r

ENDPOINT_URL = ""  # url of your endpoint
HF_TOKEN = ""  # your access token

# payload samples
regular_payload = {"inputs": "translate English to German: The weather is nice today."}
parameter_payload = {
    "inputs": "translate English to German: Hello my name is Philipp and I am a Technical Leader at Hugging Face",
    "parameters": {
        "max_length": 40,
    },
}

# HTTP headers for authorization
headers = {
    "Authorization": f"Bearer {HF_TOKEN}",
    "Content-Type": "application/json",
}

# send request
response = r.post(ENDPOINT_URL, headers=headers, json=parameter_payload)
generated_text = response.json()

print(generated_text)
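Since any HTTP client works, the same request can also be sketched with only the Python standard library. The helper names `build_payload`, `build_request`, and `query`, and the 120-second timeout, are illustrative choices rather than part of the original example; the payload and header layout follow the snippet above.

```python
import json
import urllib.request

ENDPOINT_URL = ""  # url of your endpoint (placeholder, as above)
HF_TOKEN = ""      # your access token (placeholder, as above)


def build_payload(text, **parameters):
    """Return the JSON body: "inputs" plus an optional "parameters" dict."""
    payload = {"inputs": text}
    if parameters:
        payload["parameters"] = parameters
    return payload


def build_request(url, token, payload):
    """Create a POST request that carries the payload as JSON."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def query(url, token, payload, timeout=120):
    """Send the request and decode the JSON response.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses, so a failed
    call does not silently return an error body.
    """
    request = build_request(url, token, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__" and ENDPOINT_URL:
    # Only runs once ENDPOINT_URL and HF_TOKEN have been filled in.
    payload = build_payload(
        "translate English to German: The weather is nice today.",
        max_length=40,
    )
    print(query(ENDPOINT_URL, HF_TOKEN, payload))
```

Separating payload construction from the network call makes it possible to inspect the request before sending it, which helps when an endpoint rejects a request.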