OpenAI API compatible FastChat T5 3B
Image Details
- Image name: modelzai/llm-fastchat-t5-3b
- Repository: tensorchord/modelz-llm
- Dockerfile: Dockerfile
Deployment
mdz
You can deploy the image with the following command:
```
$ mdz deploy --image modelzai/llm-fastchat-t5-3b:23.07.4 --name llm
Inference llm is created

$ mdz list
 NAME  ENDPOINT                                                  STATUS  INVOCATIONS  REPLICAS
 llm   http://llm-qh2n0y28ybqc36oc.146.235.213.84.modelz.live    Ready           174  1/1
       http://146.235.213.84.modelz.live/inference/llm.default
```
You can access the deployment by visiting the endpoint URL. An endpoint is generated automatically for each deployment in the format `<name>-<random-string>.<ip>.modelz.live`; in this case it is http://llm-qh2n0y28ybqc36oc.146.235.213.84.modelz.live. The endpoint can also be reached from the outside world if you've provided the public IP address of your server to the `mdz server start` command.
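Before wiring up a client, you can probe the endpoint with a quick sanity check. The sketch below assumes the server exposes the standard OpenAI `/v1/models` route, which OpenAI API compatible servers typically do; the endpoint URL is the one generated above.

```python
import requests

# Endpoint generated by `mdz deploy` above.
endpoint = "http://llm-qh2n0y28ybqc36oc.146.235.213.84.modelz.live"

# Assumption: an OpenAI API compatible server usually serves /v1/models.
resp = requests.get(f"{endpoint}/v1/models")
resp.raise_for_status()
print(resp.json())
```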
How to use
You can use the OpenAI Python package to access the endpoint:
```python
import openai

# Point the client at the mdz endpoint; the API key can be any string.
openai.api_base = "http://llm-qh2n0y28ybqc36oc.146.235.213.84.modelz.live"
openai.api_key = "any"

# Create a chat completion against the deployed model.
chat_completion = openai.ChatCompletion.create(
    model="any",
    messages=[
        {"role": "user", "content": "Who are you?"},
        {"role": "assistant", "content": "I am a student"},
        {"role": "user", "content": "What do you learn?"},
    ],
    max_tokens=100,
)

print(chat_completion.choices[0].message.content)
```
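If the server supports streaming responses (an assumption here, though OpenAI API compatible servers commonly do), you can pass `stream=True` to receive tokens as they are generated instead of waiting for the full reply:

```python
import openai

openai.api_base = "http://llm-qh2n0y28ybqc36oc.146.235.213.84.modelz.live"
openai.api_key = "any"

# stream=True yields incremental chunks instead of a single final object.
for chunk in openai.ChatCompletion.create(
    model="any",
    messages=[{"role": "user", "content": "Who are you?"}],
    max_tokens=100,
    stream=True,
):
    # Each chunk carries a delta; "content" may be absent in the first
    # and last chunks, so fall back to an empty string.
    delta = chunk.choices[0].delta
    print(delta.get("content", ""), end="", flush=True)
```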