OpenAI API compatible ChatGLM 6B

Image Details

Image name: modelzai/llm-chatglm-6b
Repository: tensorchord/modelz-llm
Dockerfile: Dockerfile

Deployment

`mdz`

You can deploy the image with the following command:

$ mdz deploy --image modelzai/llm-chatglm-6b:23.07.4 --name llm
Inference llm is created
$ mdz list
 NAME  ENDPOINT                                                 STATUS  INVOCATIONS  REPLICAS 
 llm   http://llm-qh2n0y28ybqc36oc.146.235.213.84.modelz.live   Ready           174  1/1      
       http://146.235.213.84.modelz.live/inference/llm.default

You can access the deployment at its endpoint URL. An endpoint is generated automatically for each deployment in the format <name>-<random-string>.<ip>.modelz.live.

In this case it is http://llm-qh2n0y28ybqc36oc.146.235.213.84.modelz.live. The endpoint is also reachable from outside the server if you passed the server's public IP address to the mdz server start command.
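If a script needs to recover the deployment name or server IP from a generated endpoint, the format above can be matched with a regular expression. The following is a minimal sketch, not part of mdz: the helper name `parse_endpoint` and the pattern are hypothetical, and the pattern assumes that the random string contains no hyphens or dots and that `<ip>` is an IPv4 address.

```python
import re
from urllib.parse import urlparse

# Assumed pattern for <name>-<random-string>.<ip>.modelz.live.
# Assumptions: the random string has no hyphens or dots, and <ip> is IPv4.
ENDPOINT_HOST = re.compile(
    r"^(?P<name>.+)-(?P<suffix>[^-.]+)"
    r"\.(?P<ip>\d{1,3}(?:\.\d{1,3}){3})\.modelz\.live$"
)


def parse_endpoint(url):
    """Split a generated endpoint URL into name, random suffix, and server IP."""
    host = urlparse(url).hostname or ""
    match = ENDPOINT_HOST.match(host)
    if match is None:
        raise ValueError(f"not a generated endpoint: {url!r}")
    return match.groupdict()


parts = parse_endpoint("http://llm-qh2n0y28ybqc36oc.146.235.213.84.modelz.live")
print(parts)
# {'name': 'llm', 'suffix': 'qh2n0y28ybqc36oc', 'ip': '146.235.213.84'}
```

The second URL shown by `mdz list` (the `/inference/llm.default` form) does not follow this format, so the helper raises `ValueError` for it.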

How to use

You can use the OpenAI Python package to access the endpoint:

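import openai  # written for the pre-1.0 openai package (openai.ChatCompletion)

# Point the client at the deployment instead of the default OpenAI API.
openai.api_base = "http://llm-qh2n0y28ybqc36oc.146.235.213.84.modelz.live"
openai.api_key = "any"  # placeholder; the client requires some value to be set

# Create a chat completion. Earlier turns are passed as conversation history.
chat_completion = openai.ChatCompletion.create(
    model="any",
    messages=[
        {"role": "user", "content": "Who are you?"},
        {"role": "assistant", "content": "I am a student"},
        {"role": "user", "content": "What do you learn?"},
    ],
    max_tokens=100,
)

# Print the assistant's reply.
print(chat_completion.choices[0].message.content)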
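The same request can be made without the OpenAI package. The sketch below uses only the Python standard library and assumes the deployment serves the route that the pre-1.0 `openai` client calls, namely the API base followed by `/chat/completions`, and that it returns an OpenAI-style response body. The helper names `build_chat_request` and `send_chat_request` are hypothetical and exist only for this illustration.

```python
import json
import urllib.request

# Endpoint from the `mdz list` output above; replace it with your own.
API_BASE = "http://llm-qh2n0y28ybqc36oc.146.235.213.84.modelz.live"


def build_chat_request(api_base, messages, max_tokens=100):
    """Build (but do not send) a POST request for a chat completion."""
    payload = {"model": "any", "messages": messages, "max_tokens": max_tokens}
    return urllib.request.Request(
        # Assumed route: the path the pre-1.0 openai client appends to api_base.
        url=api_base.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer any",  # placeholder key, as in the example
        },
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Usage (requires a running deployment):
#   request = build_chat_request(API_BASE, [{"role": "user", "content": "Who are you?"}])
#   reply = send_chat_request(request)
#   print(reply["choices"][0]["message"]["content"])  # assumes an OpenAI-style body
```

Keeping request construction separate from sending makes it possible to inspect the URL and payload before anything goes over the network.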
