Install the Command-Line Tool
All of the following operations are performed with the dbgpt command. To use the dbgpt command, first install the DB-GPT project:
$ pip install -e ".[default]"
You can also run it in script mode:
$ python dbgpt/cli/cli_scripts.py
Start the Model Controller
$ dbgpt start controller
By default, the Model Controller listens on port 8000.
Start Model Workers
:::color2 Start a chatglm2-6b model worker
:::
dbgpt start worker --model_name chatglm2-6b \
--model_path /app/models/chatglm2-6b \
--port 8001 \
--controller_addr http://127.0.0.1:8000
:::color2 Start a vicuna-13b-v1.5 model worker
:::
dbgpt start worker --model_name vicuna-13b-v1.5 \
--model_path /app/models/vicuna-13b-v1.5 \
--port 8002 \
--controller_addr http://127.0.0.1:8000
:::danger ⚠️ Note: be sure to use your own model name and model path.
:::
Start the Embedding Model Service
dbgpt start worker --model_name text2vec \
--model_path /app/models/text2vec-large-chinese \
--worker_type text2vec \
--port 8003 \
--controller_addr http://127.0.0.1:8000
:::danger ⚠️ Note: be sure to use your own model name and model path.
:::
:::success View and check the deployed models
:::
$ dbgpt model list
+-----------------+------------+------------+------+---------+---------+-----------------+----------------------------+
| Model Name      | Model Type | Host       | Port | Healthy | Enabled | Prompt Template | Last Heartbeat             |
+-----------------+------------+------------+------+---------+---------+-----------------+----------------------------+
| chatglm2-6b     | llm        | 172.17.0.2 | 8001 | True    | True    |                 | 2023-09-12T23:04:31.287654 |
| WorkerManager   | service    | 172.17.0.2 | 8001 | True    | True    |                 | 2023-09-12T23:04:31.286668 |
| WorkerManager   | service    | 172.17.0.2 | 8003 | True    | True    |                 | 2023-09-12T23:04:29.845617 |
| WorkerManager   | service    | 172.17.0.2 | 8002 | True    | True    |                 | 2023-09-12T23:04:24.598439 |
| text2vec        | text2vec   | 172.17.0.2 | 8003 | True    | True    |                 | 2023-09-12T23:04:29.844796 |
| vicuna-13b-v1.5 | llm        | 172.17.0.2 | 8002 | True    | True    |                 | 2023-09-12T23:04:24.597775 |
+-----------------+------------+------------+------+---------+---------+-----------------+----------------------------+
Use the Model Service
The model services deployed above can be consumed through dbgpt_server. First, edit the .env configuration file to point it at the model service address:
LLM_MODEL=vicuna-13b-v1.5
# The current default MODEL_SERVER address is the address of the Model Controller
MODEL_SERVER=http://127.0.0.1:8000
Start the Webserver
dbgpt start webserver --light
The --light flag means the webserver does not start an embedded model service.
Alternatively, you can specify the model directly on the command line:
LLM_MODEL=chatglm2-6b dbgpt start webserver --light
Usage Example
This is a complete command-line example of deploying and using an Embedding model in cluster mode.
- Step 1: start the Controller Server
dbgpt start controller
- Step 2: start the embedding model worker
# 2. Start the embedding model worker
dbgpt start worker --model_name text2vec \
--model_path /app/models/text2vec-large-chinese \
--worker_type text2vec \
--port 8003 \
--controller_addr http://127.0.0.1:8000
- Step 3: start the apiserver
dbgpt start apiserver --controller_addr http://127.0.0.1:8000 --api_keys EMPTY
- Step 4: test the service
curl http://127.0.0.1:8100/api/v1/embeddings \
-H "Authorization: Bearer EMPTY" \
-H "Content-Type: application/json" \
-d '{"model": "text2vec", "input": "Hello world!"}'
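The same test can be done from Python without any third-party dependencies. The sketch below builds the request with the standard library; the endpoint URL, model name, and `EMPTY` api key match the apiserver started in step 3, and the response shape follows the OpenAI-compatible embeddings format the apiserver exposes:

```python
# Python equivalent of the curl test above, standard library only.
# Assumes the apiserver from step 3 is running on port 8100 with --api_keys EMPTY.
import json
import urllib.request


def build_embedding_request(texts, model="text2vec", api_key="EMPTY",
                            api_url="http://127.0.0.1:8100/api/v1/embeddings"):
    """Build the HTTP POST request for the OpenAI-compatible embeddings endpoint."""
    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def embed(texts, **kwargs):
    """Send the request and return one embedding vector per input text."""
    with urllib.request.urlopen(build_embedding_request(texts, **kwargs)) as resp:
        payload = json.load(resp)
    # Response format: {"data": [{"embedding": [...], "index": 0}, ...]}
    return [item["embedding"] for item in payload["data"]]
```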
- Calling the service from code via the package
from dbgpt.rag.embedding import OpenAPIEmbeddings

openai_embeddings = OpenAPIEmbeddings(
    api_url="http://localhost:8100/api/v1/embeddings",
    api_key="EMPTY",
    model_name="text2vec",
)
texts = ["Hello, world!", "How are you?"]
openai_embeddings.embed_documents(texts)
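embed_documents returns one vector per input text; a common next step is comparing those vectors. A minimal pure-Python cosine-similarity sketch (the short vectors below are placeholders for illustration; real ones come from the embedding service):

```python
# Cosine similarity between two embedding vectors, pure Python.
# In practice the vectors come from embed_documents(); the short
# vectors below are placeholders for illustration.
import math


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (close to 1.0 = very similar)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


v1 = [0.1, 0.2, 0.3]
v2 = [0.3, -0.2, 0.1]
print(cosine_similarity(v1, v1))  # identical vectors: close to 1.0
print(cosine_similarity(v1, v2))  # different vectors: smaller value
```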
Command-Line Usage
For more command-line usage, consult the built-in help. The following are reference examples.
:::color1 View dbgpt help: dbgpt --help
:::
dbgpt --help
Already connect 'dbgpt'
Usage: dbgpt [OPTIONS] COMMAND [ARGS]...

Options:
  --log-level TEXT  Log level
  --version         Show the version and exit.
  --help            Show this message and exit.

Commands:
  install    Install dependencies, plugins, etc.
  knowledge  Knowledge command line tool
  model      Clients that manage model serving
  start      Start specific server.
  stop       Stop specific server.
  trace      Analyze and visualize trace spans.
:::color1 View the dbgpt start command: dbgpt start --help
:::
dbgpt start --help
Already connect 'dbgpt'
Usage: dbgpt start [OPTIONS] COMMAND [ARGS]...

  Start specific server.

Options:
  --help  Show this message and exit.

Commands:
  apiserver   Start apiserver
  controller  Start model controller
  webserver   Start webserver(dbgpt_server.py)
  worker      Start model worker
:::color1 View help for starting a model worker: dbgpt start worker --help
:::
dbgpt start worker --help
Already connect 'dbgpt'
Usage: dbgpt start worker [OPTIONS]

  Start model worker

Options:
  --model_name TEXT               Model name  [required]
  --model_path TEXT               Model path  [required]
  --worker_type TEXT              Worker type
  --worker_class TEXT             Model worker class,
                                  pilot.model.cluster.DefaultModelWorker
  --model_type TEXT               Model type: huggingface, llama.cpp, proxy
                                  and vllm  [default: huggingface]
  --host TEXT                     Model worker deploy host  [default: 0.0.0.0]
  --port INTEGER                  Model worker deploy port  [default: 8001]
  --daemon                        Run Model Worker in background
  --limit_model_concurrency INTEGER
                                  Model concurrency limit  [default: 5]
  --standalone                    Standalone mode. If True, embedded Run
                                  ModelController
  --register                      Register current worker to model controller
                                  [default: True]
  --worker_register_host TEXT     The ip address of current worker to register
                                  to ModelController. If None, the address is
                                  automatically determined
  --controller_addr TEXT          The Model controller address to register
  --send_heartbeat                Send heartbeat to model controller
                                  [default: True]
  --heartbeat_interval INTEGER    The interval for sending heartbeats
                                  (seconds)  [default: 20]
  --log_level TEXT                Logging level
  --log_file TEXT                 The filename to store log  [default:
                                  dbgpt_model_worker_manager.log]
  --tracer_file TEXT              The filename to store tracer span records
                                  [default:
                                  dbgpt_model_worker_manager_tracer.jsonl]
  --tracer_storage_cls TEXT       The storage class to storage tracer span
                                  records
  --device TEXT                   Device to run model. If None, the device is
                                  automatically determined
  --prompt_template TEXT          Prompt template. If None, the prompt
                                  template is automatically determined from
                                  model path, supported template: zero_shot,
                                  vicuna_v1.1, llama-2, codellama, alpaca,
                                  baichuan-chat, internlm-chat
  --max_context_size INTEGER      Maximum context size  [default: 4096]
  --num_gpus INTEGER              The number of gpus you expect to use, if it
                                  is empty, use all of them as much as
                                  possible
  --max_gpu_memory TEXT           The maximum memory limit of each GPU, only
                                  valid in multi-GPU configuration
  --cpu_offloading                CPU offloading
  --load_8bit                     8-bit quantization
  --load_4bit                     4-bit quantization
  --quant_type TEXT               Quantization datatypes, `fp4` (four bit
                                  float) and `nf4` (normal four bit float),
                                  only valid when load_4bit=True  [default:
                                  nf4]
  --use_double_quant              Nested quantization, only valid when
                                  load_4bit=True  [default: True]
  --compute_dtype TEXT            Model compute type
  --trust_remote_code             Trust remote code  [default: True]
  --verbose                       Show verbose output.
  --help                          Show this message and exit.
:::color1 View model serving commands: dbgpt model --help
:::
dbgpt model --help
Already connect 'dbgpt'
Usage: dbgpt model [OPTIONS] COMMAND [ARGS]...

  Clients that manage model serving

Options:
  --address TEXT  Address of the Model Controller to connect to. Just support
                  light deploy model, If the environment variable
                  CONTROLLER_ADDRESS is configured, read from the environment
                  variable
  --help          Show this message and exit.

Commands:
  chat     Interact with your bot from the command line
  list     List model instances
  restart  Restart model instances
  start    Start model instances
  stop     Stop model instances
