The model is launched using the [default parameters](vec_inf/models/models.csv); you can override these values by providing additional parameters (use `--help` to see the full list). You can also launch your own customized model as long as the model architecture is [supported by vLLM](https://docs.vllm.ai/en/stable/models/supported_models.html). Make sure to follow the instructions below:
* Your model weights directory naming convention should follow `$MODEL_FAMILY-$MODEL_VARIANT`.
* Your model weights directory should contain the model weights in Hugging Face (HF) format.
* The following launch parameters fall back to their default values if not specified: `--max-num-seqs`, `--partition`, `--data-type`, `--venv`, `--log-dir`, `--model-weights-parent-dir`, `--pipeline-parallelism`, `--enforce-eager`. All other launch parameters must be specified for custom models.
* Example for setting the model weights parent directory: `--model-weights-parent-dir /h/user_name/my_weights`.
* For other model launch parameters, you can reference the default values for similar models using the [`list` command](#list-command).
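Putting the points above together, a custom-model launch might look like the following sketch. The family/variant name, weights path, and flag values here are hypothetical, and we assume the model is identified by its `$MODEL_FAMILY-$MODEL_VARIANT` name (matching the weights directory convention above); any remaining required launch parameters for your model must be supplied as well:

```bash
# Hypothetical custom model: family "my-model", variant "3b", so the
# HF-format weights live in /h/user_name/my_weights/my-model-3b.
vec-inf launch my-model-3b \
    --model-weights-parent-dir /h/user_name/my_weights \
    --max-num-seqs 256
```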
### `status` command
You can check the inference server status by providing the Slurm job ID to the `status` command:
```bash
vec-inf status 13014393
```

There are 5 possible states:
Note that the base URL is only available when the model is in the `READY` state. If you've changed the Slurm log directory path, you also need to specify it when using the `status` command.
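For example, if the server was launched with a custom log directory, pass the same path when checking status. The path below is hypothetical, and we assume the `status` command accepts the same `--log-dir` option as `launch`:

```bash
# Check status of the same Slurm job, pointing at the custom log directory.
vec-inf status 13014393 --log-dir /h/user_name/my_logs
```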
### `metrics` command
Once your server is ready, you can check performance metrics by providing the Slurm job ID to the `metrics` command:
```bash
vec-inf metrics 13014393
```
You will see the performance metrics streamed to your console; note that the metrics are updated at a 10-second interval.
# `vec-inf` Commands
* `launch`: Specify a model family and other optional parameters to launch an OpenAI-compatible inference server (`--json-mode` supported). Check [here](./models/README.md) for the complete list of available options.
* `list`: List all available model names, or append a supported model name to view its default configuration (`--json-mode` supported).
* `metrics`: Stream performance metrics to the console.
* `status`: Check the model status by providing its Slurm job ID (`--json-mode` supported).
* `shutdown`: Shut down a model by providing its Slurm job ID.
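For the commands that support it, `--json-mode` switches the output to JSON, which makes it easy to consume from scripts. A minimal sketch, assuming Python 3 is available for pretty-printing:

```bash
# List all available models as JSON and pretty-print with the Python stdlib.
vec-inf list --json-mode | python3 -m json.tool
```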