
Commit f43d7bf

Merge pull request #10 from VectorInstitute/develop
Develop
2 parents 156dfa5 + b426e7e commit f43d7bf

17 files changed: +171 additions, −113 deletions


README.md

Lines changed: 9 additions & 4 deletions

@@ -15,10 +15,9 @@ vec-inf launch Meta-Llama-3.1-8B-Instruct
 ```
 You should see an output like the following:

-<img width="450" alt="launch_img" src="https://github.com/user-attachments/assets/557eb421-47db-4810-bccd-c49c526b1b43">
+<img width="400" alt="launch_img" src="https://github.com/user-attachments/assets/557eb421-47db-4810-bccd-c49c526b1b43">

-The model would be launched using the [default parameters](vec-inf/models/models.csv), you can override these values by providing additional options, use `--help` to see the full list.
-If you'd like to see the Slurm logs, they are located in the `.vec-inf-logs` folder in your home directory. The log folder path can be modified by using the `--log-dir` option.
+The model will be launched using the [default parameters](vec-inf/models/models.csv); you can override these values by providing additional options (use `--help` to see the full list). You can also launch your own customized model as long as the model architecture is [supported by vLLM](https://docs.vllm.ai/en/stable/models/supported_models.html), but you'll need to specify all model-launch-related options yourself for a successful run.

 You can check the inference server status by providing the Slurm job ID to the `status` command:
 ```bash
@@ -27,7 +26,7 @@ vec-inf status 13014393

 You should see an output like the following:

-<img width="450" alt="status_img" src="https://github.com/user-attachments/assets/7385b9ca-9159-4ca9-bae2-7e26d80d9747">
+<img width="400" alt="status_img" src="https://github.com/user-attachments/assets/7385b9ca-9159-4ca9-bae2-7e26d80d9747">

 There are 5 possible states:

@@ -52,6 +51,12 @@ vec-inf list
 ```
 <img width="1200" alt="list_img" src="https://github.com/user-attachments/assets/a4f0d896-989d-43bf-82a2-6a6e5d0d288f">

+You can also view the default setup for a specific supported model by providing the model name, for example `Meta-Llama-3.1-70B-Instruct`:
+```bash
+vec-inf list Meta-Llama-3.1-70B-Instruct
+```
+<img width="400" alt="list_model_img" src="https://github.com/user-attachments/assets/5dec7a33-ba6b-490d-af47-4cf7341d0b42">
+
 The `launch`, `list`, and `status` commands support `--json-mode`, which structures the command output as a JSON string.

 ## Send inference requests
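Per the `--json-mode` note above, `vec-inf list --json-mode` emits the available model names as a JSON array (the CLI builds it with pandas `to_json(orient='records')`), so it can be consumed from scripts. A minimal sketch, assuming only that output shape; the `get_available_models` helper is hypothetical, not part of vec-inf:

```python
import json
import subprocess

def get_available_models() -> list:
    """Hypothetical helper: run `vec-inf list --json-mode` and parse its stdout."""
    result = subprocess.run(
        ["vec-inf", "list", "--json-mode"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)

# The parsing step works the same on a captured sample of the assumed shape:
sample = '["Meta-Llama-3.1-8B-Instruct","Meta-Llama-3.1-70B-Instruct"]'
models = json.loads(sample)
print("Meta-Llama-3.1-70B-Instruct" in models)  # prints: True
```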

pyproject.toml

Lines changed: 1 addition & 1 deletion

@@ -1,6 +1,6 @@
 [tool.poetry]
 name = "vec-inf"
-version = "0.3.0"
+version = "0.3.1"
 description = "Efficient LLM inference on Slurm clusters using vLLM."
 authors = ["Marshall Wang <marshall.wang@vectorinstitute.ai>"]
 license = "MIT license"

vec_inf/cli/_cli.py

Lines changed: 32 additions & 9 deletions

@@ -1,7 +1,6 @@
 import os

 import click
-import pandas as pd
 from rich.console import Console
 from rich.columns import Columns
 from rich.panel import Panel
@@ -27,12 +26,12 @@ def cli():
 @click.option(
     "--model-family",
     type=str,
-    help='The model family name according to the directories in `models`'
+    help='The model family'
 )
 @click.option(
     "--model-variant",
     type=str,
-    help='The model variant according to the README in `models/model-family`'
+    help='The model variant'
 )
 @click.option(
     "--max-model-len",
@@ -57,12 +56,12 @@ def cli():
 @click.option(
     "--qos",
     type=str,
-    help='Quality of service, default to m3'
+    help='Quality of service, default depends on the suggested resource allocation for the model'
 )
 @click.option(
     "--time",
     type=str,
-    help='Time limit for job, this should comply with QoS, default to 4:00:00'
+    help='Time limit for job, this should comply with QoS, default to the max walltime of the chosen QoS'
 )
 @click.option(
     "--data-type",
@@ -77,7 +76,7 @@ def cli():
 @click.option(
     "--log-dir",
     type=str,
-    help='Path to slurm log directory'
+    help='Path to slurm log directory, default to .vec-inf-logs in the home directory'
 )
 @click.option(
     "--json-mode",
@@ -150,7 +149,7 @@ def launch(
 @click.option(
     "--log-dir",
     type=str,
-    help='Path to slurm log directory. This is required if it was set when launching the model'
+    help='Path to slurm log directory. This is required if --log-dir was set when launching the model'
 )
 @click.option(
     "--json-mode",
@@ -238,16 +237,40 @@ def shutdown(slurm_job_id: int) -> None:


 @cli.command("list")
+@click.argument(
+    "model-name",
+    required=False)
 @click.option(
     "--json-mode",
     is_flag=True,
     help='Output in JSON string',
 )
-def list(json_mode: bool = False) -> None:
+def list(model_name: str = None, json_mode: bool = False) -> None:
     """
-    List all available models
+    List all available models, or show the default setup of a specific model
     """
     models_df = load_models_df()
+
+    if model_name:
+        if model_name not in models_df['model_name'].values:
+            raise ValueError(f"Model name {model_name} not found in available models")
+
+        excluded_keys = {'venv', 'log_dir', 'pipeline_parallelism'}
+        model_row = models_df.loc[models_df['model_name'] == model_name]
+
+        if json_mode:
+            filtered_model_row = model_row.drop(columns=excluded_keys, errors='ignore')
+            click.echo(filtered_model_row.to_json(orient='records'))
+            return
+        table = create_table(key_title="Model Config", value_title="Value")
+        for _, row in model_row.iterrows():
+            for key, value in row.items():
+                if key not in excluded_keys:
+                    table.add_row(key, str(value))
+        CONSOLE.print(table)
+        return
+
     if json_mode:
         click.echo(models_df['model_name'].to_json(orient='records'))
         return
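The `list` command's new model-specific path boils down to one pandas pattern: select the matching row, then drop internal columns before serializing. A standalone sketch of that pattern; the sample rows below are invented for illustration, and only the column names mirror the diff:

```python
import pandas as pd

# Invented stand-in for the dataframe returned by load_models_df()
models_df = pd.DataFrame([
    {"model_name": "Meta-Llama-3.1-8B-Instruct", "num_gpus": 1,
     "venv": "singularity", "log_dir": "default", "pipeline_parallelism": "false"},
    {"model_name": "Meta-Llama-3.1-70B-Instruct", "num_gpus": 4,
     "venv": "singularity", "log_dir": "default", "pipeline_parallelism": "false"},
])

model_name = "Meta-Llama-3.1-70B-Instruct"
if model_name not in models_df["model_name"].values:
    raise ValueError(f"Model name {model_name} not found in available models")

# Internal columns that should not be shown to the user
excluded_keys = {"venv", "log_dir", "pipeline_parallelism"}

model_row = models_df.loc[models_df["model_name"] == model_name]
# errors="ignore" keeps the drop safe if an excluded column is ever absent
filtered = model_row.drop(columns=excluded_keys, errors="ignore")
print(filtered.to_json(orient="records"))
```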

vec_inf/launch_server.sh

Lines changed: 23 additions & 5 deletions

@@ -22,7 +22,7 @@ while [[ "$#" -gt 0 ]]; do
     shift
 done

-required_vars=(model_family model_variant partition qos walltime num_nodes num_gpus max_model_len vocab_size data_type virtual_env log_dir pipeline_parallelism)
+required_vars=(model_family model_variant partition qos walltime num_nodes num_gpus max_model_len vocab_size)

 for var in "${required_vars[@]}"; do
     if [ -z "${!var}" ]; then
@@ -40,10 +40,28 @@ export NUM_NODES=$num_nodes
 export NUM_GPUS=$num_gpus
 export VLLM_MAX_MODEL_LEN=$max_model_len
 export VLLM_MAX_LOGPROBS=$vocab_size
-export VLLM_DATA_TYPE=$data_type
-export VENV_BASE=$virtual_env
-export LOG_DIR=$log_dir
-export PIPELINE_PARALLELISM=$pipeline_parallelism
+# For custom models, the following are set to defaults if not specified
+export VLLM_DATA_TYPE="auto"
+export VENV_BASE="singularity"
+export LOG_DIR="default"
+# Pipeline parallelism is disabled by default and can only be enabled through models.csv, as it is an experimental feature
+export PIPELINE_PARALLELISM="false"
+
+if [ -n "$data_type" ]; then
+    export VLLM_DATA_TYPE=$data_type
+fi
+
+if [ -n "$virtual_env" ]; then
+    export VENV_BASE=$virtual_env
+fi
+
+if [ -n "$log_dir" ]; then
+    export LOG_DIR=$log_dir
+fi
+
+if [ -n "$pipeline_parallelism" ]; then
+    export PIPELINE_PARALLELISM=$pipeline_parallelism
+fi

 # ================================= Set default environment variables ======================================
 # Slurm job configuration
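The script change above is a default-with-override pattern: export a sensible default first, then overwrite it only when the corresponding parsed variable is non-empty. A standalone bash sketch of the same pattern (the value of `data_type` here is illustrative):

```shell
#!/usr/bin/env bash
# Default, used when the caller did not pass --data-type
export VLLM_DATA_TYPE="auto"

# Imagine this was populated by the option-parsing loop at the top of the script
data_type="bfloat16"

# [ -n "$var" ] is true only for a non-empty string, so an unset or
# empty flag variable leaves the default in place
if [ -n "$data_type" ]; then
    export VLLM_DATA_TYPE=$data_type
fi

echo "$VLLM_DATA_TYPE"
```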

Deleted files

- vec_inf/models/CodeLlama/README.md (12 lines)
- vec_inf/models/Llama-2/README.md (10 lines)
- vec_inf/models/Meta-Llama-3.1/README.md (8 lines)
- vec_inf/models/Meta-Llama-3/README.md (8 lines)
- vec_inf/models/Mistral/README.md (10 lines)
- vec_inf/models/Mixtral/README.md (8 lines)
- vec_inf/models/Phi-3/README.md (6 lines)
