Add per-LLM-node generation_config.json path override (#4233) #4330

Open
exzile wants to merge 1 commit into openvinotoolkit:main from exzile:feature/generation-config-path

Conversation

@exzile exzile commented Jun 26, 2026

Summary

Closes #4233.

Adds an optional generation_config_path field to LLMCalculatorOptions, so several deployments backed by the same model weights can use different generation defaults without duplicating the model directory. This mirrors how graph_path already lets one model directory back several deployments with different graphs.

Behavior

  • Unset → generation_config.json is loaded from models_path, exactly as before (no change).
  • Set → the path may be absolute or relative. A relative path is resolved against the model directory: models_path itself, or its parent directory when models_path points at a file such as a GGUF.
  • An explicit path that does not exist is a load error (fail fast rather than silently falling back).

A shared resolveGenerationConfigPath helper is used by both the continuous-batching (CB) and legacy initializers, so behavior is identical across pipeline types, including the VLM CB servable, which reuses the CB initializer.
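The resolution rules above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the function name resolveGenerationConfigPathSketch, its signature, and the use of an exception to signal the load error are all assumptions made for the example. The handling of the unset case when models_path points at a file is likewise a guess.

```cpp
#include <cassert>
#include <filesystem>
#include <stdexcept>
#include <string>

namespace fs = std::filesystem;

// Hypothetical stand-in for the PR's resolveGenerationConfigPath helper.
// The real signature and error reporting are not shown in the PR text.
fs::path resolveGenerationConfigPathSketch(const fs::path& modelsPath,
                                           const std::string& generationConfigPath) {
    // The model directory is models_path itself, or its parent directory
    // when models_path points at a regular file (for example a GGUF).
    const fs::path modelDir =
        fs::is_regular_file(modelsPath) ? modelsPath.parent_path() : modelsPath;

    // Unset: fall back to the default file name in the model directory.
    // No existence check here; assumption: the caller treats a missing
    // default file the same way it did before the override existed.
    if (generationConfigPath.empty()) {
        return modelDir / "generation_config.json";
    }

    // Set: absolute paths are used as given, relative ones are joined
    // onto the model directory.
    fs::path candidate(generationConfigPath);
    if (candidate.is_relative()) {
        candidate = modelDir / candidate;
    }

    // An explicit path that does not exist fails fast instead of
    // silently falling back to the default.
    if (!fs::exists(candidate)) {
        throw std::runtime_error("generation_config_path does not exist: " +
                                 candidate.string());
    }
    return candidate;
}
```

Under these assumptions, calling the sketch with an empty override yields the default generation_config.json location, a relative override such as a file in a subdirectory of the model directory yields that file's joined path, and an override naming a missing file throws rather than falling back.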

Placement rationale

Per the discussion in #4233 (and the config taxonomy from #4221): graph_path already lets one model directory back several deployments with different node options, so putting the override in the node options keeps the whole per-deployment configuration in graph.pbtxt.

Testing

Added LLMGenerationConfigPath.ResolveGenerationConfigPath covering default, absolute, relative, and missing-path cases. Built and ran locally on Windows (MSVC).

🤖 Generated with Claude Code

Adds an optional generation_config_path field to LLMCalculatorOptions so
several deployments backed by the same model weights can use different
generation defaults without duplicating the model directory. Mirrors how
graph_path already lets one model directory back several deployments.

When unset, generation_config.json from models_path is used as before.
An explicit path may be absolute or relative to models_path; a relative
path is resolved against the model directory (its parent when models_path
points at a file, e.g. a GGUF). An explicit path that does not exist is a
load error. Shared resolveGenerationConfigPath helper is used by both the
continuous batching and legacy initializers.

Adds a unit test covering default, absolute, relative, and missing-path cases.

Implements openvinotoolkit#4233

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@exzile exzile force-pushed the feature/generation-config-path branch from 79ec0ce to 44a52e6 Compare June 27, 2026 01:03

Development

Successfully merging this pull request may close these issues.

generation_config.json path override per LLM node
