Skip to content

fix(resolveGgmlTypeOption): accept case-insensitive ggml type names#607

Open
andreinknv wants to merge 1 commit into
withcatai:masterfrom
andreinknv:fix/ggml-type-option-case-insensitive
Open

fix(resolveGgmlTypeOption): accept case-insensitive ggml type names#607
andreinknv wants to merge 1 commit into
withcatai:masterfrom
andreinknv:fix/ggml-type-option-case-insensitive

Conversation

@andreinknv
Copy link
Copy Markdown

@andreinknv andreinknv commented May 14, 2026

Summary

resolveGgmlTypeOption("q8_0") currently returns undefined. The lookup uses Object.hasOwn(GgmlType, option) against the uppercase-by-convention enum keys (Q8_0, F16, BF16, …), so any lowercase/mixed-case input falls through unrecognized. The caller then quietly defaults to F16 with no warning — a silent, hard-to-debug "why is my Q8 KV cache behaving like FP16?" failure.

This PR uppercases the string input before the lookup. Numeric GgmlType values still resolve unchanged. Invalid strings still return undefined.

Why

Two callsites that consume this string:

  1. LlamaContextOptions.experimentalKvCacheKeyType / experimentalKvCacheValueType — operators set these from config files, where lowercase quant identifiers (q8_0, q4_0, bf16) are common because GGUF filenames carry them in that case (e.g. qwen2.5-coder-7b-instruct-q4_k_m.gguf).

  2. CLI flags --kv-cache-key-type / --kv-cache-value-type — same expectation; users mirror the lowercase GGUF naming.

Reproduction (matches what surfaced while benchmarking a quantized-KV configuration): pass "q8_0" to experimentalKvCacheKeyType, run a context, observe VRAM/perf numbers that match the F16 baseline instead of Q8. No error is raised — the silent fallback makes it look like the override is working when it isn't.

After

resolveGgmlTypeOption("q8_0")  // => GgmlType.Q8_0  (was: undefined)
resolveGgmlTypeOption("Q8_0")  // => GgmlType.Q8_0  (unchanged)
resolveGgmlTypeOption("bf16")  // => GgmlType.BF16  (was: undefined)
resolveGgmlTypeOption("nope")  // => undefined      (unchanged)
resolveGgmlTypeOption(undefined)  // => undefined  (unchanged)
resolveGgmlTypeOption(13 /* GgmlType.Q8_0 */)  // => GgmlType.Q8_0 (unchanged)

Compatibility

  • Strings that already worked still work (uppercased-first is idempotent for already-upper input).
  • Strings that previously returned undefined now resolve when their uppercase form is a valid key. This is a strict superset of the prior accepted-input set — no caller that depended on undefined for those inputs should exist (the API contract is "valid → enum value, invalid → undefined").
  • No public-API breakage.

Test plan

  • Manual: resolveGgmlTypeOption("q8_0") now returns GgmlType.Q8_0; "Q8_0" still does.
  • Manual: invalid strings ("nope", "") still return undefined.
  • CI on this PR.

🤖 Generated with Claude Code

Previously `resolveGgmlTypeOption("q8_0")` returned `undefined` because
the lookup used `Object.hasOwn(GgmlType, option)` against the
uppercase-by-convention enum keys (`Q8_0`, `F16`, `BF16`, ...). The
caller then fell back to F16 silently — no warning, no error — making
this a frustrating-to-debug "why is my Q8 KV cache acting like FP16"
class of issue.

This change uppercases the input string before the lookup. Numeric
GgmlType enum values are still accepted unchanged. Empty / null /
unrecognized values still return `undefined`.

Affects:
  - `experimentalKvCacheKeyType` / `experimentalKvCacheValueType` on
    LlamaContextOptions and LlamaModelOptions
  - The CLI's `--kv-cache-key-type` / `--kv-cache-value-type` flags
    (`resolveCommandGgufPath.ts`)

After:
  resolveGgmlTypeOption("q8_0")  // => GgmlType.Q8_0
  resolveGgmlTypeOption("Q8_0")  // => GgmlType.Q8_0  (unchanged)
  resolveGgmlTypeOption("bf16")  // => GgmlType.BF16
  resolveGgmlTypeOption("nope")  // => undefined      (unchanged)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant