[None][feat] Add KV cache prefetch#14748
Conversation
Signed-off-by: Yao Yao <lowsfer@users.noreply.github.com>
📝 WalkthroughWalkthroughThis PR introduces a ChangesKV Cache Prefetch Implementation
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
tests/unittest/kv_cache_manager_v2_tests/test_kv_cache_manager_v2.py (1)
1278-1293: ⚡ Quick winCoverage is strong for the happy path, but failure-path coverage is still insufficient.
Please add negative tests in
tests/unittest/kv_cache_manager_v2_tests/test_kv_cache_manager_v2.pyfor: (1) callingprefetchwhen cache status is notSUSPENDED, and (2) migration failure path to verify eviction scheduling is restored correctly after exceptions.As per coding guidelines: “tests/**: Act as a QA engineer reviewing test changes and coverage… suggest concrete list file names and whether coverage is sufficient, insufficient, or needs follow-up outside the PR.”
Also applies to: 1348-1359
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unittest/kv_cache_manager_v2_tests/test_kv_cache_manager_v2.py` around lines 1278 - 1293, Summary: add two negative tests to cover prefetch when cache status is not SUSPENDED and the migration failure path to ensure eviction scheduling is restored. Add one test in tests/unittest/kv_cache_manager_v2_tests/test_kv_cache_manager_v2.py that sets the cache status to a non-SUSPENDED value and calls manager.prefetch (or _KVCache.prefetch) and asserts the call either raises the expected error or leaves state unchanged (no pages prefetched) by inspecting _KVCache._active_pages()/_page() counts; add a second test that simulates a migration exception by monkeypatching the migration helper (e.g., the method used during prefetch/migrate) to raise, call prefetch/migrate inside a pytest.raises context, and after the exception assert that pages at HOST_LEVEL that manager._storage.is_evictable(page) still have page.scheduled_for_eviction True (reuse assert_prefetched_pages_are_evictable logic or check _page/_active_pages directly) to verify eviction scheduling is restored.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@tests/unittest/kv_cache_manager_v2_tests/test_kv_cache_manager_v2.py`:
- Around line 1278-1293: Summary: add two negative tests to cover prefetch when
cache status is not SUSPENDED and the migration failure path to ensure eviction
scheduling is restored. Add one test in
tests/unittest/kv_cache_manager_v2_tests/test_kv_cache_manager_v2.py that sets
the cache status to a non-SUSPENDED value and calls manager.prefetch (or
_KVCache.prefetch) and asserts the call either raises the expected error or
leaves state unchanged (no pages prefetched) by inspecting
_KVCache._active_pages()/_page() counts; add a second test that simulates a
migration exception by monkeypatching the migration helper (e.g., the method
used during prefetch/migrate) to raise, call prefetch/migrate inside a
pytest.raises context, and after the exception assert that pages at HOST_LEVEL
that manager._storage.is_evictable(page) still have page.scheduled_for_eviction
True (reuse assert_prefetched_pages_are_evictable logic or check
_page/_active_pages directly) to verify eviction scheduling is restored.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: f6a39114-ede4-4c88-9b02-9ec33c8500e9
📒 Files selected for processing (4)
tensorrt_llm/runtime/kv_cache_manager_v2/__init__.pyitensorrt_llm/runtime/kv_cache_manager_v2/_core/_kv_cache.pytensorrt_llm/runtime/kv_cache_manager_v2/_storage_manager.pytests/unittest/kv_cache_manager_v2_tests/test_kv_cache_manager_v2.py
|
/bot run --disable-fail-fast |
|
PR_Github #51176 [ run ] triggered by Bot. Commit: |
Summary by CodeRabbit
New Features
Tests
Description
Adds a suspended KV cache prefetch path so callers can migrate pages to a target cache level, such as staging disk-resident pages into host memory before
resume()needs HBM.The storage prefetch path prepares target-level slots, migrates pages from lower tiers, and restores eviction scheduling for pages that remain evictable after migration.
Test Coverage
LD_LIBRARY_PATH=/home/yaoy/tekit/tensorrt_llm/libs PYTHONPATH=/home/yaoy/tekit/tensorrt_llm/runtime/ python /home/yaoy/tekit/tests/unittest/kv_cache_manager_v2_tests/test_kv_cache_manager_v2.py TestResizeQuota.test_resize_quota -vPR Checklist
Please review the following before submitting your PR:
PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
If PR introduces API changes, an appropriate PR label is added - either
api-compatibleorapi-breaking. Forapi-breaking, includeBREAKINGin the PR title.Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.
GitHub Bot Help
To see a list of available CI bot commands, please comment
/bot help.