throwaway: NLD latency instrumentation N=2000 (do not merge)#12676
Draft
evelyn-with-warp wants to merge 1 commit into
Add timing instrumentation to detect_and_set_input_type to measure nld_prompt_history() clone cost and most_recent_close_match scan cost. Emits NLD_LATENCY log lines. This is a throwaway branch for measurement only. Co-Authored-By: Oz <oz-agent@warp.dev>
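The instrumentation described in the commit message can be sketched roughly as below. This is a minimal sketch, not the code on the branch. `fake_prompt_history`, the body of `most_recent_close_match` (a newest-first prefix scan), the `clone_us` field name, and the exact layout of the log line are all assumptions made for illustration. Only the `NLD_LATENCY` tag, the `prompt_scan_us` field name, and the use of `Instant` come from this PR.

```rust
use std::time::Instant;

/// Hypothetical stand-in for the prompt history; the real type is not shown in this PR.
fn fake_prompt_history(n: usize) -> Vec<String> {
    (0..n).map(|i| format!("example prompt number {i}")).collect()
}

/// Hypothetical stand-in for most_recent_close_match: scans newest-first and
/// returns the first prompt that starts with the current buffer.
fn most_recent_close_match<'a>(history: &'a [String], buffer: &str) -> Option<&'a str> {
    history
        .iter()
        .rev()
        .map(String::as_str)
        .find(|p| p.starts_with(buffer))
}

/// Times the history clone and the prompt scan separately and returns one
/// grep-able line (assumed format) instead of writing to a real logger.
fn detect_with_timing(history_src: &[String], buffer: &str) -> String {
    // Measure the cost of cloning the whole history.
    let clone_start = Instant::now();
    let history = history_src.to_vec();
    let clone_us = clone_start.elapsed().as_micros();

    // Measure the cost of scanning the cloned history for a match.
    let scan_start = Instant::now();
    let matched = most_recent_close_match(&history, buffer).is_some();
    let prompt_scan_us = scan_start.elapsed().as_micros();

    format!(
        "NLD_LATENCY n={} clone_us={} prompt_scan_us={} matched={}",
        history.len(),
        clone_us,
        prompt_scan_us,
        matched
    )
}
```

Calling `detect_with_timing(&fake_prompt_history(2000), "curl ")` yields one line per detection that can be filtered with `grep NLD_LATENCY`. Timing the clone and the scan with separate `Instant` values keeps the two costs from bleeding into each other.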
NLD History-Match Latency: N=2000 Measurement (throwaway)
DO NOT MERGE — This is a throwaway instrumentation branch for the NLD latency stress test (N=2000). See plan for context.
What this does
Adds Instant timing around the nld_prompt_history() clone and the most_recent_close_match prompt scan in detect_and_set_input_type, emitting a grep-able NLD_LATENCY log line per detection.
Results (N=2000, in-process Rust benchmark)
Full-query scan (prompt_scan_us p50/p90):
Curl URL intermediate spike (buffer ≈ seed length ~39-42 chars):
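For reference, p50/p90 figures like the ones named above could be derived from the emitted log lines along the following lines. This is a sketch under assumptions: the log line layout (`key=value` tokens separated by whitespace) and the nearest-rank percentile method are illustrative choices, not something this PR specifies.

```rust
/// Extracts the prompt_scan_us value from one log line.
/// Assumes whitespace-separated key=value tokens after the NLD_LATENCY tag.
fn parse_prompt_scan_us(line: &str) -> Option<u64> {
    if !line.contains("NLD_LATENCY") {
        return None;
    }
    line.split_whitespace()
        .find_map(|tok| tok.strip_prefix("prompt_scan_us="))
        .and_then(|v| v.parse().ok())
}

/// Nearest-rank percentile over microsecond samples.
/// Returns None for an empty slice or a percentile outside 1..=100.
fn percentile_us(samples: &[u64], pct: u32) -> Option<u64> {
    if samples.is_empty() || pct == 0 || pct > 100 {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_unstable();
    // 1-based rank = ceil(pct / 100 * n), computed in integer arithmetic.
    let rank = (pct as usize * sorted.len() + 99) / 100;
    Some(sorted[rank - 1])
}
```

Collecting `parse_prompt_scan_us` over every log line and passing the result to `percentile_us(&samples, 50)` and `percentile_us(&samples, 90)` gives the two summary numbers. Nearest-rank always returns an observed sample, which is convenient when the interesting cases are discrete spikes.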
Conversation: https://staging.warp.dev/conversation/51ba4553-dc0d-4479-bac4-36983518d0b6
Run: https://oz.staging.warp.dev/runs/019ecd70-9a72-77a4-8555-8631bdfc5edb
This PR was generated with Oz.