Add Performance settings pane with opt-in LLM latency tracking #471
Merged
New "Performance" sidebar entry with an off-by-default toggle that, when enabled, records the timestamp, model name, and latency_ms of every LLM generation into a 100-entry ring buffer persisted in UserDefaults. The router records on the engine boundary so both Apple Intelligence and llama paths (including the AI->llama locale fallback) are captured under the same gate. The pane renders the buffer as a newest-first table with a Clear button.
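The capped buffer described above could be sketched roughly as follows. This is an illustration only, not the PR's PerformanceMetricsStore: the names MetricEntry and MetricRingBuffer are hypothetical, and the field types are assumed from the UUID/Date/String/Int description.

```swift
import Foundation

// Hypothetical entry shape; field names and types are assumptions for illustration.
struct MetricEntry: Codable, Equatable {
    let id: UUID
    let timestamp: Date
    let modelName: String
    let latencyMs: Int
}

// Minimal capped buffer: appends at the end and drops the oldest entries
// once the count exceeds `capacity`.
struct MetricRingBuffer {
    let capacity: Int
    private(set) var entries: [MetricEntry] = []

    init(capacity: Int = 100) {
        precondition(capacity > 0, "capacity must be positive")
        self.capacity = capacity
    }

    mutating func append(_ entry: MetricEntry) {
        entries.append(entry)
        if entries.count > capacity {
            entries.removeFirst(entries.count - capacity)
        }
    }

    // Newest-first view, matching the order the pane's table is described as using.
    var newestFirst: [MetricEntry] { Array(entries.reversed()) }

    // Mirrors the Clear button: drops every stored entry.
    mutating func clear() { entries.removeAll() }
}
```

An array with removeFirst is O(n) per eviction, which is negligible at 100 entries; a true circular index would only matter for much larger caps.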
The persist path silently swallowed JSONEncoder errors, which would have made "metrics vanish between sessions" undiagnosable. The encode is not expected to fail for this Codable (UUID/Date/String/Int only), so any real failure here points at something fundamentally wrong — log at error level via CotabbyLogger.app so it surfaces in cotabby.jsonl. Addresses Greptile P2 on FuJacob#471.
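A minimal sketch of the fix described in this commit, assuming a generic persist helper. The function name persist and the logError closure are hypothetical stand-ins; the real change logs through CotabbyLogger.app, whose API is not shown in this conversation.

```swift
import Foundation

// Encodes `value` as JSON and writes it to UserDefaults.
// On encode failure the error is reported through `logError` instead of being
// swallowed, and nothing is written. Returns true when the write happened.
@discardableResult
func persist<T: Encodable>(
    _ value: T,
    to defaults: UserDefaults,
    forKey key: String,
    logError: (String) -> Void
) -> Bool {
    do {
        let data = try JSONEncoder().encode(value)
        defaults.set(data, forKey: key)
        return true
    } catch {
        // Assumption: the real code logs at error level so the failure is diagnosable.
        logError("Failed to encode value for key \(key): \(error)")
        return false
    }
}
```

Passing the logger in as a closure keeps the sketch independent of any particular logging type and makes the failure path easy to exercise in a test.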
The previous regenerate was contaminated by an Xcode auto-touch that swapped DEVELOPMENT_TEAM into per-config build settings instead of the project-level TargetAttributes that xcodegen emits from project.yml. CI's drift guard caught the divergence. Re-running xcodegen produces the canonical output; no project.yml changes needed.
FuJacob (Owner) approved these changes on May 31, 2026 and left a comment:
Thanks for this!! this is definitely gonna help a lot
Summary
Adds a new "Performance" Settings pane with an off-by-default toggle that records the timestamp, model name, and latency_ms of every LLM generation into a 100-entry ring buffer (persisted as JSON in UserDefaults). When tracking is on, both engines and the AI→llama locale fallback are captured at the router boundary, so the same gate controls all paths.
This is purely informational: the captured data is shown only inside the Performance pane and is intended to help users (and contributors) eyeball variations in per-model latency, so they can compare how different llama models or Apple Intelligence perform on their own machine. Nothing in the autocomplete pipeline reads these metrics back; they don't influence routing, ranking, or any user-facing behavior.
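The single-gate recording at the router boundary might look roughly like the sketch below. Every type here (MetricsRecorder, EngineError, EngineResult, Router) is a hypothetical stand-in rather than the PR's SuggestionEngineRouter or PerformanceMetricsStore, and the engine calls are synchronous closures for brevity.

```swift
import Foundation

// Hypothetical stand-in for the metrics store: one flag gates every record call.
final class MetricsRecorder {
    var isTrackingEnabled = false
    private(set) var recorded: [(modelName: String, latencyMs: Int)] = []

    func record(modelName: String, latencyMs: Int) {
        // Off-path cost is a single Bool check; nothing is stored.
        guard isTrackingEnabled else { return }
        recorded.append((modelName: modelName, latencyMs: latencyMs))
    }
}

enum EngineError: Error { case unsupportedLanguageOrLocale }

struct EngineResult {
    let text: String
    let latencyMs: Int
}

// Hypothetical router: records after each successful engine response,
// including the fallback taken when the primary engine rejects the locale.
struct Router {
    let recorder: MetricsRecorder
    let primary: () throws -> EngineResult
    let fallback: () throws -> EngineResult
    let fallbackModelName: () -> String

    func generate() throws -> EngineResult {
        do {
            let result = try primary()
            recorder.record(modelName: "Apple Intelligence", latencyMs: result.latencyMs)
            return result
        } catch EngineError.unsupportedLanguageOrLocale {
            let result = try fallback()
            recorder.record(modelName: fallbackModelName(), latencyMs: result.latencyMs)
            return result
        }
    }
}
```

Because the recorder is only called from the router, neither engine needs to know whether tracking is enabled.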
Validation
UI checked manually: opened Settings → Performance, flipped the toggle on, triggered suggestions, watched entries appear newest-first with monospaced timestamp/duration columns and an enabled Clear button. Toggling off short-circuits recording at the router; existing entries remain until cleared. Empty state copy switches between "tracking is off" and "no requests yet" based on the toggle.
swiftlint lint --quiet not run locally (binary not installed on this machine); CI will gate.
Linked issues
None.
Risk / rollout notes
cotabbyPerformanceTrackingEnabled defaults to false and the ring buffer is empty on first launch; existing users see no behavior change until they opt in.
cotabbyPerformanceTrackingEnabled: Bool
cotabbyPerformanceMetricEntries: Data (JSON-encoded [PerformanceMetricEntry], capped at 100)
Bool read and returns: no allocation, no UserDefaults write.
SuggestionEngineRouter now takes performanceMetricsStore and a llamaModelNameProvider closure; SettingsCoordinator / SettingsContainerView thread a shared PerformanceMetricsStore.
LlamaRuntimeManager gains a read-only currentModelFilename accessor so the recorder can label entries with the actual GGUF filename.
xcodegen generate for the two new files (PerformanceMetricsStore.swift, PerformancePaneView.swift). Auto-discovery handled them; no manual project.yml edits.
Greptile Summary
Adds an opt-in "Performance" Settings pane that records the timestamp, model name, and latency_ms of every LLM generation into a 100-entry ring buffer persisted via UserDefaults. The toggle defaults to false, so no existing user sees any behavior change.
PerformanceMetricsStore is a new @MainActor ObservableObject that owns the ring buffer; SuggestionEngineRouter (already @MainActor) calls recordPerformanceMetric after every successful engine response, covering the direct Apple Intelligence path, the direct llama path, and the locale-fallback llama path.
PerformancePaneView renders a newest-first table of captured entries with inline empty-state copy that distinguishes "tracking is off" from "no entries yet", plus a Clear button that removes the UserDefaults blob.
SuggestionSettingsModel gains an isPerformanceTrackingEnabled flag (defaults false) wired to UserDefaults; SettingsCoordinator, SettingsContainerView, and CotabbyAppEnvironment are updated to thread the shared store through composition.
Confidence Score: 5/5
Safe to merge — the recording gate defaults off, the hot path is a single bool read, and actor isolation is handled correctly throughout.
The feature is entirely additive and off by default. The router is correctly @MainActor, so all calls into PerformanceMetricsStore are properly isolated. The ring buffer cap keeps UserDefaults writes bounded and infrequent. No existing behaviour is changed for users who never visit the Performance pane.
No files require special attention; the only note is a cosmetic usability gap in PerformancePaneView's timestamp formatter.
Important Files Changed
Sequence Diagram
sequenceDiagram
    participant SC as SuggestionCoordinator
    participant SER as SuggestionEngineRouter (@MainActor)
    participant FM as FoundationModelEngine
    participant LE as LlamaEngine
    participant PMS as PerformanceMetricsStore
    participant UD as UserDefaults
    SC->>SER: generateSuggestion(request)
    alt "selectedEngine == .appleIntelligence"
        SER->>FM: generateSuggestion(request)
        FM-->>SER: SuggestionResult (latency)
        SER->>SER: recordPerformanceMetric(Apple Intelligence, latency)
        SER->>PMS: record(modelName:latencyMs:) [if tracking enabled]
        PMS->>UD: persist JSON blob
    else unsupportedLanguageOrLocale fallback
        SER->>FM: generateSuggestion(request)
        FM-->>SER: throws unsupportedLanguageOrLocale
        SER->>LE: generateSuggestion(request)
        LE-->>SER: SuggestionResult (latency)
        SER->>SER: recordPerformanceMetric(llamaModelName, latency)
        SER->>PMS: record(modelName:latencyMs:) [if tracking enabled]
        PMS->>UD: persist JSON blob
    else "selectedEngine == .llamaOpenSource"
        SER->>LE: generateSuggestion(request)
        LE-->>SER: SuggestionResult (latency)
        SER->>SER: recordPerformanceMetric(llamaModelName, latency)
        SER->>PMS: record(modelName:latencyMs:) [if tracking enabled]
        PMS->>UD: persist JSON blob
    end
    SER-->>SC: SuggestionResult
Reviews (3): Last reviewed commit: "Regenerate Cotabby.xcodeproj to match pr..."