streaming cost updates #77
Open
dixitaniket wants to merge 1 commit into
dixitaniket (Collaborator) commented May 21, 2026
- cost for streaming response
Contributor
Pull request overview
Adds a streaming-only billing side-channel to /v1/ohttp chunked responses so the relay can learn settled cost after the stream completes.
Changes:
- Introduces a private billing-frame marker (OHTTP_BILLING_FRAME_MAGIC) and _build_billing_frame() to serialize SessionCost into a plaintext frame.
- Updates the streaming response generator to parse the final SSE JSON, set inner cost context, and (optionally) inject the billing frame before emitting the final encrypted chunk.
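To make the side-channel concrete, here is a minimal sketch of how a plaintext billing frame could be built on the gateway and recognized on the relay. The PR excerpt does not show the marker value, the frame layout, or the fields of SessionCost, so everything below is an assumption: the magic bytes, the 4-byte big-endian length prefix, the JSON payload, the three cost fields, and the helper names build_billing_frame and split_billing_frame are all hypothetical.

```python
import json
import struct
from dataclasses import asdict, dataclass
from typing import Optional, Tuple

# Hypothetical marker; the real OHTTP_BILLING_FRAME_MAGIC value is not shown in the PR.
BILLING_FRAME_MAGIC = b"\x00OHTTP-BILLING\x00"


@dataclass
class SessionCost:
    # Hypothetical fields, chosen only for illustration.
    input_tokens: int
    output_tokens: int
    total_cost_usd: str


def build_billing_frame(cost: Optional[SessionCost]) -> Optional[bytes]:
    """Serialize a cost record as: magic | 4-byte big-endian length | JSON payload."""
    if cost is None:
        # No settled cost: the caller skips the frame entirely.
        return None
    payload = json.dumps(asdict(cost), sort_keys=True).encode("utf-8")
    return BILLING_FRAME_MAGIC + struct.pack(">I", len(payload)) + payload


def split_billing_frame(chunk: bytes) -> Tuple[Optional[dict], bytes]:
    """Relay side: return (cost, rest). cost is None when chunk is not a billing frame."""
    if not chunk.startswith(BILLING_FRAME_MAGIC):
        return None, chunk
    header_end = len(BILLING_FRAME_MAGIC) + 4
    if len(chunk) < header_end:
        raise ValueError("truncated billing frame header")
    (length,) = struct.unpack(">I", chunk[len(BILLING_FRAME_MAGIC):header_end])
    payload_end = header_end + length
    if len(chunk) < payload_end:
        raise ValueError("truncated billing frame payload")
    return json.loads(chunk[header_end:payload_end]), chunk[payload_end:]
```

In this sketch the relay strips the frame and forwards only the remaining bytes. Because the frame is plaintext, the marker must be something that cannot be confused with the start of an encrypted chunk; how the PR guarantees that is not visible in this excerpt.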
Comments suppressed due to low confidence (1)
tee_gateway/controllers/ohttp_controller.py:297
response_json is parsed from b"".join(plaintext_chunks) here, and then _set_inner_stream_cost_context() in the finally block joins and parses the same chunks again. For large streams this duplicates O(n) copying/parsing and retains two full concatenations in memory. Consider refactoring so the SSE parse happens once (e.g., compute response_json once and pass it into the final context setter, or skip the _set_inner_stream_cost_context call when response_json has already been derived).
        response_json = _parse_final_sse_json(b"".join(plaintext_chunks))
        _set_inner_cost_context(
            cost_context,
            response_json=response_json,
            status_code=status,
        )
        billing_frame = _build_billing_frame(response_json)
        if billing_frame:
            yield billing_frame
        # Always emit exactly one final chunk so the AAD=b"final"
        # marker is present; that's what protects clients from
        # undetected truncation.
        yield encrypter.encrypt_chunk(pending or b"", is_final=True)
    finally:
        _set_inner_stream_cost_context(
            cost_context, plaintext_chunks, status_code=status
        )
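One way to act on the suggestion above is to remember whether the parse already happened and only fall back to parsing in the finally block when it did not. The sketch below is a simplified stand-in, not the PR's code: parse_final_sse_json is a hypothetical re-implementation that keeps the last JSON `data:` event, the cost context is a plain dict, and the final encrypted chunk is replaced by a placeholder byte string.

```python
import json
from typing import Any, Callable, Dict, Iterator, List, Optional


def parse_final_sse_json(body: bytes) -> Optional[Dict[str, Any]]:
    """Hypothetical stand-in: return the last JSON object found in `data:` lines."""
    last = None
    for line in body.decode("utf-8", errors="replace").splitlines():
        if not line.startswith("data:"):
            continue
        data = line[len("data:"):].strip()
        if not data or data == "[DONE]":
            continue
        try:
            last = json.loads(data)
        except json.JSONDecodeError:
            continue
    return last if isinstance(last, dict) else None


def stream_with_single_parse(
    plaintext_chunks: List[bytes],
    cost_context: Dict[str, Any],
    status: int,
    parse: Callable[[bytes], Optional[Dict[str, Any]]] = parse_final_sse_json,
) -> Iterator[bytes]:
    response_json: Optional[Dict[str, Any]] = None
    parsed = False
    try:
        response_json = parse(b"".join(plaintext_chunks))
        parsed = True
        # Placeholder for encrypter.encrypt_chunk(pending or b"", is_final=True).
        yield b"final-chunk"
    finally:
        # Runs on normal completion, on error, and when the consumer closes early.
        if not parsed:
            response_json = parse(b"".join(plaintext_chunks))
        cost_context["response_json"] = response_json
        cost_context["status_code"] = status
```

With this shape the join and parse run once on the normal path, while the finally block still records a cost context if the generator exits before reaching the parse.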
Comment on lines +286 to +288
        billing_frame = _build_billing_frame(response_json)
        if billing_frame:
            yield billing_frame