AsyncPipeline.stream POC #11258

Draft
anakin87 wants to merge 16 commits into main from streaming-poc

Conversation

anakin87 (Member) commented May 5, 2026

Related Issues

Proposed Changes:

  • expose an AsyncPipeline.stream method that returns a PipelineStreamHandle
    • async iterator enabling async for chunk in handle
    • exposes a result field to get the final result
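A minimal sketch of the handle idea, assuming a toy producer in place of a real pipeline. `StreamHandle` and `fake_pipeline` are hypothetical names used only for illustration; this is not the `PipelineStreamHandle` implementation from the PR.

```python
import asyncio
from typing import Any, Awaitable, Callable


class StreamHandle:
    """Hypothetical stand-in: an async iterator that also keeps the final result."""

    _DONE = object()  # sentinel marking the end of the stream

    def __init__(self, run: Callable[..., Awaitable[Any]]) -> None:
        self._run = run
        self._queue = asyncio.Queue()
        self._task = None
        self.result = None  # filled in once the producer finishes

    async def _worker(self) -> None:
        try:
            # The producer receives an async "emit" function and returns the final output.
            self.result = await self._run(self._queue.put)
        finally:
            await self._queue.put(self._DONE)

    def __aiter__(self) -> "StreamHandle":
        if self._task is None:
            self._task = asyncio.create_task(self._worker())
        return self

    async def __anext__(self) -> Any:
        item = await self._queue.get()
        if item is self._DONE:
            await self._task  # re-raises any error from the producer
            raise StopAsyncIteration
        return item


async def fake_pipeline(emit):
    # Assumption: stands in for a pipeline run that streams tokens.
    for token in ["Hello", ", ", "world"]:
        await emit(token)
    return {"llm": {"replies": ["Hello, world"]}}


async def main():
    handle = StreamHandle(fake_pipeline)
    chunks = [chunk async for chunk in handle]
    return chunks, handle.result


chunks, result = asyncio.run(main())
```

After the loop ends, `result` holds the final output, which mirrors the `handle.result` behaviour described above.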

Choices and limitations:

  • at pipeline level: this unlocks most features without changing much code
  • async-only: the use case is async; trying to make it work with sync could create more problems than benefits
  • to be able to return the final result, I am not using a generator but an async iterator (PipelineStreamHandle). This currently makes the integration with Hayhooks not exactly ergonomic, but we can work to improve it
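For context on the last point: Python rejects `return <value>` inside an async generator, so a plain `async def` generator has no built-in way to hand back a final result. The snippet below only demonstrates that language rule; the function name `stream` inside the string is illustrative.

```python
# Source of an async generator that tries to return a final value.
SOURCE = '''
async def stream():
    yield "chunk"
    return {"final": "result"}
'''

try:
    compile(SOURCE, "<example>", "exec")
    rejected = False
except SyntaxError:
    # CPython refuses this at compile time:
    # a 'return' with a value is not allowed in an async generator.
    rejected = True
```

A separate handle object with a `result` attribute is one way around this restriction.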

Does not contain breaking changes, but it is intended for Haystack 3.

How did you test it?

Checklist

  • I have read the contributors guidelines and the code of conduct.
  • I have updated the related issue with new insights and changes.
  • I have added unit tests and updated the docstrings.
  • I've used one of the conventional commit types for my PR title: fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test: and added ! in case the PR includes breaking changes.
  • I have documented my code.
  • I have added a release note file, following the contributors guidelines.
  • I have run pre-commit hooks and fixed any issue.

vercel (Bot) commented May 5, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project: haystack-docs | Deployment: Ignored | Actions: Preview | Updated (UTC): May 15, 2026 1:28pm

@anakin87 anakin87 added the ignore-for-release-notes PRs with this flag won't be included in the release notes. label May 5, 2026
github-actions (Bot, Contributor) commented May 5, 2026

Coverage report

Coverage table columns: File, Statements, Missing, Coverage, Coverage (new stmts), Lines missing
Rows: haystack/core/pipeline/async_pipeline.py, haystack/dataclasses/streaming_chunk.py, Project Total

This report was generated by python-coverage-comment-action

@anakin87 anakin87 changed the title from "Streaming poc" to "AsyncPipeline.stream POC" May 14, 2026
anakin87 (Member, Author) commented

@mpangrazzi @sjrl I'd appreciate your thoughts

sjrl (Contributor) commented May 15, 2026

@anakin87 it's looking good! I took a look at your demo repo and it looks reasonable to me. I think @mpangrazzi can speak better to the implications of integrating this with Hayhooks. Perhaps requiring the use of an AsyncIterable complicates integration with FastAPI? (I'm not really sure.)

The major question I had is what this would look like for the sync pipeline. You mention that it could be problematic to set up, and I was wondering if you could explain in more detail what you mean by that.

Comment thread haystack/core/pipeline/async_pipeline.py
Iterate the handle to consume chunks; after iteration ends, `handle.result` holds the final pipeline output dict
(same as `run_async`).

For every async-capable component whose `run_async` accepts `streaming_callback`, a forwarder is injected
Contributor

Let's be a little more explicit here to say that we only check if the input param is called streaming_callback in run_async. I don't think we do any type checking past that right?
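As a rough illustration of a name-only check like the one discussed here, the sketch below inspects the `run_async` signature for a parameter called `streaming_callback` and does no type checking. `accepts_streaming_callback` and the three toy classes are hypothetical; the actual check in the PR may differ.

```python
import inspect


def accepts_streaming_callback(component) -> bool:
    """Hypothetical helper: True if run_async has a parameter named streaming_callback."""
    run_async = getattr(component, "run_async", None)
    if run_async is None:
        return False
    return "streaming_callback" in inspect.signature(run_async).parameters


class ToyGenerator:
    async def run_async(self, prompt: str, streaming_callback=None):
        return {"replies": [prompt]}


class ToyRetriever:
    async def run_async(self, query: str):
        return {"documents": []}


class ToyOddComponent:
    # The parameter has the right name but an unrelated type; a name-only check still matches.
    async def run_async(self, streaming_callback: int = 0):
        return {}


generator_matches = accepts_streaming_callback(ToyGenerator())
retriever_matches = accepts_streaming_callback(ToyRetriever())
odd_matches = accepts_streaming_callback(ToyOddComponent())
```

The third case shows why the docstring should state that only the parameter name is checked.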

Member Author

right. I'll fix it

Comment thread haystack/core/pipeline/async_pipeline.py
Comment thread haystack/core/pipeline/async_pipeline.py Outdated
Comment on lines +746 to +749
    if is_callable_async_compatible(user_callback):
        await cast(AsyncStreamingCallbackT, user_callback)(chunk)
    else:
        cast(SyncStreamingCallbackT, user_callback)(chunk)
Contributor

Oh interesting so we want to support both async and sync streaming callbacks?

Member Author

Maybe in the future, but here it was wrong. I am now switching to the same select_streaming_callback utility we use everywhere.

Comment on lines +759 to +760
    if not isinstance(comp_inputs, dict):
        continue
Contributor

Could you explain what this is catching? Do we allow non-dict entries in our data?

Member Author

Relic of the initial vibe-coded solution. Removing it now.

Co-authored-by: Sebastian Husch Lee <10526848+sjrl@users.noreply.github.com>
    if not isinstance(comp_inputs, dict):
        continue
    runtime_callback: StreamingCallbackT | None = comp_inputs.get("streaming_callback")
    init_callback: StreamingCallbackT | None = getattr(instance, "streaming_callback", None)
Contributor

This feels a bit fragile: for example, the callback could be stored under a different attribute name. At the very least, could we update the docstring sentence "The forwarder composes with any user-supplied streaming_callback." to mention how users can or should supply a streaming callback?

Contributor

Also should we allow users to directly provide a streaming callback in the stream method?

Member Author

  1. I now tried to explain better in the docstring (b72c878). LMK if still unclear/fragile
  2. Yes, can be passed via data (now explained)
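A small sketch of how a forwarder might compose with a user callback supplied through the run data or at component init. `make_forwarder`, `resolve_user_callback`, and `FakeGenerator` are hypothetical, and the "runtime callback wins over init callback" precedence is an assumption made for the example, not a statement about the PR's code.

```python
import asyncio


def resolve_user_callback(instance, comp_inputs):
    """Hypothetical: prefer a callback passed in the run data, else the init-time one."""
    runtime_callback = comp_inputs.get("streaming_callback")
    init_callback = getattr(instance, "streaming_callback", None)
    return runtime_callback or init_callback


def make_forwarder(queue, user_callback=None):
    """Hypothetical: push each chunk to the stream queue, then call the user callback."""

    async def forwarder(chunk):
        await queue.put(chunk)
        if user_callback is not None:
            await user_callback(chunk)

    return forwarder


class FakeGenerator:
    def __init__(self, streaming_callback=None):
        self.streaming_callback = streaming_callback


async def main():
    seen = []

    async def user_cb(chunk):
        seen.append(chunk)

    queue = asyncio.Queue()
    component = FakeGenerator()
    # The user supplies the callback through the per-component input data.
    data = {"llm": {"streaming_callback": user_cb}}
    forwarder = make_forwarder(queue, resolve_user_callback(component, data["llm"]))
    await forwarder("tok")
    return seen, queue.get_nowait()


seen, queued = asyncio.run(main())
```

Here the same chunk reaches both the stream consumer (through the queue) and the user's own callback.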

anakin87 (Member, Author) commented May 15, 2026

> The major question I had is what this would look like for the sync pipeline. You mention that it could be problematic to set up, and I was wondering if you could explain in more detail what you mean by that.

I think that this new feature provides value to async users in APIs (#8742, #9347).

For sync users, the streaming_callback mechanism already gives what they need.
So I'd simply not expose Pipeline.stream.

Why?

What stream does: the pipeline produces chunks in the background and the consumer pulls them on demand.
In async, this is natural.
In sync, it would require another thread, which brings many problems for no benefit in the Haystack world: cancellation problems, exception propagation, tracing context loss, ...
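To make the thread concern concrete, here is a rough sketch of what a sync bridge could look like, assuming a worker thread and a queue. `stream_from_thread` is a hypothetical helper, not code from Haystack or Hayhooks. Note how errors have to be shipped across the queue by hand, and how nothing stops the worker if the consumer abandons the iterator.

```python
import queue
import threading
from typing import Callable, Iterator

_DONE = object()  # sentinel marking the end of the stream


def stream_from_thread(produce: Callable[[Callable[[str], None]], None]) -> Iterator[str]:
    """Hypothetical bridge: run `produce` in a thread and yield what it emits."""
    chunks: queue.Queue = queue.Queue()

    def worker() -> None:
        try:
            produce(chunks.put)
        except BaseException as exc:
            # Exceptions do not cross threads on their own: forward them manually.
            chunks.put(exc)
        finally:
            chunks.put(_DONE)

    # If the consumer stops iterating early, this thread keeps running to completion.
    threading.Thread(target=worker, daemon=True).start()

    while True:
        item = chunks.get()
        if item is _DONE:
            return
        if isinstance(item, BaseException):
            raise item
        yield item


def ok_producer(emit):
    emit("a")
    emit("b")


def failing_producer(emit):
    emit("a")
    raise ValueError("boom")


ok_chunks = list(stream_from_thread(ok_producer))

failing_stream = stream_from_thread(failing_producer)
first_chunk = next(failing_stream)
try:
    next(failing_stream)
    error_seen = False
except ValueError:
    error_seen = True
```

Even this small version needs explicit plumbing for errors, and it does nothing about cancellation or carrying tracing context into the worker thread.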

In Hayhooks, motivated by the API context and by the fact that most old pipelines are sync, we bridge sync pipelines into the async streaming endpoint but, from talking with Michele, this creates similar problems.

Labels

ignore-for-release-notes PRs with this flag won't be included in the release notes. topic:core topic:tests type:documentation Improvements on the docs

2 participants