[SYCL][Graph] Allow capturing restricted host tasks in native recording mode #22143
adamfidel wants to merge 10 commits
…d commands
Move EnqueueHostTaskData and NativeHostTask from the anonymous namespace in commands.cpp into host_task.hpp, so that both the native recording path in handler.cpp and the scheduler path in commands.cpp use the same type and callback rather than duplicating the pattern.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Do we need to check if the enqueue host task result is a success code? If it fails for any reason, then I think this could cause a leak because the host task will never free the allocation.
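A minimal sketch of the pattern this comment asks for, assuming a callback that frees its own data once it runs. Everything here is a hypothetical stand-in (`HostTaskData`, `Result`, `mockEnqueueHostTask`, `submitHostTask`); none of it is the real SYCL runtime or UR API. The point is only that ownership is handed to the callback after the enqueue reports success, so a failed enqueue cannot leak the allocation.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical stand-in for the per-host-task allocation.
struct HostTaskData {
  static inline int Live = 0; // counts live instances so leaks are visible
  std::function<void()> Func;
  explicit HostTaskData(std::function<void()> F) : Func(std::move(F)) { ++Live; }
  HostTaskData(const HostTaskData &) = delete;
  HostTaskData &operator=(const HostTaskData &) = delete;
  ~HostTaskData() { --Live; }
};

enum class Result { Success, Error }; // hypothetical result code

using Callback = void (*)(void *);

// Mock enqueue: on success the callback runs (here, immediately);
// on failure the callback is never invoked.
Result mockEnqueueHostTask(Callback CB, void *Data, bool Fail) {
  if (Fail)
    return Result::Error;
  CB(Data);
  return Result::Success;
}

// The callback takes ownership of the data and frees it after running.
void hostTaskCallback(void *Raw) {
  assert(Raw && "callback requires its data");
  std::unique_ptr<HostTaskData> Data(static_cast<HostTaskData *>(Raw));
  Data->Func();
}

Result submitHostTask(std::function<void()> Func, bool Fail) {
  auto Data = std::make_unique<HostTaskData>(std::move(Func));
  Result Res = mockEnqueueHostTask(hostTaskCallback, Data.get(), Fail);
  if (Res == Result::Success)
    Data.release(); // the callback owns the allocation from here on
  // On failure, Data goes out of scope and frees the allocation.
  return Res;
}
```

With this shape, `HostTaskData::Live` returns to zero on both the success path and the failure path.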
It looks like the allocation gets destroyed after the user's host task completes. What happens with multiple replays of the graph? Do we need some mechanism to keep HostTaskData alive until the graph is destroyed?
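One possible answer to the replay question, sketched under assumptions: the graph owns the callback data and the callback only borrows it, so the data survives any number of replays and is freed when the graph is destroyed. `MockGraph`, `ReplayableHostTaskData`, and `replayableCallback` are illustrative names, not types from the SYCL runtime.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the host task payload.
struct ReplayableHostTaskData {
  std::function<void()> Func;
};

// The callback borrows the data and does not free it, so it can be
// invoked once per replay.
void replayableCallback(void *Raw) {
  assert(Raw && "callback requires its data");
  static_cast<ReplayableHostTaskData *>(Raw)->Func();
}

// Hypothetical graph that keeps host task data alive for its own lifetime.
class MockGraph {
public:
  // Takes ownership of the callable and returns a stable pointer that a
  // native callback could receive as its user-data argument.
  void *addHostTask(std::function<void()> Func) {
    MHostTasks.push_back(std::make_unique<ReplayableHostTaskData>(
        ReplayableHostTaskData{std::move(Func)}));
    return MHostTasks.back().get();
  }

  // Stands in for one execution of the recorded graph.
  void replay() const {
    for (const auto &Task : MHostTasks)
      replayableCallback(Task.get());
  }

private:
  // Each payload is heap-allocated, so pointers stay valid as the vector grows.
  std::vector<std::unique_ptr<ReplayableHostTaskData>> MHostTasks;
};
```

Calling `replay()` three times on a graph with one host task runs the callable three times, and the payload is released only when the `MockGraph` goes out of scope.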
Does this API work with command buffer partitions?
If we want it to not partition the graph, we need to add urCommandBufferAppendHostTaskExp + SYCL logic, but I'm curious if it gets handled correctly with a graph partition.
    auto CallbackData = std::make_unique<detail::EnqueueHostTaskData>(
        detail::HandlerAccess::getHostTaskFunc(*HT->MHostTask));
    // Store callback in the graph so it is available during replays
    auto GraphImpl = Queue->getNativeRecordingGraph();
I think we need to consider the case where the queue is being recorded due to a fork in the graph. In this case, MNativeRecordingGraph should be empty as it was not the one begin_recording was called on.
I think caching the native graph is still viable, but we might need to combine it with UR get graph and the ur -> command_graph map we keep in the context. We would also need some way to clear these graph handles on end_recording for these non-primary queues.
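A rough sketch of the lookup this suggestion describes, under stated assumptions: a queue first tries its cached graph, then falls back to a handle-to-graph map held by the context, and drops the cache when recording ends. All names (`NativeGraphHandle`, `Context`, `Queue`, `resolveRecordingGraph`, `onEndRecording`) are hypothetical, and the `CapturingInto` field stands in for whatever the backend would report for a queue that joined the recording through a fork.

```cpp
#include <cstdint>
#include <memory>
#include <unordered_map>

// Hypothetical stand-in for a native (UR) graph handle; 0 means "none".
using NativeGraphHandle = std::uintptr_t;

struct GraphImpl {
  int Id;
};

// Hypothetical context holding the native-handle -> graph map.
struct Context {
  std::unordered_map<NativeGraphHandle, std::weak_ptr<GraphImpl>> GraphMap;
};

struct Queue {
  // Set directly only on the queue that begin_recording was called on.
  std::shared_ptr<GraphImpl> CachedGraph;
  // What the backend would report as the graph this queue is capturing into.
  NativeGraphHandle CapturingInto = 0;
};

// Resolve the graph a queue is recording into, including forked queues.
std::shared_ptr<GraphImpl> resolveRecordingGraph(Queue &Q, const Context &Ctx) {
  if (Q.CachedGraph)
    return Q.CachedGraph; // primary queue, or a previously resolved fork
  if (Q.CapturingInto == 0)
    return nullptr; // not recording at all
  auto It = Ctx.GraphMap.find(Q.CapturingInto);
  if (It == Ctx.GraphMap.end())
    return nullptr;
  std::shared_ptr<GraphImpl> Graph = It->second.lock();
  Q.CachedGraph = Graph; // cache for later submissions on this queue
  return Graph;
}

// Clear cached state so a non-primary queue does not keep a stale graph.
void onEndRecording(Queue &Q) {
  Q.CachedGraph.reset();
  Q.CapturingInto = 0;
}
```

The map stores `std::weak_ptr` so that the context does not extend the graph's lifetime; whether the real context map behaves this way is an assumption of the sketch.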
This PR allows SYCL Graph native recording mode to capture the new restricted host task defined in the sycl_ext_oneapi_enqueue_functions extension.