Skip to content

Syclbin unify with llvm object#22089

Draft
koparasy wants to merge 11 commits into
intel:syclfrom
koparasy:syclbin-unify-with-llvm-object
Draft

Syclbin unify with llvm object#22089
koparasy wants to merge 11 commits into
intel:syclfrom
koparasy:syclbin-unify-with-llvm-object

Conversation

@koparasy
Copy link
Copy Markdown
Contributor

@koparasy koparasy commented May 22, 2026

Draft PR generated by claude to finalize #20752 .

This is primarily a PR for me to see implementation. I plan to split this in multiple PRs once I feel I have the entire implementation aligned.

Locally all tests pass.

Points for discussion:

  • syclbin.cpp references symbols from LLVMObject/LLVMSupport. Pre-existing CMake gate in (sycl/source/CMakeLists.txt) only links those LLVM libs into the SYCL runtime flavour whose MSVC C runtime (CRT, e.g. /MD vs /MDd) matches the CRT used to build LLVM. Other flavour skips the link to avoid mixing CRTs in a single DLL (different heap, different errno, ABI hazard). To fix this, we mirror the existing SYCL_ENABLE_STACK_PRINTING / LLVMSupport pattern in the same file: define a compile-time macro (SYCL_RT_HAS_LLVMOBJECT) only when the CRT-match gate passes and the LLVM libs are actually linked. syclbin.cpp is gated on that macro: when defined, full SYCLBIN parsing via llvm::object::SYCLBIN; when undefined, throwing-stub implementations of SYCLBINBinaries's public methods that raise errc::feature_not_supported. Effect: every other SYCL feature (kernels, queues, USM, source kernel_compiler via sycl-jit, AOT device-image loading) keeps working in the mismatched-CRT flavour. Only the SYCLBIN-specific kernel_bundle constructors and ext_oneapi_get_content throw, with a clear message explaining that LLVM SYCLBIN reader was not linked into this runtime flavour. Same approach already used by kernel_compiler_sycl.cpp when sycl-jit is disabled.

koparasy and others added 11 commits May 21, 2026 16:26
Removes the in-tree copy of llvm::object::SYCLBIN and the
PropertySetRegistry duplicate from the SYCL runtime, replacing them with
direct use of the canonical implementations in LLVMObject and
LLVMSupport. Eliminates the format-drift class of bug already exemplified
by #34774 (runtime parser had to be patched independently to follow an
OffloadBinary v1 -> v2 bump in LLVM).

The SYCL runtime's libsycl now links LLVMObject (and transitively
LLVMSupport / LLVMBinaryFormat / LLVMTargetParser). To keep LLVM headers
out of runtime headers and avoid leaking LLVM types into
kernel_bundle_impl.hpp and friends, SYCLBINBinaries is rewritten as a
PIMPL: the public interface uses only sycl-native types, and all
llvm::object::SYCLBIN / llvm::util::PropertySetRegistry state lives in
syclbin.cpp behind an opaque Impl.

The on-disk SYCLBIN format is unchanged. -fvisibility=hidden plus the
existing ld-version-script keep all LLVM symbols internal to libsycl.

Local libsycl.so size delta: ~+416 KB (5.55 MB -> 5.97 MB).

Tests: check-sycl-unittests, check-sycl, SYCLBIN lit tests all pass.
Replaces the SYCLBIN-private inner header tables (FileHeaderType,
AbstractModuleHeaderType, IRModuleHeaderType,
NativeDeviceCodeImageHeaderType) with direct use of multi-entry
OffloadBinary as the on-disk container. The outer file is now a single
OffloadBinary; abstract modules, IR modules and native device code
images are each represented by their own Entry, and abstract-module
grouping is encoded via the `am_index` StringData key on each Entry.

Per-Entry image bytes:
  global_metadata, am_metadata: serialized PropertySetRegistry
  ir, native:                    [u64 LE metadata_size]
                                 [serialized PropertySetRegistry]
                                 [raw IR / native bytes]

Format version is bumped 1 -> 2. The legacy v1 layout (SYBI-magic
inner header tables) is no longer produced, but SYCLBIN::read still
accepts it: the dispatcher peeks the first OffloadBinary entry's image
bytes for the SYBI magic and routes to the retained readV1 path. A
backward-compat unit test (checkLegacyV1ReaderBackwardCompat) builds a
minimal v1 buffer programmatically and exercises this path end-to-end.

Touch points:
  * llvm/include/llvm/Object/SYCLBIN.h: shape of SYCLBINDesc::ImageDesc
    becomes (IRType, ArchString, TargetTripleStr, FilePath); separate
    readV1/readV2 helpers; CurrentVersion = 2; LegacyMagicNumber
    retained for v1 detection.
  * llvm/lib/Object/SYCLBIN.cpp: rewritten write() emits a single
    multi-entry OffloadBinary; new readV2() walks entries by role and
    am_index; readV1() preserves prior logic verbatim; dispatcher in
    read() handles bare-SYBI, v1-in-OffloadBinary and v2 cases.
  * clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:
    packageSYCLBIN no longer wraps the SYCLBIN::write output in an
    extra OffloadBinary envelope -- write() now produces the complete
    on-disk file.
  * sycl/source/detail/syclbin.cpp: SYCLBINBinaries::Impl drops its
    manual OffloadBinary unwrap; SYCLBIN::read accepts the whole file
    directly.
  * sycl/tools/syclbin-dump/syclbin-dump.cpp: same simplification.
  * sycl/test/syclbin/input_files.cpp: relax error-message assertion
    after the dispatch path through OffloadBinary changed.
  * llvm/unittests/Object/SYCLBINTest.cpp: drop dependency on the
    removed getSYCLBINByteSize() helper; add v1 backward-compat test.
  * sycl/doc/design/SYCLBINDesign.md: rewrite the format section to
    describe the v2 OffloadBinary-based layout; document the v1
    backward-compat reader and add the v2 changelog entry.

Tests: ObjectTests SYCLBINTest 6/6 pass (5 v2 round-trip + 1 v1
backward-compat); check-sycl 2855/2855 pass; sycl/test/syclbin lit 2/2
pass; libsycl size delta vs pre-commit-1 baseline: ~+422 KB.
…llvm-object' into syclbin-unify-with-llvm-object
The SYCLBINBinaries adapter used to set the C-ABI
_sycl_device_binary_property{,set}_struct::Name fields to the raw
data() pointer of the corresponding key in the parsed
llvm::util::PropertySetRegistry. Those keys are stored as
llvm::SmallString<16>, whose data() is not guaranteed to be
null-terminated.

Downstream code (RTDeviceBinaryImage::PropertyRange::init,
DeviceBinaryProperty, ext_oneapi_has_kernel, ...) calls strcmp on
those names, which reads past the SmallString's last byte. As long
as the bytes immediately following the SmallString happened to
contain a NUL, strcmp returned the right answer; once the
allocator reused freshly-freed memory containing tail bytes from
another property name, strcmp tripped over them and reported the
property set as "not found", leaving SYCLBIN-derived
device_image_impl::MKernelNames empty and ext_oneapi_has_kernel
returning false.

Symptom: e2e tests under sycl/test-e2e/SYCLBIN/* assert when run
after a CommonLoadCheck-style sequence of failed loads, because
the second SYCLBIN parse re-uses the heap region of the first one.

Fix: own the property and property-set names in a std::deque<std::string>
on the adapter, and point the C-ABI Name fields at the
null-terminated c_str(). std::deque keeps element pointers stable
as more strings are appended for subsequent abstract modules.

Local repro on Intel PVC:
- before: 17/24 SYCLBIN/ e2e tests fail.
- after:  20/24 pass, 4 unsupported (no failures).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Round of fixes addressing review feedback on the v2 OffloadBinary-based
SYCLBIN format. No functional change for valid v2 / v1 files; the
hardening targets adversarial / future / tooling scenarios.

Reader (llvm/lib/Object/SYCLBIN.cpp):
  * Cap the parsed `am_index` to the number of OffloadBinary entries.
    Prevents `Result->AbstractModules.resize(am_index + 1)` from
    triggering an OOM-sized allocation on attacker-supplied files.
  * Reject negative / non-decimal `am_index` strings before
    StringRef::getAsInteger, so a leading '-' can no longer wrap to
    UINT64_MAX.
  * Defensive arithmetic in splitImagePayload: write the bound check as
    `sizeof(uint64_t) + MetadataSize > Image.size()` plus an overflow
    guard, instead of the previous `MetadataSize > Image.size() - 8`.
  * Validate the writer-stamped `sycl_format_version` StringData on the
    `global_metadata` entry. Mismatch fails read with a clear message.

Writer (llvm/lib/Object/SYCLBIN.cpp):
  * Stamp `sycl_format_version="2"` on every entry's StringData so
    future readers can refuse / dispatch on a wrong format version
    without having to introspect the per-image PropertySetRegistry blob.
  * Single-pass build of the OffloadingImage list (drops the previous
    nullptr-then-patch dance). Per-entry StringData strings live in a
    BumpPtrAllocator + StringSaver instead of a per-entry std::string
    holder.
  * `serializeImageMetadata` now takes primitive (IRType, ArchString,
    TargetTripleStr, Category) so the helper doesn't need access to the
    private SYCLBINDesc::ImageDesc type.

Header (llvm/include/llvm/Object/SYCLBIN.h):
  * Document the lifetime contract on SYCLBIN::read: the caller's
    MemoryBuffer must outlive the returned SYCLBIN because RawIRBytes /
    RawDeviceCodeImageBytes are StringRefs into it.
  * Document the *intentional* duplication of triple/arch/ir_type in
    OffloadBinary StringData and per-image PropertySetRegistry, with
    the explicit invariant that the reader uses the PropertySetRegistry
    copy as canonical. The StringData copy exists so that generic
    offload tooling (`llvm-objdump --offloading`) can show triple/arch
    columns for SYCLBIN entries without cracking open the SYCL property
    blob, and lets the two surfaces evolve independently.
  * Delete dead OwnedStorage member -- it was declared but never
    assigned.

Runtime adapter (sycl/source/detail/syclbin.cpp):
  * Zero-init each _sycl_device_binary_property_struct on emplace_back.
    The UINT32 path only writes the lower 4 bytes of the uint64_t
    ValSize field, so without value-init the upper 4 bytes were stack /
    freelist garbage. asUint32 reads only the lower 4 bytes so the
    practical impact was zero, but MSan / Valgrind would flag it and
    any future refactor reading the full 64 bits would be wrong.

Tests:
  * llvm/unittests/Object/SYCLBINTest.cpp: seven new negative tests
    exercising the trash-buffer, missing-global, unknown-role,
    out-of-range / negative am_index, truncated ir payload, and bogus
    sycl_format_version paths.
  * llvm/unittests/Object/SYCLBINTest.cpp: V1RealFixtureReads loads a
    real .syclbin produced by the toolchain at the commit immediately
    before the v2 rewrite (intel/sycl @ 2d41460). Belt-and-braces
    against my hand-crafted programmatic v1 buffer missing some quirk
    of real-toolchain output.
  * llvm/unittests/Object/Inputs/syclbin_v1_legacy.syclbin: that
    fixture, 1.7 KB.
  * sycl/test/syclbin/llvm_offload_binary_introspection.cpp: lit test
    that runs `llvm-objdump --offloading` over a v2 .syclbin and
    confirms the per-IR-entry triple is surfaced via OffloadBinary
    StringData. Locks in the duplication-justifies-itself behaviour
    so that future cleanups don't accidentally regress the
    generic-tooling story.

ObjectTests SYCLBINTest now: 14/14 pass (5 v2 round-trip + 1 v1 hand-built
+ 1 v1 real-fixture + 7 negative paths). Local sycl/test/syclbin lit:
3/3 pass (was 2/2 before the new lit test). Local PVC e2e SYCLBIN/*:
20/24 pass, 4 unsupported, 0 failed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant