fix(tools): collapse intra-file duplicate typedefs in base header#2149
fix(tools): collapse intra-file duplicate typedefs in base header#2149lloeki wants to merge 1 commit into
Conversation
📚 Documentation Check Results📦
|
Clippy Allow Annotation ReportComparing clippy allow annotations between branches:
Summary by Rule
Annotation Counts by File
Annotation Stats by Crate
About This ReportThis report tracks Clippy allow annotations for specific rules, showing how they've changed in this PR. Decreasing the number of these annotations generally improves code quality. |
|
🔒 Cargo Deny Results📦
|
dedup_headers only removed definitions from child headers that were
byte-identical to ones already present in the base header. It never
deduplicated definitions within the base header itself, so cbindgen
output that emits the same profiling type from two crate boundaries
left duplicate typedefs in the merged common.h.
Two cases survived and broke consumers compiling with
-Werror -Wtypedef-redefinition (C11):
1. A bare forward declaration "typedef struct X X;" coexisting with the
full-body definition "typedef struct X { ... } X;" (e.g.
ddog_prof_EncodedProfile, ddog_prof_StringId, OpaqueStringId).
2. An identical pointer typedef emitted twice whose doc comments differ,
so the existing exact-string dedup kept both (ddog_prof_StringId2,
ddog_prof_MappingId2, ddog_prof_FunctionId2).
Add a final pass over the assembled base header that drops a forward
struct/union/enum declaration when a full-body definition of the same
name exists, and removes later duplicates of an identical typedef
statement regardless of differing comments. This makes the generated
common.h clean by construction and removes the need for downstream
post-processing workarounds.
ab502c6 to
77f60a7
Compare
| match s.find("*/") { | ||
| Some(end) => s[end + 2..].trim_start(), | ||
| None => s, |
There was a problem hiding this comment.
| match s.find("*/") { | |
| Some(end) => s[end + 2..].trim_start(), | |
| None => s, | |
| s.find("*/").map(|end| s[end + 2..].trim_start()).unwrap_or(s); |
| } | ||
|
|
||
| fn is_ident(s: &str) -> bool { | ||
| !s.is_empty() && s.bytes().all(|b| b.is_ascii_alphanumeric() || b == b'_') |
There was a problem hiding this comment.
For a token to be a valid C identifier, the first character must NOT be a digit, which isn't included in this test. I don't remember in C but in Rust for example this could match 0usize which isn't an identifier.
| .filter_map(|d| bodied_typedef_name(statement_text(d.str))) | ||
| .collect(); | ||
|
|
||
| let mut seen: HashSet<&str> = HashSet::new(); |
There was a problem hiding this comment.
It's not entirely clear if what seen includes going forward. I would suggest to give it a more explicit name (I suppose it's the typedefs that are not bodied?)
Artifact Size Benchmark Reportaarch64-alpine-linux-musl
aarch64-unknown-linux-gnu
libdatadog-x64-windows
libdatadog-x86-windows
x86_64-alpine-linux-musl
x86_64-unknown-linux-gnu
|
What
Enhance the
dedup_headersdev tool so the generated/bundledinclude/datadog/common.hno longer contains duplicate typedefs.Why
dedup_headersonly removed definitions from child headers that werebyte-identical to ones already present in the base header (
common.h).It never deduplicated definitions within the base header. When cbindgen
emits the same profiling type from two crate boundaries (e.g. via
after_includesforward declarations inlibdd-profiling-ffi/cbindgen.tomlplus the regular body definition), the merged
common.hends up withduplicate typedefs.
These are fatal for consumers compiling with
-Werror -Wtypedef-redefinition(C11). Two distinct classes were observed in the v36.0.0 artifacts:
typedef struct X X;forwarddeclaration coexisting with the full
typedef struct X { ... } X;:ddog_prof_EncodedProfile,ddog_prof_StringId,OpaqueStringId.one carries a doc comment (so the existing exact-string dedup keeps both):
ddog_prof_StringId2,ddog_prof_MappingId2,ddog_prof_FunctionId2.How
Add a final pass (
dedup_base_typedefs) over the assembled base header that:typedef struct/union/enum X X;when a full-bodydefinition of the same name
Xexists elsewhere in the file (keeping thebody, regardless of ordering), and
statement text with any leading doc comment stripped.
Opaque forward declarations (no body) and genuine aliases
(
typedef struct A B;withA != B) are preserved.This makes
common.hclean by construction and obsoletes downstreampost-processing workarounds (e.g. the one in
libdatadog-rb).Validation
Headers were generated via the FFI crates' cbindgen build scripts and run
through
dedup_headersexactly asbuilderinvokes it.Before — each of the six types appears twice; clang fails:
After — each type appears exactly once; clang passes (exit 0):
cargo test -p tools --lib— 20 passed (5 new tests for the dedup pass)cargo clippy -p tools --all-targets --all-features -- -D warnings— cleancargo fmt -p tools -- --check— cleanNote
One of three coordinated changes for the libdatadog v36 duplicate-typedef header issue (increasing order of permanence):
-Wno-error=typedef-redefinitionstopgap.36.0.0.1.1, without waiting for a libdatadog release.dedup_headers: makescommon.hclean by construction and obsoletes the libdatadog-rb post-processing once released.