Would you accept an optional license alias resolver in LicenseFactory.make_from_string(), or do you consider this out of scope for the lib (and would rather see it in the cyclonedx-python scanner)?
Context
When LicenseFactory.make_from_string() is called with a string that is neither a valid SPDX expression nor a recognized SPDX ID (e.g., "Modified BSD License", "new BSD", "3-Clause BSD"), it falls back to creating a DisjunctiveLicense(name=...) without an SPDX id.
In practice, this is very common when consuming Python package metadata (legacy License field, PEP 639 string license, ambiguous trove classifiers). Downstream consumers like cyclonedx-python end up emitting SBOMs with name-only licenses instead of canonical SPDX id-based licenses, which reduces SBOM quality for license compliance tooling.
Possible solution
Integrate LicenseLynx (BSD-3-Clause, Python >=3.9) as an optional alias resolver. LicenseLynx provides a deterministic mapping from 10,000+ license aliases (SPDX, OSI, ScanCode LicenseDB, community contributions) to canonical SPDX identifiers.
Sketch:
result = LicenseLynx.map("Modified BSD License")
# result.id == "BSD-3-Clause", result.src == "custom"
Inside make_from_string(), before the name-only fallback:
- Try SPDX expression parsing (current behavior)
- Try single SPDX ID match (current behavior)
- NEW: Try LicenseLynx alias resolution; if it yields a canonical SPDX ID, return a proper DisjunctiveLicense(id=...)
- Fall back to DisjunctiveLicense(name=...) (current behavior)
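To make the proposed order concrete, here is a minimal, self-contained sketch. It does not use the real cyclonedx-python-lib or LicenseLynx APIs: `License`, `KNOWN_SPDX_IDS`, `ALIASES`, `dict_resolver`, and the `alias_resolver` parameter are all hypothetical stand-ins chosen for illustration, and SPDX expression parsing is left out. The point is only that the resolver is opt-in and runs after the existing checks, so default behavior stays unchanged.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical stand-in for the library's SPDX ID list (the real one is far larger).
KNOWN_SPDX_IDS = {"MIT", "Apache-2.0", "BSD-3-Clause"}

# Hypothetical stand-in for an alias database such as the one LicenseLynx ships.
ALIASES = {
    "modified bsd license": "BSD-3-Clause",
    "new bsd": "BSD-3-Clause",
    "3-clause bsd": "BSD-3-Clause",
}


@dataclass(frozen=True)
class License:
    """Simplified stand-in for DisjunctiveLicense: either id or name is set."""
    id: Optional[str] = None
    name: Optional[str] = None


# An alias resolver maps a free-form string to an SPDX ID, or None if unknown.
AliasResolver = Callable[[str], Optional[str]]


def dict_resolver(value: str) -> Optional[str]:
    """Example resolver backed by the small ALIASES table above."""
    return ALIASES.get(value.strip().lower())


def make_from_string(value: str,
                     alias_resolver: Optional[AliasResolver] = None) -> License:
    # Step 1, SPDX expression parsing, is omitted in this sketch.
    # Step 2: exact SPDX ID match.
    if value in KNOWN_SPDX_IDS:
        return License(id=value)
    # Step 3 (new, opt-in): alias resolution, accepted only if it yields a known ID.
    if alias_resolver is not None:
        resolved = alias_resolver(value)
        if resolved in KNOWN_SPDX_IDS:
            return License(id=resolved)
    # Step 4: name-only fallback.
    return License(name=value)


print(make_from_string("Modified BSD License"))
print(make_from_string("Modified BSD License", alias_resolver=dict_resolver))
```

Passing the resolver as a callable would keep LicenseLynx an optional dependency: callers who want alias resolution could wrap `LicenseLynx.map` in such a callable, while everyone else keeps the current name-only fallback.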
I would also volunteer to contribute this feature. If you prefer this feature in the cyclonedx-python scanner, you can either close this issue and I will open it there, or transfer this issue and I will rewrite it to the feature template.