invalid purls should not be parsed #229

@armijnhemel

The current packageurl package allows parsing of purls that are not valid according to the specification. Let's take this purl:

pkg:sid/@busybox.org/busybox@1.35.0

This is parsed as:

>>> import packageurl
>>> purl = 'pkg:sid/@busybox.org/busybox@1.35.0'
>>> packageurl.PackageURL.from_string(purl)
PackageURL(type='sid', namespace='@busybox.org', name='busybox', version='1.35.0', qualifiers={}, subpath=None)

which is invalid. According to section 5.2 of the ECMA standard, the @ is a so-called "separator character":

the Separator Characters :/@?=&# (colon ':', slash '/', at sign '@', question mark '?', equal sign '=', ampersand '&' and hash sign '#')

Section 5.3 then specifies:

'@' (at sign) is the separator between name and version

and it cannot appear anywhere else. This means that @ can never appear before the name part of a purl, so reporting @busybox.org as the namespace is wrong.
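
To make the expected behaviour concrete, here is a minimal sketch of the kind of pre-check I have in mind. It is not part of packageurl; the helper name `has_stray_at_sign` is hypothetical, and it assumes the strict reading above: a literal @ may occur at most once in the type/namespace/name@version portion, and only directly after a non-empty name. Qualifiers and subpath are ignored here for simplicity.

```python
def has_stray_at_sign(purl: str) -> bool:
    """Hypothetical helper: return True if a literal '@' appears anywhere
    other than as the single separator between name and version."""
    # Drop the subpath ('#...') and the qualifiers ('?...'); this sketch
    # only inspects the type/namespace/name@version portion.
    remainder = purl.split("#", 1)[0].split("?", 1)[0]

    # Drop the 'pkg:' scheme and any slashes that follow it.
    if remainder.startswith("pkg:"):
        remainder = remainder[len("pkg:"):]
    remainder = remainder.lstrip("/")

    # Strict reading: more than one literal '@' is always an error.
    if remainder.count("@") > 1:
        return True

    if "@" in remainder:
        before, _version = remainder.split("@", 1)
        segments = before.split("/")
        # The text before '@' must be type/.../name with a non-empty name.
        return len(segments) < 2 or segments[-1] == ""

    return False


print(has_stray_at_sign("pkg:sid/@busybox.org/busybox@1.35.0"))  # True
print(has_stray_at_sign("pkg:sid/busybox.org/busybox@1.35.0"))   # False
```

A parser could run a check like this before splitting the string and raise an error instead of returning a PackageURL. A percent-encoded at sign (%40) is not affected, since only the literal character is counted.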
