
Possible improvement to dynamic dispatch (breaking) #1295

@AntoinePrv

Right now this is the code the dispatcher runs on every call:

template <class... Tys>
XSIMD_INLINE auto operator()(Tys&&... args) noexcept
{
    return walk_archs(ArchList {}, std::forward<Tys>(args)...);
}

There is some recursive template machinery, coupled with runtime checks on supported_arch::has().
The compiler potentially has to work through quite a few steps to prove that it can replace all of that with a direct jump (to be investigated further).

In Apache Arrow, we use an internal dispatch mechanism where we evaluate the available architectures once and store the function pointer (during the dispatcher's static initialization).
Here, however, we do not have a single type for all candidate functions, because the first parameter, the arch, is a distinct struct for each of them.

if (availables_archs.has(Arch {}))
    return functor(Arch {}, std::forward<Tys>(args)...);
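
To make the per-call cost concrete, here is a minimal, self-contained sketch of the walking pattern (C++14). It is not the xsimd implementation: `arch_a`, `arch_b`, `supported_set`, `walk`, `tagged_add` and the `has_checks` counter are all hypothetical stand-ins, and the counter exists only to show that the runtime check is repeated on every call.

```cpp
#include <utility>

// Hypothetical stand-ins for architecture tags; not the real xsimd types.
struct arch_a { static constexpr unsigned bit = 1u << 0; };
struct arch_b { static constexpr unsigned bit = 1u << 1; };

template <class... Archs>
struct arch_list {};

static int has_checks = 0; // counts runtime checks, for illustration only

// Hypothetical runtime feature set; real code would query the CPU.
struct supported_set
{
    unsigned mask;

    template <class Arch>
    bool has(Arch) const noexcept
    {
        ++has_checks;
        return (mask & Arch::bit) != 0;
    }
};

// Last candidate: used as the fallback, without a runtime check.
template <class F, class Arch, class... Tys>
auto walk(supported_set const&, F& f, arch_list<Arch>, Tys&&... args)
{
    return f(Arch {}, std::forward<Tys>(args)...);
}

// Two or more candidates: test the first one, otherwise recurse on the rest.
template <class F, class Arch, class Next, class... Rest, class... Tys>
auto walk(supported_set const& s, F& f, arch_list<Arch, Next, Rest...>, Tys&&... args)
{
    if (s.has(Arch {}))
        return f(Arch {}, std::forward<Tys>(args)...);
    return walk(s, f, arch_list<Next, Rest...> {}, std::forward<Tys>(args)...);
}

// Functor taking the arch as its first parameter, as in the current API style.
struct tagged_add
{
    int operator()(arch_a, int x) const { return x + 1; }
    int operator()(arch_b, int x) const { return x + 2; }
};
```

With `arch_list<arch_b, arch_a>` and only `arch_a` reported as supported, each call to `walk` performs one `has` check before reaching the fallback, so N calls perform N checks unless the optimizer manages to hoist them.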

Example from the doc:

struct sum
{
    template <class Arch, class T>
    T operator()(Arch, T const* data, unsigned size);
};

Now the proposed breaking change would be to take a free function templated on the arch, without taking the arch as a parameter.
I am not immediately sure how we'd handle T but I think we can manage.

template <class Arch>
T sum1(float const* data, unsigned size);

Or perhaps with a static method of a functor.

template <class Arch>
struct sum2
{
    template <class T>
    static T call(T const* data, unsigned size);
};

In both cases there is an underlying function pointer that can be directly stored (sum1<sse4_2>/sum1<avx2> or sum2<sse4_2>::call<float>/sum2<avx2>::call<float>).
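
As a rough illustration of what storing that pointer could look like, here is a self-contained sketch in the `sum2` style. Everything except the shape of `sum2` is an assumption made for the example: `arch_a`/`arch_b` stand in for real arch tags, `detect_supported_mask` stands in for runtime CPU detection, the kernel body is a plain scalar loop, and `resolve_calls` only exists to show that resolution happens once.

```cpp
// Hypothetical stand-ins for architecture tags; not the real xsimd types.
struct arch_a { static constexpr unsigned bit = 1u << 0; };
struct arch_b { static constexpr unsigned bit = 1u << 1; };

// Hypothetical: stands in for runtime CPU detection. Reports arch_a only.
inline unsigned detect_supported_mask() noexcept { return arch_a::bit; }

// Kernel in the style of sum2: the arch is a template parameter of the
// struct, not a function argument, so every arch shares one signature.
template <class Arch>
struct sum2
{
    template <class T>
    static T call(T const* data, unsigned size)
    {
        T acc {};
        for (unsigned i = 0; i < size; ++i)
            acc += data[i]; // scalar placeholder for an arch-specific body
        return acc;
    }
};

// Single pointer type shared by all candidates.
using sum_fn = float (*)(float const*, unsigned);

static int resolve_calls = 0; // counts resolutions, for illustration only

// Pick the best supported candidate, best arch first.
inline sum_fn resolve_sum() noexcept
{
    ++resolve_calls;
    unsigned const mask = detect_supported_mask();
    if (mask & arch_b::bit)
        return &sum2<arch_b>::call<float>;
    return &sum2<arch_a>::call<float>; // fallback
}

// Entry point: resolves on first use, then calls through the stored pointer.
inline float dispatched_sum(float const* data, unsigned size)
{
    static sum_fn const impl = resolve_sum();
    return impl(data, size);
}
```

A function-local static still pays for an initialization guard on each call; storing the pointer in a namespace-scope object initialized at startup would avoid that, at the cost of the usual static initialization order concerns.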

@serge-sans-paille do you think it is worth investigating? I can try to find a bit more time to look at the current generated assembly.
CC @JohanMabille
