Has anyone been successful with quantized vision transformers? #674
Unanswered
alexander-soare asked this question in General
Replies: 1 comment 1 reply
-
@alexander-soare I have not done this, but I'd pay particular attention to what's happening with the GELU activations during quantization: how do they get approximated? Also the LayerNorm mean/std (possible overflow?) and the precision of the accumulator. Despite the annoyances of BatchNorm, it is great for inference/quantization compared to GN, LN, etc., which must always compute activation stats in the forward pass.
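A minimal sketch of that idea, assuming PyTorch's FX graph-mode quantization API (torch.ao.quantization, PyTorch 1.13+), timm, and the "fbgemm" CPU backend: the `set_object_type(..., None)` calls keep GELU, LayerNorm, and the attention Softmax in fp32 while the Linear layers are quantized. The model name follows this thread (newer timm releases expose it as `deit_base_distilled_patch16_384`), and tracing may still need the minor tweaks mentioned below.

```python
# Minimal sketch, not the exact setup from this thread. Assumes PyTorch >= 1.13
# (torch.ao.quantization FX graph-mode API), timm, and the "fbgemm" CPU backend.
import torch
import timm
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

# Model name as used in this thread; newer timm releases expose it as
# deit_base_distilled_patch16_384.
model = timm.create_model("vit_deit_base_distilled_patch16_384", pretrained=True)
model.eval()

example_inputs = (torch.randn(1, 3, 384, 384),)

# Start from the default static-quant config, then opt precision-sensitive
# modules out (qconfig None = leave in fp32).
qconfig_mapping = (
    get_default_qconfig_mapping("fbgemm")
    .set_object_type(torch.nn.GELU, None)       # avoid int8 GELU approximation
    .set_object_type(torch.nn.LayerNorm, None)  # avoid overflow in mean/var stats
    .set_object_type(torch.nn.Softmax, None)    # keep attention softmax in fp32
)

prepared = prepare_fx(model, qconfig_mapping, example_inputs)

# Calibrate the observers; use a few hundred real images rather than noise.
with torch.no_grad():
    for _ in range(10):
        prepared(torch.randn(8, 3, 384, 384))

quantized = convert_fx(prepared)
```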
1 reply
-
Today I managed to run FX quantization on vit_deit_base_distilled_patch16_384 with a few minor tweaks. I get a 2.5x speed-up on CPU. Accuracy plummets though :( Wondering if anyone has had experience with doing Quantization Aware Training on a vision transformer. Were you successful?
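On the QAT question, a minimal sketch of how such a run is commonly wired up with the same PyTorch FX API (prepare_qat_fx / convert_fx); this is not the thread author's actual setup, and `train_loader` is a placeholder. Some timm versions return a (class, distillation) logits tuple from distilled DeiT models in train mode, which the loop below averages.

```python
# Minimal QAT sketch, not the thread author's setup. Assumes PyTorch >= 1.13
# (torch.ao.quantization FX API), timm, and a hypothetical `train_loader`
# yielding (images, targets) batches.
import torch
import timm
from torch.ao.quantization import get_default_qat_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_qat_fx, convert_fx

model = timm.create_model("vit_deit_base_distilled_patch16_384", pretrained=True)
model.train()

example_inputs = (torch.randn(1, 3, 384, 384),)
qconfig_mapping = get_default_qat_qconfig_mapping("fbgemm")

# Insert fake-quant observers so the fine-tune sees quantization error.
prepared = prepare_qat_fx(model, qconfig_mapping, example_inputs)

optimizer = torch.optim.AdamW(prepared.parameters(), lr=1e-5)
criterion = torch.nn.CrossEntropyLoss()

# A short fine-tune at a low learning rate is typical for QAT.
for images, targets in train_loader:  # train_loader is a placeholder DataLoader
    optimizer.zero_grad()
    outputs = prepared(images)
    if isinstance(outputs, tuple):
        # Distilled DeiT variants may return (cls_logits, dist_logits) in train mode.
        outputs = sum(outputs) / len(outputs)
    loss = criterion(outputs, targets)
    loss.backward()
    optimizer.step()

prepared.eval()
quantized = convert_fx(prepared)  # int8 model for CPU inference
```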