During inference, if I provide a sentence, the model returns top-k predictions based on the `beam_width` parameter. What does `num_translations_per_input` refer to, then? Is it the same parameter as the beam width? If not, what does `num_translations_per_input` control?