Guidance/plans to add encoder-decoder model support, e.g. for T5? #10

@mustavikhan05

I'm trying to adapt the BitDistiller code for encoder-decoder models.

Are there any plans to add support for this? Could you provide some guidance on which parts need adaptation?

We're running a project to test the finding in Table 5, where Llama 7B performed better as the teacher than 13B. We're testing the hypothesis you put forward across OPT models, and are now expanding our experiment to encoder-decoder models. We're also running an experiment that introduces larger teachers sequentially, i.e. self-distillation followed by distillation from a bigger teacher into the self-distilled model.
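
To make the sequential setup concrete, here is a minimal sketch of the schedule, not actual BitDistiller code. The function name `sequential_distillation` and the `distill(student, teacher)` callable are hypothetical placeholders for whatever training step the repository exposes, and models are represented as plain strings purely for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def sequential_distillation(
    student: str,
    larger_teachers: Sequence[str],
    distill: Callable[[str, str], str],
) -> Tuple[str, List[Tuple[str, str]]]:
    """Self-distill first, then distill from each larger teacher in order.

    `distill` is an assumed stand-in for one full distillation run; it takes
    (student, teacher) and returns the updated student.
    """
    schedule: List[Tuple[str, str]] = []
    current = student
    # Stage 0 uses the student itself as teacher (self-distillation);
    # later stages use progressively larger teachers on the result.
    for teacher in [student, *larger_teachers]:
        schedule.append((teacher, current))
        current = distill(current, teacher)
    return current, schedule


if __name__ == "__main__":
    # Dummy distillation step that only records the lineage of the student.
    final, stages = sequential_distillation(
        "student", ["bigger-teacher"], lambda s, t: f"{s}<-{t}"
    )
    for teacher, student_in in stages:
        print(f"teacher={teacher} student={student_in}")
    print("final:", final)
```

Each stage receives the student produced by the previous one, so the larger teacher only ever sees the already self-distilled model.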
