Instructions for using sshleifer/tiny-mbart with libraries, inference providers, notebooks, and local apps. Follow the links below to get started.
- Libraries
- Transformers
How to use sshleifer/tiny-mbart with Transformers:
```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("sshleifer/tiny-mbart")
model = AutoModelForSeq2SeqLM.from_pretrained("sshleifer/tiny-mbart")
```
- Notebooks
- Google Colab
- Kaggle
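The loading snippet above can be extended into a full generation pass. A minimal sketch, assuming a local PyTorch install; the input text and `max_new_tokens` value are illustrative, and since this is a tiny debugging checkpoint the decoded output is unlikely to be meaningful text:

```python
# Load the tiny checkpoint and run one generation pass end to end.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("sshleifer/tiny-mbart")
model = AutoModelForSeq2SeqLM.from_pretrained("sshleifer/tiny-mbart")

# Tokenize an example sentence and generate a short output sequence.
inputs = tokenizer("Hello, world!", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=10)

# Decode back to strings; expect a list with one entry per input.
decoded = tokenizer.batch_decode(output_ids, skip_special_tokens=True)
print(decoded)
```

Because the checkpoint is small, this round trip runs in seconds on CPU, which is exactly what makes it useful for debugging pipelines before switching to a full-size mBART model.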
How was this model trained?
#1 opened by BramVanroy
I'd love to play around with a smaller version of mBART locally for debugging, so this tiny mBART sounds promising! Can you give more details about how it was trained/distilled? Data used, hyperparameters, etc.?
Thanks!
At least judging from the demo, it doesn't seem promising.
I think I read somewhere that the model was just randomly initialized and not trained at all, but I do not remember whether I read that in a dream or in real life.