Why Are Small AI Models Dumb? A Solution to 'Embedding Condensation'
An explanation of embedding condensation in small language models and of 'Dispersion Loss,' a new training method that improves their performance.