ml-efficiency

Here are 3 public repositories matching this topic...

mosaicml / composer

Supercharge Your Model Training

machine-learning deep-learning neural-network pytorch neural-networks ml-training ml-systems ml-efficiency

Updated May 16, 2025
Python

stsxxx / MoDM

MoDM is a cache-aware, hybrid serving system that accelerates image generation by dynamically combining small and large diffusion models for efficient, high-quality output.

diffusion-models serving-ml ml-efficiency

Updated May 5, 2025
Python

MyDarapy / SmolLM-experiments-with-grouped-query-attention

(Unofficial) Building Hugging Face SmolLM, a blazingly fast and small language model, with a PyTorch implementation of grouped query attention (GQA).

transformer attention smol huggingface ml-efficiency llm grouped-query-attention smol-lm huggingface-smol-lm

Updated Jan 11, 2025
Python
