We introduce DreamLite, a compact, unified on-device diffusion model (0.39B parameters) that supports both text-to-image generation and text-guided image editing within a single network architecture.
Built on a pruned mobile U-Net backbone, DreamLite unifies multimodal conditioning through In-Context Spatial Concatenation directly in the latent space. Through progressive step distillation, DreamLite achieves ultra-fast 4-step inference, generating or editing a 1024×1024 image in ~3 seconds on an iPhone 17 Pro (with 4-bit Qwen-VL and fp16 VAE+UNet), fully on-device with zero cloud dependency.
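As a rough PyTorch sketch of the In-Context Spatial Concatenation idea (our illustrative reading, not the released implementation: the concatenation axis, shapes, and the `build_unified_input` helper are all assumptions):

```python
import torch

def build_unified_input(noisy_latent: torch.Tensor,
                        cond_latent: torch.Tensor | None) -> torch.Tensor:
    """Place the condition latent next to the noisy target latent.

    A single U-Net then sees both images in one latent grid, so the same
    network handles generation (no condition) and editing (with condition).
    """
    if cond_latent is None:
        # Text-to-image: no source image, feed the noisy latent alone.
        return noisy_latent
    # Text-guided editing: [B, C, H, W] x 2 -> [B, C, H, 2W] along width.
    return torch.cat([cond_latent, noisy_latent], dim=-1)

noisy = torch.randn(1, 4, 128, 128)  # a 1024x1024 image maps to a 128x128 latent
cond  = torch.randn(1, 4, 128, 128)  # VAE-encoded source image for editing
print(build_unified_input(noisy, cond).shape)  # torch.Size([1, 4, 128, 256])
```

Because the condition lives in the same latent grid as the target, no separate cross-attention branch or adapter is needed to support editing alongside generation.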
- [2026.04] We officially released the inference code.
- [2026.03] DreamLite is publicly announced! Check out our project page and arXiv paper.
Experience real-time generation and editing on an iPhone 17 Pro. No internet connection or cloud processing required.
| Human Portrait & Style Transfer | Nature Landscape & Background Swap | Product & Object Replacement |
|---|---|---|
| *(demo video)* | *(demo video)* | *(demo video)* |
Note: If demos fail to render natively on GitHub, please visit our Project Page to watch the full demonstrations.
```bash
# Clone the repository
git clone https://github.com/ByteVisionLab/DreamLite.git
cd DreamLite

# Create and activate a conda environment
conda create -n dreamlite python=3.10 -y
conda activate dreamlite

# Install dependencies
pip install -r requirements.txt
```

Ensure the model weights (DreamLite-base and DreamLite-mobile) are placed in the following directory structure:
```
DreamLite/
├── models/
│   ├── DreamLite-base/
│   └── DreamLite-mobile/
```
You can generate or edit images using the provided command-line interface.
```bash
# ==========================================
# DreamLite-base: 28 Steps (High Fidelity)
# ==========================================

# Text-to-Image Generation
python infer.py --prompt "A close-up of a fire spitting dragon cinematic shot."

# Text-guided Image Editing
python infer.py --prompt "Transfer this image to oil-painting style." --image_path ./inputs/source.png

# ==========================================
# DreamLite-mobile: 4 Steps (Ultra Fast)
# ==========================================

# Text-to-Image Generation
python infer_mobile.py --prompt "A portrait of a young woman with flowers."

# Text-guided Image Editing
python infer_mobile.py --prompt "Change the background to a dense forest." --image_path ./inputs/source.png
```

We provide benchmark evaluation scripts (GenEval & ImgEdit) for comparing DreamLite against other state-of-the-art models. Configure your local dataset paths in tools/benchmark/infer_geneval.py and tools/benchmark/infer_imgedit.py before running them.
```bash
# Run the benchmark evaluations
python tools/benchmark/infer_geneval.py --save_dir ./output/benchmark/geneval_output --geneval_json "YOUR_GENEVAL/evaluation_metadata.jsonl"
python tools/benchmark/infer_imgedit.py --save_dir ./output/benchmark/imgedit_output --json_path "YOUR_IMGEDIT_PATH/ImgEdit/Benchmark/Basic/basic_edit.json" --img_root "YOUR_IMGEDIT_IMAGES_PATH/ImgEdit/Benchmark/singleturn"
```

We provide a user-friendly web interface powered by Gradio. You can try our live demo on Hugging Face Spaces, or deploy it locally on your own machine (GPU/CPU).
To run the interactive demo locally:

```bash
# Launch the local web server
python tools/app.py
```
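For reference, a minimal Gradio app of this shape could look like the sketch below. This is a hypothetical stand-in, not the contents of tools/app.py, and the `generate` function is a placeholder rather than the actual DreamLite pipeline call:

```python
import gradio as gr

def generate(prompt, source_image):
    # Placeholder: the real app would run the DreamLite pipeline here,
    # generating from the prompt or editing the optional source image.
    return source_image

demo = gr.Interface(
    fn=generate,
    inputs=[gr.Textbox(label="Prompt"),
            gr.Image(label="Source image (optional, for editing)")],
    outputs=gr.Image(label="Result"),
    title="DreamLite",
)
demo.launch()  # serves on http://127.0.0.1:7860 by default
```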
We offer two variants of the DreamLite model, balancing visual fidelity against on-device inference latency.

Note

Model Access: Model weights are currently undergoing safety review. To request early access, please contact us at klfeng1206@outlook.com with an email titled "DreamLite Access Request".
In your email, please include:
- Your Name & Affiliation (e.g., university, company, or personal portfolio).
- Intended Use Case (a brief description of how you plan to use the DreamLite model).
| Model Variant | Params | Resolution | Steps | Guidance |
|---|---|---|---|---|
| DreamLite (Base) | 0.39B | 1024×1024 | 28 | CFG & IMG_CFG |
| DreamLite (Mobile) | 0.39B | 1024×1024 | 4 | No CFG |
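The Base variant's CFG & IMG_CFG presumably corresponds to the two-scale classifier-free guidance popularized by instruction-based editors such as InstructPix2Pix; the sketch below shows that standard combination (our assumption, with illustrative scales, not DreamLite's confirmed formulation). The Mobile variant is distilled to run guidance-free, so each of its 4 steps costs only a single U-Net forward pass.

```python
import torch

def dual_cfg(eps_uncond: torch.Tensor,
             eps_img: torch.Tensor,
             eps_full: torch.Tensor,
             cfg_scale: float = 7.5,
             img_cfg_scale: float = 1.5) -> torch.Tensor:
    """Two-scale classifier-free guidance for text-guided editing.

    eps_uncond: prediction with neither text nor image condition
    eps_img:    prediction with the image condition only
    eps_full:   prediction with both text and image conditions
    Scale defaults are illustrative, not DreamLite's tuned values.
    """
    return (eps_uncond
            + img_cfg_scale * (eps_img - eps_uncond)
            + cfg_scale * (eps_full - eps_img))
```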
Quantitative comparison with state-of-the-art methods on generation and editing benchmarks.
| Method | Params | GenEval ↑ | DPG ↑ | ImgEdit ↑ | GEdit-EN-Q ↑ |
|---|---|---|---|---|---|
| FLUX.1-Dev / Kontext | 12B | 0.67 | 84.0 | 3.76 | 6.79 |
| BAGEL | 7B | 0.82 | 85.1 | 3.42 | 7.20 |
| OmniGen2 | 4B | 0.80 | 83.6 | 3.44 | 6.79 |
| LongCat-Image / Edit | 6B | 0.87 | 86.6 | 4.49 | 7.55 |
| DeepGen1.0 | 2B | 0.83 | 84.6 | 4.03 | 7.54 |
| SANA-1.6B | 1.6B | 0.67 | 84.8 | - | - |
| SANA-0.6B | 0.6B | 0.64 | 83.6 | - | - |
| SnapGen++ (small) | 0.4B | 0.66 | 85.2 | - | - |
| VIBE | 1.6B | - | - | 3.85 | 7.28 |
| EditMGT | 0.96B | - | - | 2.89 | 6.33 |
| DreamLite (Ours) | 0.39B | 0.72 | 85.8 | 4.11 | 6.88 |
We provide comprehensive support for LoRA fine-tuning and inference, enabling lightweight customization of DreamLite on your own domain-specific datasets.
For detailed instructions, training scripts, and examples, please refer to our dedicated LoRA Fine-Tuning Guide.
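As a rough illustration of what LoRA fine-tuning adds to a frozen backbone, here is a minimal sketch in plain PyTorch. `LoRALinear` is a hypothetical helper for exposition; the actual guide may rely on a library such as peft instead:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wrap a frozen linear layer with a trainable low-rank update."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)             # freeze pretrained weights
        self.down = nn.Linear(base.in_features, rank, bias=False)
        self.up = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.up.weight)          # update starts as a no-op
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen path plus scaled low-rank residual; only down/up are trained.
        return self.base(x) + self.scale * self.up(self.down(x))

layer = LoRALinear(nn.Linear(320, 320))
out = layer(torch.randn(2, 320))  # same output shape as the base layer
```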
- Release paper on arXiv
- Release inference code
- Release LoRA training
- Release model weights on HuggingFace
- Release online demo
- On-device Deployment Reference
We thank the authors of SDXL, SnapGen, Qwen, and TAESDXL for their great work. This work was conducted under the supervision of Prof. Wangmeng Zuo.
Code: Apache-2.0
Model weights: CC BY-NC 4.0 (see WEIGHTS_LICENSE)
If our work assists your research, feel free to give us a star or cite us using:
```bibtex
@article{feng2026dreamlite,
  title={DreamLite: A Lightweight On-Device Unified Model for Image Generation and Editing},
  author={Kailai Feng and Yuxiang Wei and Bo Chen and Yang Pan and Hu Ye and Songwei Liu and Chenqian Yan and Yuan Gao},
  journal={arXiv preprint arXiv:2603.28713},
  year={2026}
}
```





