DoRA vs LoHA in Stable Diffusion: New Benchmarks Reveal Training Speed Trade-offs
A detailed community experiment compares DoRA and LoHA fine-tuning methods for Zimage-Turbo, finding that DoRA trains markedly faster while LoHA offers greater stability at a higher computational cost. The findings, shared by a Stable Diffusion enthusiast, offer actionable insights for AI artists and researchers optimizing model training.

In a detailed community-driven experiment shared on Reddit's r/StableDiffusion, user u/TableFew3521 has published benchmarks comparing two popular low-rank adaptation techniques, DoRA (Weight-Decomposed Low-Rank Adaptation) and LoHA (Low-Rank Hadamard Product adaptation), for fine-tuning the Zimage-Turbo model. The results, based on 100 epochs of training in OneTrainer, suggest that DoRA delivers significantly faster training, while LoHA, particularly when augmented with regularization images and EMA, offers greater stability at the cost of computational time.
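For readers who want to try the two adapter types outside OneTrainer, the Hugging Face PEFT library exposes both. The snippet below is a minimal sketch, not the author's actual OneTrainer configuration; the target module names are assumptions for a typical diffusion attention block, while the rank and alpha values match those reported in the post.

```python
# Minimal sketch using Hugging Face PEFT (assumed setup, not the author's
# OneTrainer config): declare a DoRA adapter and a LoHA adapter with the
# rank/alpha values reported in the experiment (rank 32, alpha 12).
from peft import LoraConfig, LoHaConfig

# DoRA is enabled by flipping use_dora on a standard LoRA config.
dora_config = LoraConfig(
    r=32,                 # rank dimension used in the benchmark
    lora_alpha=12,        # alpha value used in the benchmark
    use_dora=True,        # decompose weights into magnitude + direction
    target_modules=["to_q", "to_k", "to_v", "to_out.0"],  # attention projections (assumed names)
)

# LoHA builds its update from a Hadamard product of two low-rank factors.
loha_config = LoHaConfig(
    r=32,
    alpha=12,
    target_modules=["to_q", "to_k", "to_v", "to_out.0"],
)

# Either config can then be attached to a model with
# peft.get_peft_model(model, config) before training.
```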
The test was conducted on an RTX 4060 Ti with 16GB of VRAM and 64GB of system RAM, using a modest dataset of 26 unique images plus 17 regularization images. Training used the CAME optimizer with a REX learning-rate schedule, combined with masked training to enhance concept retention. Both methods used a rank dimension of 32 and an alpha value of 12, ensuring comparability. The most striking finding: DoRA, trained on attention and MLP layers, completed 100 epochs in just 63 minutes, while LoHA, trained on block layers, required 82 minutes, roughly 30% longer. When LoHA was enhanced with regularization and EMA (Exponential Moving Average) and shifted to attention and MLP layers, training time ballooned to 137 minutes.
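The relative slowdowns are easy to verify from the reported wall-clock times; the short calculation below (plain Python, with the numbers taken directly from the post) shows where the roughly 30% figure comes from.

```python
# Quick check of the reported wall-clock times (minutes for 100 epochs).
times = {
    "DoRA (attn + MLP)": 63,
    "LoHA (block layers)": 82,
    "LoHA + reg + EMA (attn + MLP)": 137,
}

baseline = times["DoRA (attn + MLP)"]
for name, minutes in times.items():
    slowdown = (minutes / baseline - 1) * 100
    print(f"{name}: {minutes} min ({slowdown:+.0f}% vs DoRA)")

# Output:
# DoRA (attn + MLP): 63 min (+0% vs DoRA)
# LoHA (block layers): 82 min (+30% vs DoRA)
# LoHA + reg + EMA (attn + MLP): 137 min (+117% vs DoRA)
```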
Training hyperparameters were calibrated to avoid distortion. DoRA used a learning rate of 0.00006 with a batch size of 11, while LoHA ran at a much lower learning rate of 0.0000075 with the same batch size; the author noted that raising LoHA's learning rate degraded outputs, suggesting greater sensitivity to that setting. Notably, the user applied a strength multiplier of 1.0 for DoRA and 2.0 for LoHA to compensate for the perceived difference in output intensity, a practical adjustment that underscores the methodological nuance required when comparing the two techniques.
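To illustrate what a strength multiplier does, the sketch below shows the common pattern of scaling an adapter's update before adding it to the base weight. It is a simplified, generic illustration, not the author's pipeline or any specific library call; in particular, a real DoRA merge also renormalizes the magnitude and direction components.

```python
import torch

def merge_adapter(base_weight: torch.Tensor,
                  delta_weight: torch.Tensor,
                  strength: float = 1.0) -> torch.Tensor:
    """Merge an adapter update into a base weight matrix.

    delta_weight is the full update the adapter produces (e.g. B @ A for
    LoRA-style methods, or a Hadamard product of factors for LoHA);
    strength is the multiplier the author set to 1.0 for DoRA and 2.0 for LoHA.
    Simplified: a real DoRA merge also rescales magnitude vs. direction.
    """
    return base_weight + strength * delta_weight

# Hypothetical usage with random tensors standing in for real layer weights:
base = torch.randn(768, 768)
delta = 0.01 * torch.randn(768, 768)
merged_loha = merge_adapter(base, delta, strength=2.0)  # LoHA at 2.0
merged_dora = merge_adapter(base, delta, strength=1.0)  # DoRA at 1.0
```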
The author openly admitted to using an aggressive training strategy prioritizing speed over quality, acknowledging that artifacts and instability may arise. This pragmatic approach resonates with many AI content creators who need rapid iteration for prototyping or commercial workflows, even if final outputs require post-processing. The choice between DoRA and LoHA may thus hinge less on theoretical superiority and more on workflow demands: DoRA for speed and iterative testing, LoHA for fidelity when time permits.
While the experiment focused on Zimage-Turbo—a model known for its challenging training dynamics—the findings offer broader implications for the Stable Diffusion fine-tuning ecosystem. As low-rank adaptation methods continue to evolve, this real-world benchmark fills a critical gap between academic papers and practical deployment. Unlike theoretical comparisons, this data reflects actual hardware constraints, dataset sizes, and user-driven objectives.
Notably, the user emphasized that character-based training on models like Zimage-Turbo remains more difficult than abstract concept training, suggesting that these results may not generalize to all use cases. Future research could explore hybrid approaches or adaptive learning rate scheduling to mitigate LoHA's slower training without sacrificing stability.
For developers and artists alike, this experiment provides a rare, transparent look into the trade-offs of modern fine-tuning techniques. As AI model personalization becomes increasingly mainstream, such community-driven benchmarks are vital for guiding best practices beyond vendor marketing claims.
Source: Reddit user u/TableFew3521, r/StableDiffusion, post #1r6zg4c


