Accelerating SDXL by 3x with DeepCache and OneDiff

Make SDXL run 3.5x faster on RTX 3090 and 3x faster on A100.

Dec 20, 2023

DeepCache, launched last week, is described as a novel, training-free, and almost lossless paradigm that accelerates diffusion models from the perspective of the model architecture.
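As rough intuition for the caching idea, here is a toy sketch in plain Python. It is not DeepCache's or OneDiff's actual code: it only assumes that the expensive deep part of the network is recomputed every few denoising steps and reused in between, while the cheap shallow part runs on every step. The names run_denoising, shallow_fn, deep_fn, and cache_interval are hypothetical and chosen for illustration.

```python
def run_denoising(num_steps, cache_interval, shallow_fn, deep_fn):
    """Toy denoising loop that reuses cached deep features between refreshes.

    shallow_fn and deep_fn are hypothetical stand-ins for the cheap and
    expensive parts of a U-Net; they are not real DeepCache/OneDiff APIs.
    """
    cached_deep = None
    deep_calls = 0
    x = 0.0
    for step in range(num_steps):
        # The cheap shallow part runs on every step.
        low = shallow_fn(x, step)
        # The expensive deep part is refreshed only every cache_interval steps.
        if cached_deep is None or step % cache_interval == 0:
            cached_deep = deep_fn(low, step)
            deep_calls += 1
        # Combine fresh shallow features with (possibly cached) deep features.
        x = low + cached_deep
    return x, deep_calls


# Toy stand-in functions, for illustration only.
_, calls_cached = run_denoising(30, 3, lambda x, s: 0.5 * x + 1.0, lambda low, s: 0.1 * low)
_, calls_full = run_denoising(30, 1, lambda x, s: 0.5 * x + 1.0, lambda low, s: 0.1 * low)
print(calls_cached, calls_full)
```

In this toy setup the deep part runs on only a third of the steps when cache_interval is 3. The real speedup depends on how the model's cost splits between the shallow and deep parts, and on the compilation that OneDiff adds on top.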

OneDiff now introduces a new ComfyUI node named ModuleDeepCacheSpeedup, a compiled DeepCache module, which makes SDXL iteration 3.5x faster on RTX 3090 and 3x faster on A100.

Here is the example: https://github.com/Oneflow-Inc/onediff/pull/426

Run

ComfyUI node name: ModuleDeepCacheSpeedup
For instructions on using the node, see: https://github.com/Oneflow-Inc/onediff/tree/main/onediff_comfy_nodes#installation-guide

Example workflow

Dependencies

The latest main branch of OneDiff: https://github.com/Oneflow-Inc/onediff/tree/main
The latest OneFlow community edition:

CUDA 11.8:

python3 -m pip install --pre oneflow -f https://oneflow-pro.oss-cn-beijing.aliyuncs.com/branch/community/cu118

CUDA 12.1:

python3 -m pip install --pre oneflow -f https://oneflow-pro.oss-cn-beijing.aliyuncs.com/branch/community/cu121

CUDA 12.2:

python3 -m pip install --pre oneflow -f https://oneflow-pro.oss-cn-beijing.aliyuncs.com/branch/community/cu122
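If you script your environment setup, the three commands above can be wrapped in a small helper that picks the right package index for a given CUDA version. This is a minimal sketch: the helper name oneflow_install_command and the ONEFLOW_INDEX table are hypothetical, and the only data it uses are the three URLs listed in this post.

```python
# Package index URLs copied from the install commands in this post.
ONEFLOW_INDEX = {
    "11.8": "https://oneflow-pro.oss-cn-beijing.aliyuncs.com/branch/community/cu118",
    "12.1": "https://oneflow-pro.oss-cn-beijing.aliyuncs.com/branch/community/cu121",
    "12.2": "https://oneflow-pro.oss-cn-beijing.aliyuncs.com/branch/community/cu122",
}


def oneflow_install_command(cuda_version):
    """Return the pip command (as an argument list) for a CUDA version string.

    Hypothetical helper: it only covers the CUDA versions listed in this post.
    """
    if cuda_version not in ONEFLOW_INDEX:
        supported = ", ".join(sorted(ONEFLOW_INDEX))
        raise ValueError(
            f"No OneFlow index listed for CUDA {cuda_version}; known versions: {supported}"
        )
    return [
        "python3", "-m", "pip", "install", "--pre", "oneflow",
        "-f", ONEFLOW_INDEX[cuda_version],
    ]


# Print the command for CUDA 12.1 as a single shell-style line.
print(" ".join(oneflow_install_command("12.1")))
```

The returned list can be passed to subprocess.run(cmd, check=True) to perform the install. Returning a list rather than a single string avoids shell quoting issues.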

You are welcome to join the OneDiff Discord group to discuss related questions.


Written by OneFlow
