This is the official demo repository for the paper *Breaking the Factorization Barrier in Diffusion Language Models*.
**Motivation and intuition of CoDD.** *Left:* Illustration of the misspecification gap. The plot reports the perplexity of LLaDA on the MathInstruct validation set across varying mask ratios. Curve (a), sequential generation, represents the ideal baseline (i.e., the true joint distribution learned by the model). When restricted to (b), one-step generation, the independence assumption causes significant performance degradation. The shaded region highlights this loss in perplexity, defined as the misspecification gap.
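As a toy numeric illustration of why factorized one-step generation hurts (the numbers below are hypothetical, not from the paper): for a correlated pair of tokens, the product of marginals assigns lower likelihood to the data than the true joint, so its perplexity is strictly higher.

```python
import math

# Hypothetical joint distribution over two correlated binary tokens.
joint = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.05, (1, 1): 0.45}

# Marginals implied by the joint -- what a factorized one-step sampler uses.
p_x = {v: sum(p for (x, _), p in joint.items() if x == v) for v in (0, 1)}
p_y = {v: sum(p for (_, y), p in joint.items() if y == v) for v in (0, 1)}

def perplexity(model_prob):
    """Per-token perplexity of a model under data drawn from `joint`."""
    # Cross-entropy H(joint, model) in nats, averaged over the 2 tokens.
    ce = -sum(p * math.log(model_prob(xy)) for xy, p in joint.items()) / 2
    return math.exp(ce)

ppl_joint = perplexity(lambda xy: joint[xy])                # sequential (true joint)
ppl_indep = perplexity(lambda xy: p_x[xy[0]] * p_y[xy[1]])  # one-step (independent)

print(f"joint ppl = {ppl_joint:.3f}, factorized ppl = {ppl_indep:.3f}")
# joint ppl ≈ 1.664, factorized ppl = 2.000
```

The gap between the two perplexities is a two-token analogue of the shaded misspecification gap in the figure; CoDD's copula guidance aims to recover the correlation that factorized decoding discards.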
```shell
conda env create -f environment.yml
conda activate codd
```

Our evaluation uses a customized `lm-evaluation-harness`. Install it and add it to your Python path:

```shell
export PYTHONPATH="${PYTHONPATH}:$(pwd)/lm-evaluation-harness"
cd lm-evaluation-harness
pip install -e .
pip install math_verify
cd ..
```

To make the `PYTHONPATH` change permanent, consider adding the `export` line above to your `~/.bashrc` or `~/.zshrc`.
```shell
python example.py
```

This script compares two generation methods on a simple example:

- **Base LLaDA-8B-Instruct**: standard block diffusion generation
- **CoDD**: copula-guided block diffusion generation
We provide domain-specific Probabilistic Circuit (PC) guidance models for both the LLaDA and Dream architectures. Use these with the `--pc_ckpt` argument to enable copula-guided generation.
| Base Model | Domain / Task | Checkpoint ID |
|---|---|---|
| LLaDA-8B | Mathematical Reasoning | `il18/llada-math-pc` |
| LLaDA-8B | Grade School Math | `il18/llada-gsm-pc` |
| LLaDA-8B | Code Generation | `il18/llada-code-pc` |
| Dream-7B | Mathematical Reasoning | `il18/dream-math-pc` |
| Dream-7B | Grade School Math | `il18/dream-gsm-pc` |
| Dream-7B | Code Generation | `il18/dream-code-pc` |
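For instance, a Dream run with its math PC checkpoint might look like the following. Note that the Dream base-model ID shown here (`Dream-org/Dream-v0-Instruct-7B`) is an assumption for illustration; substitute your own Dream checkpoint path or HuggingFace repo.

```shell
cd eval
# Dream base weights ID below is illustrative -- replace with your checkpoint.
./eval.sh --gpus 0 \
    --run '--model_alias dream --dream_ckpt Dream-org/Dream-v0-Instruct-7B --task math500 --alg low_confidence --num_steps 256 --pc_ckpt il18/dream-math-pc'
```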
Use `./eval/eval.sh` to run evaluations on benchmarks (GSM8K, MATH500, MBPP, GPQA).
```shell
cd eval
./eval.sh --gpus 0 \
    --run '--model_alias llada --llada_ckpt GSAI-ML/LLaDA-8B-Instruct --task math500 --alg low_confidence --num_steps 256'
```

To enable copula-guided generation, add the PC arguments:

```shell
./eval.sh --gpus 0 \
    --run '--model_alias llada --llada_ckpt GSAI-ML/LLaDA-8B-Instruct --task math500 --alg low_confidence --num_steps 256 --pc_ckpt il18/llada-math-pc --pc_temperature 0.2 --pc_frac 0.5'
```

| Option | Description |
|---|---|
| `--gpus` | Comma-separated GPU IDs (e.g., `0,1,2`) |
| `--run` | Arguments for a single evaluation run |
| `--output_dir` | Directory for results (default: `results`) |
| `--tag` | Optional tag for log files |
| Argument | Description |
|---|---|
| `--model_alias` | Model type: `llada` or `dream` |
| `--llada_ckpt` | LLaDA checkpoint path or HuggingFace repo |
| `--dream_ckpt` | Dream checkpoint path or HuggingFace repo |
| `--task` | Benchmark: `gsm8k`, `math500`, `mbpp`, `gpqa` |
| `--alg` | Remasking algorithm: `low_confidence`, `random`, `entropy`, `margin`, `topprob` |
| `--num_steps` | Number of diffusion steps |
| `--pc_ckpt` | Path or HuggingFace repo for PC model |
| `--pc_temperature` | PC guidance temperature (default: `0.7`) |
| `--pc_frac` | Fraction of steps using PC guidance (default: `0.3`) |
| `--block_length` | Block length for semi-autoregressive generation (default: `32`) |
Run multiple evaluations in parallel across GPUs:
```shell
./eval.sh --gpus 0,1 \
    --run '--model_alias llada --llada_ckpt GSAI-ML/LLaDA-8B-Instruct --task gpqa --alg low_confidence --num_steps 256' \
    --run '--model_alias llada --llada_ckpt GSAI-ML/LLaDA-8B-Instruct --task math500 --alg low_confidence --num_steps 256'
```

Logs are saved to `eval/results/logs/`.
```bibtex
@misc{li2026breakingfactorizationbarrierdiffusion,
      title={Breaking the Factorization Barrier in Diffusion Language Models},
      author={Ian Li and Zilei Shao and Benjie Wang and Rose Yu and Guy Van den Broeck and Anji Liu},
      year={2026},
      eprint={2603.00045},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2603.00045},
}
```

The evaluation scripts in this repository are adapted from APD, building upon the EleutherAI `lm-evaluation-harness`.
