Skip to content

Releases: lean-dojo/TorchLean

v1.2: Cleaner API, Lean 4.31, and a Stronger Runtime Stack

18 Jun 02:03

Choose a tag to compare

This release is a big cleanup pass and a real step toward the TorchLean I want people to actually use: cleaner imports, clearer examples, better runtime behavior, and a much more honest story about what is proved in Lean versus what is delegated to native libraries.

The biggest change is the API. We spent a lot of time cutting away old internal-looking surfaces and making the public path feel simpler. The intended workflow is now much closer to:

import NN
open TorchLean

Define a model, load data, train it, evaluate it, and only drop into lower-level modules when you actually need them.

What changed

  • Upgraded the project to Lean 4.31.
  • Refactored the public API so examples are cleaner and less tied to old internal modules.
  • Reworked training examples across MLP, KAN, GPT-2, ViT, FNO, sequence models, vision models, and supervised models.
  • Added better model summaries instead of hardcoded architecture strings in examples.
  • Made CUDA fast kernels opt-in instead of silently turning them on.
  • Added an optional PyTorch ATen/libtorch bridge prototype, so TorchLean can start moving ordinary runtime kernels toward mature PyTorch-backed implementations.
  • Replaced the CROWN Lyapunov oracle axiom with an explicit witness object.
  • Cleaned up the trust-boundary documentation so it is clearer what Lean proves and what native code or external tools provide.
  • Expanded runtime and CUDA smoke testing around the refactored codebase.
  • Updated docs, examples, and site material to match the new API direction.

Why this matters

TorchLean is not trying to be “just another neural-network library in Lean.” The point is to connect practical ML code with formal specifications and proofs. But for that to work, the practical side has to feel usable too.

This release moves in that direction. The examples are less cluttered. The runtime flags are more explicit. The verification assumptions are easier to audit. And the project now has a clearer path for using PyTorch/ATen kernels where that makes more sense than maintaining our own CUDA code.

CUDA and PyTorch kernels

One important cleanup: custom CUDA kernels are no longer treated as the default answer for everything. They are useful for deterministic kernels, proof-carrying experiments, and special runtime paths, but ordinary tensor execution should increasingly go through mature external libraries.

That is why this release includes the first optional ATen/libtorch bridge. It is small for now, but it proves the path: Lean can call into PyTorch’s C++ runtime, and future work can route more standard ops through that backend.

Verification cleanup

We also tightened the proof story. The CROWN Lyapunov oracle is no longer a global axiom; it is now an explicit witness that carries the assumptions used by the theorem. That makes the trust boundary much easier to see.

The remaining native/runtime assumptions are documented directly in TRUST_BOUNDARIES.md, including CUDA buffers, native kernels, PyTorch/ATen calls, and external certificate producers.

Looking ahead

The next step is to keep pushing the runtime toward mature backend libraries, keep simplifying the public API, and make the verification examples stronger without making the user experience worse.

This release was a lot of cleanup work, but it makes the project feel much more solid. TorchLean is getting closer to the shape it should have as like initially but all the same strong proof features of Lean: )!

TorchLean v1.1: Lean 4.30 and more stable CUDA training

28 May 04:40

Choose a tag to compare

TorchLean v1.1 is a systems cleanup release.

The main change is that TorchLean now builds on Lean 4.30, and the eager CUDA training path is much better behaved on longer runs. We found that some CUDA-side intermediate buffers could stay alive longer than needed during training. That showed up as allocator pressure on longer GPT/MLP runs, even when the model itself was not large enough to justify the memory growth.

This release makes CUDA ownership more explicit: tape values, gradient buffers, and operation workspace buffers are released when the training step no longer needs them. We also added runtime-side initialization paths for larger modules, so CPU/CUDA model construction does less unnecessary host-side tensor work:)!

What changed:

  • Updated TorchLean to leanprover/lean4:v4.30.0.
  • Improved eager CUDA/autograd buffer ownership for long training runs.
  • Added CUDA allocator diagnostics through --cuda-mem-watch.
  • Added runtime initialization support for larger CPU/CUDA modules.
  • Cleaned up and standardized public model training commands.
  • Updated the README, website updates page, examples page, and blueprint guide.
  • Regenerated the dependency graph and website metadata.

TorchLean v1

11 May 04:34

Choose a tag to compare

Excited to release TorchLean!

This release is pinned to Lean v4.29.0.

Highlights

  • Typed tensor and model APIs through the public NN.API surface.
  • Runnable model examples through lake exe torchlean, including MLP, CNN, sequence/text models, generative examples, FNO, and RL workflows.
  • Verification CLI workflows through lake exe verify, including TorchLean-to-IR and IBP-style examples.
  • Shared graph infrastructure through NN.IR and NN.GraphSpec.
  • Runtime/autograd infrastructure, optimizers, training loops, logging, and optional CUDA-backed execution.
  • Finite-precision and IEEE-style executable semantics under NN/Floats.
  • Proof layers for tensor algebra, autograd correctness, runtime approximation, learning theory, CROWN/LiRPA-style robustness, and related ML theory.
  • BugZoo examples turning common ML failure modes into small checked contracts.
  • Project website, guide, and API docs sources included in the repository.