feat: save documentation info to SQLite database#347
hargoniX merged 112 commits into leanprover:main
Conversation
This PR adds a SQLite database that contains all of the documentation info.
!bench

Benchmark results for 7f88ca7 against 837f89a are in! @david-christiansen No significant changes detected.
Significance detection isn't running yet because there isn't enough data. The result for this version isn't great, so more work is needed:
Experiment to see if the slowdown is due to database file contention.
!bench

Benchmark results for 9272ece against 837f89a are in! @david-christiansen No significant changes detected.

!bench

Benchmark results for 418fa51 against 837f89a are in! @david-christiansen No significant changes detected.

!bench

Benchmark results for 76667ea against 837f89a are in! @david-christiansen No significant changes detected.

!bench

Benchmark results for e429f8e against 837f89a are in! @david-christiansen No significant changes detected.

!bench

Benchmark results for e1d2a8b against 837f89a are in! @david-christiansen No significant changes detected.
As of e429f8e, it's:
It seems that the Mathlib cache was only getting partial values.

With the original PR code, it's comparable:
Extensions are not presently handled, but the fallback data are saved.
This is the first step towards rendering HTML from the DB instead of directly. The serializable version of CodeWithInfos used here can be saved in the DB. The generated HTML is the same, modulo commit hashes and external URLs.
!bench

Benchmark results for ec4cf3e against 837f89a are in! @david-christiansen No significant changes detected.
This is preliminary to generating HTML from the database. The output is still unchanged, modulo commit hashes and source URLs.
!bench

Benchmark results for 9489cfd against 837f89a are in! @david-christiansen No significant changes detected.
The scripts indicate that the output is the same, modulo minor differences in automatic linking.
!bench

Benchmark results for c08bf78 against 13ecfbb are in! @david-christiansen No significant changes detected.
It's only used from a single thread today, but I'd hate to find out the hard way that it isn't.
!bench

Benchmark results for beb37c1 against 13ecfbb are in! @david-christiansen No significant changes detected.

!bench

Benchmark results for 06dfc9d against 13ecfbb are in! @david-christiansen No significant changes detected.
```lean
def escape (s : String) : String := Id.run do
  let mut out := ""
  let mut i := s.startPos
  let mut j := s.startPos
  while h : j ≠ s.endPos do
    let c := j.get h
    if let some esc := subst c then
      out := out ++ s.extract i j ++ esc
      j := j.next h
      i := j
    else
      j := j.next h
  if i = s.startPos then s -- no escaping needed, return original
  else out ++ s.extract i j
where
  subst : Char → Option String
  | '&' => some "&amp;"
  | '<' => some "&lt;"
  | '>' => some "&gt;"
  | '"' => some "&quot;"
  | _ => none
```
The old implementation actually showed up in profiles; this isn't just an exercise in code-fancying.
```lean
/-- Direct and transitive dependencies.

Loosely inspired by bazel's [depset](https://bazel.build/rules/lib/builtins/depset). -/
abbrev DepSet (α) [Hashable α] [BEq α] := Array α × OrdHashSet α
```
Did you check that #286 still works after deleting this?
Assuming that the build.yml added in that PR tests it effectively, then it still works. Is there another way to double-check? The output when running that command locally is very similar to what was on the PR, but there's not quite enough context there to know what it did before the PR or how to check it more thoroughly.
The docs facet is still returning the list of generated files (without duplicates), so I believe the relevant behavior is preserved.
```diff
@@ -23,8 +23,11 @@ require «UnicodeBasic» from git
 require Cli from git
```
@tydeu I would like to get your opinion on the lakefile. I don't trust myself enough with lake trickery to say it's correct for sure.
```shell
TAR_ARGS=(doc)
if [ -f .lake/build/api-docs.db ]; then
  # Compact the DB into a single portable file (removes WAL/journal dependency)
  sqlite3 .lake/build/api-docs.db "VACUUM"
```
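For reference, the same compaction can be driven from Python's stdlib `sqlite3` module; the table here is invented, and only the VACUUM step mirrors the script above:

```python
import os
import sqlite3
import tempfile

# Create a throwaway database in WAL mode, as a build might leave it.
path = os.path.join(tempfile.mkdtemp(), "api-docs.db")
conn = sqlite3.connect(path)
conn.execute("PRAGMA journal_mode=WAL")
conn.execute("CREATE TABLE docs (name TEXT PRIMARY KEY, html TEXT)")
conn.execute("INSERT INTO docs VALUES ('Nat.add', '<p>addition</p>')")
conn.commit()

# VACUUM rewrites the database into a single compact file and
# checkpoints the WAL, so only api-docs.db needs to be shipped.
conn.execute("VACUUM")
conn.close()

print(os.path.exists(path))          # True
print(os.path.exists(path + "-wal")) # False: side file gone after close
```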
How big is the database for Mathlib? Is this a metric we should track, if only to avoid blowing up the amount of cache people have to use if they decide to cache the DB?
It's 717MB. With zstd compression, it's 96MB.
I'll rig up the benchmark to track both, just in case.
OK, the latest benchmark run includes the sizes of the DB for both doc-gen itself and Mathlib. It's not posted here by the bot, but clicking through shows it.
```lean
private def isAutoGeneratedSuffix (s : String) : Bool :=
  s == "rec" || s == "recOn" || s == "casesOn" || s == "noConfusion" ||
  s == "noConfusionType" || s == "below" || s == "brecOn"
```
Would be great if we could consolidate this at some point :/
I agree.
!bench

Benchmark results for 91b4138 against 13ecfbb are in! @david-christiansen No significant changes detected.
Co-authored-by: Mac Malone <tydeu@hatpress.net>
!bench

Benchmark results for bb0b496 against 13ecfbb are in! @david-christiansen No significant changes detected.
Fixes a bug in leanprover#347 that caused multi-library builds to skip generating HTML for some libraries. The issue was that the Lake setup used declaration-data.bmp as its build target. Multiple-library builds would populate the database, but then only the one that won the race would generate its declaration-data.bmp. The others would incorrectly see that it existed and generate no HTML. Now, the same "marker file" approach is used as the database content to indicate that HTML for a given module, library, or package is up to date. This gives them the right traces, and allows declaration-data.bmp to be updated as needed. A regression test is also included.
* fix: don't skip docs in multi-library situations
* fix: DB contention without timeout
* chore: bump leansqlite and update calls. Rather than sending pragmas in strings, it's nicer to use a higher-level API.
* Revert "chore: bump leansqlite and update calls". This reverts commit b95beb5. It will be sent in a separate PR.
This PR adds a SQLite database between doc-gen4's per-module analysis and HTML generation. Lake runs one `single` command per module (and `genCore` for the core Lean modules `Init`, `Std`, `Lake`, and `Lean`), each of which writes to a shared SQLite database. If the database already exists, `single` incrementally updates it by deleting the module's old rows and reinserting. Then the `fromDb` command reads everything back and generates HTML in parallel. The old pipeline, which generated HTML directly during module analysis, is removed.
The database makes documentation data available to other tools. Verso can query it for docstrings
instead of consulting the Lean environment, and future tools can use it in ways we haven't
anticipated yet. It also means that HTML generation has access to the full set of declarations
across all modules, which improves the heuristic insertion of links in code.
The Database
Within a module, each item (declaration, module doc, constructor, structure field) is assigned a sequential position starting from 0. The composite key `(module_name, position)` is the primary key for most tables. Constructors and structure fields are interleaved between their parent declarations, so positions are not contiguous across top-level members. HTML generation reconstructs module members by querying `name_info` and `module_docs_Markdown`, ordered by position.

The database schema is versioned by two hashes: a DDL hash that detects changes to table definitions, and a type hash that detects changes to Lean types serialized as blobs (like `RenderedCode` and `RenderedCode.Tag`). If either hash doesn't match, the database is rejected with an error message asking the user to rebuild. The type hash is computed at compile time from a string representation of the relevant inductive types, so adding a constructor to `RenderedCode.Tag` will invalidate old databases automatically.
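To make the layout concrete, here is a small Python sketch using the stdlib `sqlite3` module. The `name_info` table name and the `(module_name, position)` key come from this description; the columns, the `meta` table, and the hashing scheme are invented for illustration:

```python
import hashlib
import sqlite3

DDL = """
CREATE TABLE name_info (
  module_name TEXT NOT NULL,
  position    INTEGER NOT NULL,
  name        TEXT NOT NULL,
  PRIMARY KEY (module_name, position)
);
CREATE TABLE meta (key TEXT PRIMARY KEY, value TEXT);
"""

# A DDL hash in the spirit described above: any change to the table
# definitions produces a different hash, invalidating old databases.
ddl_hash = hashlib.sha256(DDL.encode()).hexdigest()

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)
conn.execute("INSERT INTO meta VALUES ('ddl_hash', ?)", (ddl_hash,))

# Positions are sequential per module; constructors are interleaved
# between their parent declarations.
rows = [("Init.Data.Nat", 0, "Nat"),
        ("Init.Data.Nat", 1, "Nat.zero"),   # constructor, interleaved
        ("Init.Data.Nat", 2, "Nat.succ"),
        ("Init.Data.Nat", 3, "Nat.add")]
conn.executemany("INSERT INTO name_info VALUES (?, ?, ?)", rows)

# Reject a database whose stored hash doesn't match this build's DDL.
stored, = conn.execute(
    "SELECT value FROM meta WHERE key = 'ddl_hash'").fetchone()
if stored != ddl_hash:
    raise RuntimeError("database schema mismatch; please rebuild")

# HTML generation reads a module's members back ordered by position.
members = [n for (n,) in conn.execute(
    "SELECT name FROM name_info WHERE module_name = ? ORDER BY position",
    ("Init.Data.Nat",))]
print(members)  # ['Nat', 'Nat.zero', 'Nat.succ', 'Nat.add']
```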
RenderedCode
Lean's pretty printer produces `CodeWithInfos` (a `TaggedText SubexprInfo`), which carries expression types, universe levels, elaboration state, and other metadata that is too large to serialize. `RenderedCode` is a `TaggedText RenderedCode.Tag` that keeps only what is needed for HTML rendering: which tokens are declaration references (for linking), which are sorts (for linking to the foundational types page), and which are keywords or strings (for syntax highlighting). The conversion from `CodeWithInfos` to `RenderedCode` is lossy and not reversible. `RenderedCode` is serialized to the database as a binary blob.
Verso Docstrings
Verso docstrings contain a tree of `Doc.Block` / `Doc.Inline` nodes with extension points (`ElabInline` / `ElabBlock`) that hold opaque `Dynamic` values identified by `Name`. Different Lean packages can register their own extension types, so there is no way to know all possible types at compile time. The serialization uses a registry of handlers (`DocstringValues`) keyed by name. If a handler exists for a given extension type, the payload is serialized with it. If not, only the name is stored, and on deserialization the unknown extension is replaced with a sentinel value. In Verso docstrings, the content underneath one of these extension types represents an alternative, simpler rendering (e.g. plain text instead of highlighted code). This means the database remains readable even if extension types are added or removed between versions.

`builtinDocstringValues` includes handlers for the extension types that ship with Lean. A future PR will add a plugin system for registering additional handlers so authors of docstring extensions can control their serialization and their rendering to HTML.
HTML Generation and Link Resolution
HTML is generated in parallel, with around 20 tasks that each have a database connection. This
number was chosen through experimentation.
When converting `RenderedCode` to HTML, the code needs to turn declaration names into links. For names that appear directly in the global name index, this is straightforward. For names that don't appear directly (private names, auto-generated auxiliary names like match discriminants and proof terms), the code tries several heuristics: resolving private names to user-facing names, stripping trailing auxiliary components to find a linkable parent, and falling back to a link to the module page. The details and examples are documented in `renderedCodeToHtmlAux` in `DocGen4.Output.Base`.
Future Possibilities
Here are some useful directions that are made possible by this PR, but not implemented in it.
HTML Simplification
Today, things like the instances list are generated at run time via JavaScript because HTML files don't have a global view of the project. With this new model, the HTML can be straightforwardly generated. To ease comparison with the existing code, that feature is not implemented in this PR.
Plugin API
Having documentation in a database solves an immediate need in Verso brought about by the module system: it's no longer possible to extract docstrings from environments now that they're in the server olean, but a SQLite file provides easy retrieval of them.

Additionally, there are more and more use cases for extending doc-gen4:

These plugins need to be used at multiple points in doc-gen4:

A plugin is a structure that has a field for each of these interpolation points. Instead of having a fixed `Main.lean`, a custom Lake target discovers the registered plugins in all packages in the workspace and generates a `Main` that includes calls to them.

The database model makes this much easier to implement. Plugins can add their own tables to the DB and write to them in the analysis phase, and they can be invoked with a DB handle at various points during HTML generation.
Incrementality and Caching
It would be possible to build each package's documentation DB separately. For instance, the core docs DB and Mathlib DBs could be part of the Mathlib cache. Then, at HTML generation time, the HTML generation process could `ATTACH` each package's documentation database and generate HTML for all of them at once, rather than re-analyzing all of Mathlib.
Validation
To check that the database-generated HTML matches the old pipeline's output, there is a Python comparison script (`scripts/check_diff_soup.py`) that was used extensively during development and will not be present in the squashed commit history, so it is described here in some detail.

A simple bit-for-bit comparison (even after normalizing with `tidy`) is not sufficient because the database-generated HTML is intentionally different in some ways. It has more links than the old output because it can resolve names across all modules rather than just within the transitive imports of the module being processed. It also adds `id` attributes to inherited structure fields, wraps extends-clause parents in `<span>` elements for linking, deduplicates import lists, and drops empty equations sections. These are all improvements, but they mean that a naïve diff would report thousands of false positives.
The script parses both HTML trees and walks them in parallel, matching elements by position within
their parent. It can distinguish between a changed attribute and a removed element, which makes it
precise enough to enforce specific rules about which differences are acceptable.
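As a sketch of position-wise matching (Python stdlib; far simpler than the real script), two trees are walked in parallel and children are paired by index within their parent, which lets the diff name what changed rather than just that something did:

```python
from html.parser import HTMLParser

class TreeBuilder(HTMLParser):
    """Build a simple nested-dict tree from HTML."""
    def __init__(self):
        super().__init__()
        self.root = {"tag": None, "attrs": {}, "children": []}
        self.stack = [self.root]
    def handle_starttag(self, tag, attrs):
        node = {"tag": tag, "attrs": dict(attrs), "children": []}
        self.stack[-1]["children"].append(node)
        self.stack.append(node)
    def handle_endtag(self, tag):
        self.stack.pop()

def parse(html):
    b = TreeBuilder()
    b.feed(html)
    return b.root

def diff(old, new, path="", out=None):
    out = [] if out is None else out
    if old["tag"] != new["tag"]:
        out.append((path, "element changed"))
    elif old["attrs"] != new["attrs"]:
        out.append((path, "attributes changed"))  # distinguishable case
    # Match children by position within their parent.
    for i, (o, n) in enumerate(zip(old["children"], new["children"])):
        diff(o, n, "%s/%s[%d]" % (path, n["tag"], i), out)
    return out

old = parse('<div><a href="A.html">x</a></div>')
new = parse('<div><a href="B.html">x</a></div>')
print(diff(old, new))  # [('/div[0]/a[0]', 'attributes changed')]
```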
The rules are declarative. Each rule is a function that receives a `DiffContext` (containing the old and new elements, their ancestor chains, and pre-collected sets of valid link targets from both directories) and returns a reason string if the difference is acceptable, or `None` if it is not.

The rules cover the following cases:

* Links may be wrapped in `<span class="fn">`, or have their `href` changed.
* A `<span class="fn">` may be replaced by `<a>` if the new link target is valid, since the database version can resolve names that the old pipeline could not.
* `<a>` elements may be added if their targets are valid.
* `href` attributes pointing to private names (`_private.` in the URL) may change if the new target is valid, since the new version resolves to the public declaration.
* `href` attributes may change as long as the anchor fragment is the same and the new target is valid, covering cases where a name resolves to a different module.
* `file:///` hrefs to `.lean` source files may differ in their temp directory prefix.
* `<li>` elements in import lists may be deduplicated.
* Declarations may appear in a different order, because the old pipeline's ordering was nondeterministic for declarations that share a position. This rule uses the SQLite database itself (via the `--db` flag) to look up source positions.
* Elements may gain `id` attributes, since the new version adds them as link targets.
* The new version may add `<span id="...">` elements to create link targets for parent projection names.
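A minimal Python sketch of this rule shape. The name `DiffContext` comes from the script; the fields, rule names, and URL conventions here are simplified and partly invented:

```python
from dataclasses import dataclass, field

@dataclass
class DiffContext:
    old_attrs: dict
    new_attrs: dict
    valid_targets: set = field(default_factory=set)

def allow_private_href(ctx):
    """Private-name hrefs may change if the new target is valid."""
    old = ctx.old_attrs.get("href", "")
    new = ctx.new_attrs.get("href", "")
    if "_private." in old and new in ctx.valid_targets:
        return "private name resolved to public declaration"
    return None

def allow_same_fragment(ctx):
    """Hrefs may change if the anchor fragment is unchanged and valid."""
    old = ctx.old_attrs.get("href", "")
    new = ctx.new_attrs.get("href", "")
    if ("#" in old and "#" in new
            and old.split("#")[1] == new.split("#")[1]
            and new in ctx.valid_targets):
        return "same anchor, different module"
    return None

RULES = [allow_private_href, allow_same_fragment]

def accept(ctx):
    """First rule to return a reason accepts the difference."""
    for rule in RULES:
        reason = rule(ctx)
        if reason is not None:
            return reason
    return None  # rejected: reported as a real difference

ctx = DiffContext({"href": "M1.html#Nat.add"},
                  {"href": "M2.html#Nat.add"},
                  valid_targets={"M2.html#Nat.add"})
print(accept(ctx))  # same anchor, different module
```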
For `<code>` elements (declaration signatures and types), the tool uses a separate comparison algorithm that walks through children of both elements simultaneously while matching text content, tracking wrapper elements (`<a>` and `<span>`) as it goes. This is more precise than the general tree-walking comparison because it can verify that the underlying text content is identical even when the wrapping structure differs.
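The core idea can be sketched like this in Python (stdlib only; the real algorithm also matches positions token by token rather than comparing whole strings):

```python
from html.parser import HTMLParser

class TextAndWrappers(HTMLParser):
    """Collect text content and the wrapper elements seen along the way."""
    def __init__(self):
        super().__init__()
        self.text = []
        self.wrappers = []
    def handle_starttag(self, tag, attrs):
        if tag in ("a", "span"):
            self.wrappers.append(tag)  # track wrappers as we go
    def handle_data(self, data):
        self.text.append(data)

def flatten(html):
    p = TextAndWrappers()
    p.feed(html)
    return "".join(p.text), p.wrappers

old_text, old_wrap = flatten(
    '<code>Nat.add : <span class="fn">Nat</span></code>')
new_text, new_wrap = flatten(
    '<code>Nat.add : <a href="Nat.html">Nat</a></code>')

print(old_text == new_text)  # True: same text, different wrapping
print(old_wrap, new_wrap)    # ['span'] ['a']
```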
In addition to the per-file HTML comparison, the script also checks several other things. It compares static assets (CSS, JavaScript, fonts, images) byte for byte. It compares the search index (`declaration-data.bmp`, which is JSON) with domain-specific rules: module entries are compared by URL and import set, instance lists are compared as sets, and declaration entries are compared field by field, with `docLink` differences accepted when both the old and new targets are valid anchors. Other JSON files are compared by structural equality.
The script also runs a declaration census: it queries the database for every declaration marked as rendered (`render = 1` in `name_info`) and checks that a corresponding anchor exists in the generated HTML. This catches cases where a declaration is in the database but missing from the output, which would not be detected by the per-file comparison since there is no old file to compare against. Finally, it does a bidirectional target coverage check, comparing the set of anchored link targets between the old and new HTML to flag any targets that were dropped.
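A sketch of the census check. The `name_info` table and `render` column come from the description above; the other columns and the anchor extraction are invented:

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE name_info (
    module_name TEXT, position INTEGER, name TEXT, render INTEGER,
    PRIMARY KEY (module_name, position))""")
conn.executemany("INSERT INTO name_info VALUES (?, ?, ?, ?)", [
    ("M", 0, "Nat.add", 1),
    ("M", 1, "Nat.add.match_1", 0),  # not rendered, exempt from census
    ("M", 2, "Nat.mul", 1),
])

# Stand-in for the generated HTML; Nat.mul's anchor is missing.
html = '<div id="Nat.add">...</div>'

anchors = set(re.findall(r'id="([^"]+)"', html))
missing = [name for (name,) in conn.execute(
    "SELECT name FROM name_info WHERE render = 1")
    if name not in anchors]
print(missing)  # ['Nat.mul']
```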
The output is organized per file: for each file with differences, the script prints the rejected
differences (with unified diffs of the parent element) and, in verbose mode, the accepted ones
with their rule names. At the end it prints a summary with counts for HTML files, data files,
static assets, the declaration census, and target coverage, followed by a breakdown of accepted
differences by rule with a sample for each to facilitate inspection.
Results
The results of the script show one accepted difference in data files. `declaration-data.bmp` swaps the link destinations for two instances: `CategoryTheory.Abelian.instIsStableUnderBaseChangeEpimorphisms` and `CategoryTheory.Abelian.instIsStableUnderCobaseChangeMonomorphisms`. This is because the relevant instances exist in both modules in Mathlib (link 1, link 2). The generated destination in the HTML depends just on the order that doc-gen4 happens to visit the modules in. I'm not sure why there isn't some kind of conflict when importing them, but I don't think this difference is a bug.

It also shows many differences in the files. Most of them are source link targets that just point at different temp files. There are examples of each category of allowed difference to make it easier to understand.
Review Process
Once this PR is acceptable, we should archive the comparison script somewhere and delete it prior to the squash merge. It won't be useful for comparing two docsets produced by the database version as-is, but it could be adapted for that purpose at some point.