Make training checkpoints loadable under torch.load(weights_only=True) by YiqingChen524 · Pull Request #802 · materialyzeai/matgl

YiqingChen524 · 2026-06-08T07:48:50Z

ModelLightningModule / PotentialLightningModule called save_hyperparameters(ignore=["model"]), so the optimizer and scheduler objects (and, for the Potential module, the numpy element_refs array) were pickled into the checkpoint's hyper_parameters. Since torch 2.6 flipped torch.load's default to weights_only=True, resuming training via Trainer.fit(ckpt_path=...) then fails with an UnpicklingError on those globals.
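To illustrate the failure mode, here is a minimal stdlib-only sketch of an allowlist-based unpickler, following the "restricting globals" pattern from the Python `pickle` documentation. It is not torch's implementation: the allowlist, `AllowlistUnpickler`, and `can_load` are hypothetical, `types.SimpleNamespace` stands in for an optimizer object, and `array.array` stands in for a numpy array. The point is only that plain containers need no importable names to unpickle, while arbitrary objects do.

```python
import builtins
import io
import pickle
from array import array
from types import SimpleNamespace

# Hypothetical allowlist for illustration; torch's weights_only loader
# permits a different set of types (tensors, storages, and so on).
SAFE_BUILTINS = {"dict", "list", "tuple", "set", "frozenset"}


class AllowlistUnpickler(pickle.Unpickler):
    """Refuses to resolve any global that is not explicitly allowlisted."""

    def find_class(self, module, name):
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def can_load(obj):
    """Pickle obj normally, then report whether the restricted loader accepts it."""
    data = pickle.dumps(obj)
    try:
        AllowlistUnpickler(io.BytesIO(data)).load()
    except pickle.UnpicklingError:
        return False
    return True


# Numbers, strings, lists, and dicts are encoded with dedicated opcodes,
# so no global lookup happens and the load succeeds.
plain_ok = can_load({"lr": 1e-3, "element_refs": [0.0, -1.5]})

# An object is pickled as a reference to its class (a global), which is refused.
object_ok = can_load({"optimizer": SimpleNamespace(lr=1e-3)})

# An array type is rebuilt through a module-level helper, also a global.
array_ok = can_load({"element_refs": array("d", [0.0, -1.5])})
```

Under these assumptions `plain_ok` is true while `object_ok` and `array_ok` are false, which mirrors why optimizer/scheduler objects and a numpy array in `hyper_parameters` trip a weights-only load.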

Exclude optimizer/scheduler from save_hyperparameters (their state is already persisted in the checkpoint's optimizer_states / lr_schedulers, so this loses nothing) and store element_refs as a plain list instead of a numpy array (AtomRef already accepts a list). configure_optimizers is unaffected because it reads the self.optimizer / self.scheduler instance attributes, not hparams.
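The shape of the fix can be sketched without Lightning. The helper below is hypothetical and is not the code in this PR (which changes the `save_hyperparameters` call on the modules); the exact ignore list is an assumption based on the description above. It drops the object-valued arguments, converts array-likes to plain lists, and uses `pickletools` to check that the resulting pickle stream references no importable names.

```python
import pickle
import pickletools
from array import array
from types import SimpleNamespace

# Assumed ignore list: the original ["model"] plus the two excluded objects.
IGNORED = ("model", "optimizer", "scheduler")


def checkpoint_safe_hparams(init_args, ignore=IGNORED):
    """Hypothetical helper: keep only plain-data hyperparameters."""
    safe = {}
    for key, value in init_args.items():
        if key in ignore:
            continue
        # Array-likes (numpy.ndarray, array.array) expose tolist();
        # store the values as a plain list instead.
        if hasattr(value, "tolist"):
            value = value.tolist()
        safe[key] = value
    return safe


def references_globals(obj):
    """True if pickling obj emits an opcode that resolves an importable name."""
    ops = {op.name for op, _arg, _pos in pickletools.genops(pickle.dumps(obj))}
    return bool(ops & {"GLOBAL", "STACK_GLOBAL", "INST"})


init_args = {
    "model": SimpleNamespace(),               # stand-in for the network
    "optimizer": SimpleNamespace(lr=1e-3),    # stand-in for an optimizer object
    "scheduler": None,
    "element_refs": array("d", [0.0, -1.5]),  # stand-in for a numpy array
    "lr": 1e-3,
}
hparams = checkpoint_safe_hparams(init_args)
```

Here `references_globals(init_args)` is true and `references_globals(hparams)` is false. Nothing is lost by dropping the optimizer and scheduler, since, as noted above, their state lives in the checkpoint's optimizer_states / lr_schedulers entries.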

This PR fixes the issue reported by Chao Yang @Y-Chao.

Adds a regression test asserting that the optimizer and scheduler objects are absent from hparams, that element_refs is a list, and that a checkpoint built from these hparams loads under weights_only=True.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

YiqingChen524 requested review from kenko911 and shyuep as code owners June 8, 2026 07:48

kenko911 added 3 commits June 8, 2026 10:10

Merge branch 'main' into fix/weights-only-safe-checkpoint-hparams

118492e

Merge branch 'main' into fix/weights-only-safe-checkpoint-hparams

9c65f1e

Merge branch 'main' into fix/weights-only-safe-checkpoint-hparams

3353846

kenko911 merged commit 9e4573e into materialyzeai:main Jun 10, 2026
9 of 10 checks passed

kenko911 merged 4 commits into materialyzeai:main from YiqingChen524:fix/weights-only-safe-checkpoint-hparams
