Commit bcd7559

Clarify checkpoint semantics and make evolution more deterministic.
1 parent 7952026 commit bcd7559

8 files changed

+222
-36
lines changed

CHANGELOG.md

Lines changed: 6 additions & 0 deletions
@@ -48,6 +48,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
4848
without changing the public API.
4949

5050
### Fixed
51+
- **Add-node mutation bias behavior**: Newly inserted nodes created by `mutate_add_node` now start with zero bias so that splitting a connection is as neutral as possible with respect to the original signal flow. This makes the structural mutation less disruptive while preserving the existing weight-preserving semantics (incoming weight 1.0, outgoing weight equal to the original connection).
52+
- **Checkpoint Generation Semantics**: Clarified and corrected how checkpoint generation numbers are labeled and interpreted.
53+
- A checkpoint file named `neat-checkpoint-N` now always contains the population, species state, and RNG state needed to begin evaluating **generation `N`**.
54+
- Previously, checkpoints were labeled with the index of the generation that had just been evaluated, while storing the *next* generation's population; this could make restored runs appear to "repeat" the previous generation.
55+
- The NEAT evolution loop and genetic algorithm behavior are unchanged; this is a bookkeeping fix that aligns checkpoint behavior with user expectations and the original NEAT paper's generational model.
56+
- New and updated tests in `tests/test_checkpoint.py` and `tests/test_population.py` enforce the invariant that checkpoint `N` resumes at the start of generation `N`.
5157
- **Population Size Drift**: Fixed small mismatches between actual population size and configured `pop_size`
5258
- `DefaultReproduction.reproduce()` now strictly enforces `len(population) == config.pop_size` for every non-extinction generation
5359
- New `_adjust_spawn_exact` helper adjusts per-species spawn counts after `compute_spawn()` to correct rounding/clamping drift
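
The add-node entry above can be illustrated with a small numeric sketch. It assumes a single connection feeding one new hidden node; `node_output` and `identity` are hypothetical helpers for illustration, not part of the library. With an identity activation and zero bias the split path reproduces the original signal exactly; a nonzero bias shifts it, and a nonlinear activation such as `tanh` makes the split only approximately neutral.

```python
import math


def node_output(x, in_weight, bias, activation):
    """Output of one hidden node fed by a single connection (illustrative helper)."""
    return activation(in_weight * x + bias)


def identity(z):
    return z


x, w = 0.8, -1.7   # input value and weight of the original connection
direct = w * x     # signal delivered by the original, unsplit connection

# After the split: input -> new node (weight 1.0), new node -> output (weight w).
split_zero_bias = w * node_output(x, 1.0, 0.0, identity)
split_with_bias = w * node_output(x, 1.0, 0.5, identity)
split_tanh = w * node_output(x, 1.0, 0.0, math.tanh)

print(direct, split_zero_bias, split_with_bias, split_tanh)
```

Only the zero-bias, identity-activation case matches `direct` exactly, which is why the changelog describes the mutation as "as neutral as possible" rather than strictly neutral.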

docs/reproducibility.rst

Lines changed: 7 additions & 0 deletions
@@ -151,6 +151,13 @@ Saving Checkpoints
151151
# Run evolution
152152
pop.run(eval_genomes, 50)
153153
154+
In this example, checkpoints named ``neat-checkpoint-5``, ``neat-checkpoint-10``,
155+
``neat-checkpoint-15``, ... will be created. The numeric suffix ``N`` always
156+
refers to the **next generation to be evaluated**. In other words, a
157+
checkpoint labeled ``neat-checkpoint-25`` contains the population and species
158+
state for generation 25 at the point just before its fitness evaluation
159+
begins.
160+
154161
Restoring Checkpoints
155162
^^^^^^^^^^^^^^^^^^^^^^
156163

neat/checkpoint.py

Lines changed: 35 additions & 6 deletions
@@ -21,7 +21,13 @@ def __init__(self, generation_interval, time_interval_seconds=None,
2121
Saves the current state (at the end of a generation) every ``generation_interval`` generations or
2222
``time_interval_seconds``, whichever happens first.
2323
24-
:param generation_interval: If not None, maximum number of generations between save intervals
24+
The checkpoint filename suffix (for example, ``neat-checkpoint-10``) always refers to the
25+
**next generation to be evaluated**. In other words, a checkpoint created with suffix ``N``
26+
contains the population and species state for generation ``N`` at the point just before
27+
its fitness evaluation begins.
28+
29+
:param generation_interval: If not None, maximum number of generations between save intervals,
30+
measured in generations-to-be-evaluated
2531
:type generation_interval: int or None
2632
:param time_interval_seconds: If not None, maximum number of seconds between checkpoint attempts
2733
:type time_interval_seconds: float or None
@@ -32,28 +38,51 @@ def __init__(self, generation_interval, time_interval_seconds=None,
3238
self.filename_prefix = filename_prefix
3339

3440
self.current_generation = None
35-
self.last_generation_checkpoint = -1
41+
# Tracks the most recent generation index for which a checkpoint was created.
42+
# This value is interpreted as the next generation to be evaluated when the
43+
# checkpoint is restored (see above).
44+
self.last_generation_checkpoint = 0
3645
self.last_time_checkpoint = time.time()
3746

3847
def start_generation(self, generation):
48+
"""Record the index of the generation that is about to be evaluated.
49+
50+
Note that at the time :meth:`end_generation` is called for generation ``g``,
51+
the population and species that are passed in already correspond to the
52+
*next* generation (``g + 1``). This reporter therefore uses ``g + 1`` as
53+
the generation index stored in checkpoints, so that restoring a
54+
checkpoint labeled ``N`` always resumes at the beginning of generation
55+
``N``.
56+
"""
3957
self.current_generation = generation
4058

4159
def end_generation(self, config, population, species_set):
60+
"""Potentially save a checkpoint at the end of a generation.
61+
62+
The ``population`` and ``species_set`` arguments contain the state for
63+
the next generation to be evaluated, whose index is
64+
``self.current_generation + 1``.
65+
"""
4266
checkpoint_due = False
4367

4468
if self.time_interval_seconds is not None:
4569
dt = time.time() - self.last_time_checkpoint
4670
if dt >= self.time_interval_seconds:
4771
checkpoint_due = True
4872

49-
if (checkpoint_due is False) and (self.generation_interval is not None):
50-
dg = self.current_generation - self.last_generation_checkpoint
73+
# The generation whose population is being saved.
74+
next_generation = self.current_generation + 1
75+
76+
if (not checkpoint_due) and (self.generation_interval is not None):
77+
# Compare the upcoming generation index against the last checkpointed
78+
# generation index to decide whether a new checkpoint is due.
79+
dg = next_generation - self.last_generation_checkpoint
5180
if dg >= self.generation_interval:
5281
checkpoint_due = True
5382

5483
if checkpoint_due:
55-
self.save_checkpoint(config, population, species_set, self.current_generation)
56-
self.last_generation_checkpoint = self.current_generation
84+
self.save_checkpoint(config, population, species_set, next_generation)
85+
self.last_generation_checkpoint = next_generation
5786
self.last_time_checkpoint = time.time()
5887

5988
def save_checkpoint(self, config, population, species_set, generation):
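
The effect of the relabeling in `end_generation` can be checked with a small simulation. This is a sketch only: `checkpoint_labels` is a hypothetical helper that mirrors the generation-interval arithmetic in the hunk above, ignores the time-based trigger, and assumes generations are numbered from 0.

```python
def checkpoint_labels(num_generations, generation_interval, new_semantics=True):
    """Return the filename suffixes a generation-interval checkpointer would emit.

    new_semantics=True mirrors the updated code: the label is the next
    generation to be evaluated (g + 1) and the last-checkpoint marker starts at 0.
    new_semantics=False mirrors the previous code: the label is the generation
    just evaluated (g) and the marker starts at -1.
    """
    last = 0 if new_semantics else -1
    labels = []
    for g in range(num_generations):
        label = g + 1 if new_semantics else g
        if label - last >= generation_interval:
            labels.append(label)
            last = label
    return labels


new_labels = checkpoint_labels(50, 5)
old_labels = checkpoint_labels(50, 5, new_semantics=False)
print(new_labels)
print(old_labels)
```

Under these assumptions, a 50-generation run with an interval of 5 yields suffixes 5, 10, ..., 50 with the new code and 4, 9, ..., 49 with the old code. In both cases the saved population is the same; only the label changes.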

neat/reproduction.py

Lines changed: 4 additions & 2 deletions
@@ -236,8 +236,10 @@ def reproduce(self, config, species, pop_size, generation):
236236
s.members = {}
237237
species.species[s.key] = s
238238

239-
# Sort members in order of descending fitness.
240-
old_members.sort(reverse=True, key=lambda x: x[1].fitness)
239+
# Sort members in order of descending fitness, with genome id as a
240+
# deterministic tie-breaker so that ordering (and thus parent
241+
# selection) is reproducible across runs and checkpoint restores.
242+
old_members.sort(reverse=True, key=lambda x: (x[1].fitness, x[0]))
241243

242244
# Transfer elites to new generation.
243245
if self.reproduction_config.elitism > 0:
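
Why the tie-breaker matters: Python's sort is stable, so when two genomes share a fitness value, a fitness-only key leaves them in whatever order they arrived. The sketch below uses `SimpleNamespace` objects as stand-ins for genomes and a hypothetical `rank` helper; neither is library code.

```python
from types import SimpleNamespace


def rank(members, tie_break):
    """Return genome ids in descending-fitness order, optionally tie-broken by id."""
    if tie_break:
        key = lambda m: (m[1].fitness, m[0])
    else:
        key = lambda m: m[1].fitness
    return [gid for gid, _ in sorted(members, reverse=True, key=key)]


# (genome id, genome) pairs; genomes 3 and 1 are tied on fitness.
members = [
    (3, SimpleNamespace(fitness=1.0)),
    (1, SimpleNamespace(fitness=1.0)),
    (2, SimpleNamespace(fitness=2.0)),
]
shuffled = list(reversed(members))  # same members, different arrival order

fitness_only = (rank(members, False), rank(shuffled, False))
with_tie_break = (rank(members, True), rank(shuffled, True))
print(fitness_only)
print(with_tie_break)
```

Note that because `reverse=True` applies to the whole key tuple, ties are resolved in favor of the higher genome id. The ordering is deterministic either way; only the direction of the tie-break is affected.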

neat/species.py

Lines changed: 10 additions & 4 deletions
@@ -79,11 +79,15 @@ def speciate(self, config, population, generation):
7979
compatibility_threshold = self.species_set_config.compatibility_threshold
8080

8181
# Find the best representatives for each existing species.
82-
unspeciated = set(population)
82+
# Use a deterministic ordering for unspeciated genomes so that
83+
# speciation is reproducible across runs and checkpoint restores.
84+
unspeciated = list(sorted(population.keys()))
8385
distances = GenomeDistanceCache(config.genome_config)
8486
new_representatives = {}
8587
new_members = {}
86-
for sid, s in self.species.items():
88+
# Iterate species in deterministic id order.
89+
for sid in sorted(self.species.keys()):
90+
s = self.species[sid]
8791
candidates = []
8892
for gid in unspeciated:
8993
g = population[gid]
@@ -98,8 +102,9 @@ def speciate(self, config, population, generation):
98102
unspeciated.remove(new_rid)
99103

100104
# Partition population into species based on genetic similarity.
105+
# Iterate remaining genomes in ascending id order for determinism.
101106
while unspeciated:
102-
gid = unspeciated.pop()
107+
gid = unspeciated.pop(0)
103108
g = population[gid]
104109

105110
# Find the species with the most similar representative.
@@ -122,7 +127,8 @@ def speciate(self, config, population, generation):
122127

123128
# Update species collection based on new speciation.
124129
self.genome_to_species = {}
125-
for sid, rid in new_representatives.items():
130+
for sid in sorted(new_representatives.keys()):
131+
rid = new_representatives[sid]
126132
s = self.species.get(sid)
127133
if s is None:
128134
s = Species(sid, generation)
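
Greedy speciation is order-sensitive, which is why the iteration order above matters. The following toy sketch assumes one-dimensional "genomes" compared by absolute difference; `toy_speciate` is a hypothetical simplification, not the library's `speciate` method. A genome joins the species with the closest representative under the threshold, or founds a new species otherwise.

```python
def toy_speciate(values, order, threshold):
    """Partition genome ids into species, visiting them in the given order."""
    representatives = []  # one representative genome id per species
    members = []          # member id lists, parallel to representatives
    for gid in order:
        best = None  # (distance, species index) of the closest compatible species
        for idx, rid in enumerate(representatives):
            d = abs(values[gid] - values[rid])
            if d < threshold and (best is None or d < best[0]):
                best = (d, idx)
        if best is None:
            representatives.append(gid)
            members.append([gid])
        else:
            members[best[1]].append(gid)
    return members


values = {1: 0.0, 2: 1.0, 3: 2.0}  # genome id -> toy genome value

ascending = toy_speciate(values, sorted(values), threshold=1.5)
other_order = toy_speciate(values, [2, 1, 3], threshold=1.5)
print(ascending)
print(other_order)
```

Visiting the same three genomes in a different order produces a different number of species, so an unordered `set.pop()` can make otherwise identical runs diverge. One design note: `list.pop(0)` is linear in the list length; for very large populations a `collections.deque` with `popleft()` would be a possible alternative, though the commit does not use one.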

neat/stagnation.py

Lines changed: 5 additions & 1 deletion
@@ -40,7 +40,11 @@ def update(self, species_set, generation):
4040
returns a list with stagnant species marked for removal.
4141
"""
4242
species_data = []
43-
for sid, s in species_set.species.items():
43+
# Iterate species in a deterministic order (by species id) so that
44+
# stagnation decisions are reproducible across runs and checkpoint
45+
# restores, independent of dictionary insertion order.
46+
for sid in sorted(species_set.species.keys()):
47+
s = species_set.species[sid]
4448
if s.fitness_history:
4549
prev_fitness = max(s.fitness_history)
4650
else:
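
The comment about dictionary insertion order can be made concrete with a minimal sketch. The two dictionaries below are illustrative stand-ins for `species_set.species`; how two runs might end up with different insertion orders is not specified by the commit and is simply assumed here.

```python
# Same keys and values, inserted in a different order.
first = {}
first[2] = "species-2"
first[1] = "species-1"

second = {}
second[1] = "species-1"
second[2] = "species-2"

plain_first = list(first)        # follows insertion order
plain_second = list(second)      # follows insertion order
sorted_first = sorted(first)     # independent of insertion order
sorted_second = sorted(second)   # independent of insertion order

print(plain_first, plain_second)
print(sorted_first, sorted_second)
```

The dictionaries compare equal, yet plain iteration visits their keys in different orders; iterating over `sorted(...)` removes that dependence.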
