Conversation

@machshev (Collaborator) commented Oct 29, 2025

These are partial steps towards the goal of converting all the deployment objects to Pydantic models, which would provide the first complete, clearly defined static interface within the flow. From there it's a lot easier to work backwards and refactor the rest of the flow.

Main steps in this PR:

  • Make all the deployment object attributes deterministic. Prior to this change, the cov_db_dirs field order changed semi-randomly, meaning it took manual review to determine whether the JSON file generated with DVSIM_DEPLOY_DUMP=true had changed due to reordering or an actual regression. Now a sha1 change is enough to warn of a breaking change.
  • Rename model_dump to dump: the Pydantic version contains intermediary fields and working data, while the dump script only exports the fields read by the scheduler, i.e. those that affect the jobs that run. Once all deployment objects have been refactored into Pydantic, we could probably go back to model_dump, or potentially have a single model of a deployable Job to feed to the scheduler and vary only the factory functions that generate them. Fundamentally, the output to the scheduler is little more than a "command to run" and environment variables to set in the subshell.
  • Add WorkspaceCfg to package up the values that were otherwise being read from the Deploy object. This means the Deploy object doesn't need to be passed around so much. This isn't necessarily the ideal end point and may disappear in later refactoring; however, for the moment it's a step towards clearer interfaces and separation between data and behaviour.
  • Initial CompileSim unit tests, capturing the construction of the key deployment outputs consumed by the scheduler. This provides a harness for later refactoring.
  • Various small non-functional improvements in the LSF launcher.

@machshev machshev force-pushed the refactor_scheduler branch 3 times, most recently from 484cbf3 to 551890f on October 30, 2025 at 15:23
@rswarbrick (Contributor) left a comment

I'm a big fan! This looks dramatically cleaner. Some small comments, but I'm really enthusiastic about how much more solid things seem to be getting.

context=[],
)

def __init__(self, deploy: "Deploy") -> None:
Contributor

Do we need the quotes here? I'd have thought that it only matters when we're type checking, in which case it's in scope?

Collaborator (Author)

I'll generally always go for the unquoted version by default. So if I've quoted them there will be a need.
There are generally two reasons for using the quoted version:

  • Circular references in imports (A --> B --> A) where one of the references is for typing only. The dependency loop can be broken by putting the import behind a typing.TYPE_CHECKING guard and quoting the type.
  • Typing of class methods that take or return an object of the class itself. In this case the class can't be used unquoted in the method args/return annotations, as the interpreter evaluates those before the class itself is defined. Not sure why it's that way; my guess is it's an optimisation thing, and making it smarter would require a second pass or something. So in this case the class name is quoted as well.
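The two cases above can be sketched together; the module and class names here are illustrative rather than the actual dvsim code:

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Imported for type checking only, so a runtime circular import is
    # avoided if the `deploy` module also imports this one.
    from deploy import Deploy


class Launcher:
    def __init__(self, deploy: "Deploy") -> None:
        # Quotes required: `Deploy` is never bound at runtime here.
        self.deploy = deploy

    @classmethod
    def clone(cls, other: "Launcher") -> "Launcher":
        # Quoted self-reference: `Launcher` is not yet bound while the
        # annotations in the class body are being evaluated.
        return cls(other.deploy)
```

(From Python 3.7 onwards, `from __future__ import annotations` makes all annotations lazily evaluated strings, which removes the need for the quotes in both cases.)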

Contributor

I agree with both of those things. My question was why the quotes are needed in this particular case.

But I think I've realised: even if the type annotation doesn't do anything at runtime, all of its symbols still need to be in scope.

Collaborator (Author)

Just checked: it's because the Deploy class module imports the Launcher for typing, creating a circular reference at module level. At runtime, the interpreter just errors.

Sort the coverage database paths, while leaving the first path in place.
This should mean that the dumped deployment objects are now
deterministic. At the moment, when comparing dumps, the only difference
is often the order of these paths, which adds unnecessary noise and
makes it more difficult to check for meaningful differences.

Signed-off-by: James McCorrie <[email protected]>
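The ordering rule in this commit can be sketched as follows; the helper name is illustrative, and the field it targets is the `cov_db_dirs` list mentioned in the PR description:

```python
def sort_cov_db_dirs(paths: list[str]) -> list[str]:
    """Sort coverage database paths, keeping the first entry in place.

    The first path is significant, so it is left where it is; the rest
    are sorted so that repeated dumps of the deployment objects compare
    equal byte-for-byte.
    """
    if len(paths) < 2:
        return list(paths)
    return [paths[0], *sorted(paths[1:])]
```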
The pydantic model_dump method contains a lot more information, so
rename to allow using these in parallel until all the Deploy classes
are migrated to pydantic.

Signed-off-by: James McCorrie <[email protected]>
Reduce dependency on sim_cfg by pulling out some of the data and putting
it on a new WorkspaceCfg pydantic object.

Signed-off-by: James McCorrie <[email protected]>
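A sketch of the shape this suggests, with illustrative field names; consumers take the small, typed WorkspaceCfg rather than the whole Deploy object:

```python
from pydantic import BaseModel


class WorkspaceCfg(BaseModel):
    """Values previously read off sim_cfg/Deploy (names illustrative)."""

    project: str
    scratch_path: str
    branch: str = ""


def launcher_workdir(workspace: WorkspaceCfg, job_name: str) -> str:
    # A consumer only needs the fields it uses, not the Deploy object.
    return f"{workspace.scratch_path}/{workspace.project}/{job_name}"
```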
Add a `new()` static method to allow tests to test against the desired
interface. The intention is to convert the deployment classes to
Pydantic typed data classes. The deployment classes seem to be builders
of fairly simple data classes once the construction process has been
completed. The constructors take the data in other formats and then
filter and transform to result in a new data class (deployment object).

Pydantic data class constructors take in the final data, so we need a
function to take the original data and do the transform. Having a
`new()` static method is a fairly standard convention in languages such
as Rust. For the moment this method just delegates to the existing
constructor. The `CompileSim` unit tests make use of this new method
instead of the constructor directly, which means they will be able to
test the new code as well as the old code for A/B testing, without
having to continually modify the tests.

Signed-off-by: James McCorrie <[email protected]>
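The `new()` convention can be sketched as below (written as a classmethod, the idiomatic Python form of the factory); class and argument names are illustrative, and the real constructor does far more filtering and transforming:

```python
class CompileSim:
    def __init__(self, build_mode, sim_cfg) -> None:
        # Existing constructor: filters and transforms its inputs.
        self.name = getattr(build_mode, "name", str(build_mode))
        self.sim_cfg_name = getattr(sim_cfg, "name", str(sim_cfg))

    @classmethod
    def new(cls, build_mode, sim_cfg) -> "CompileSim":
        # For now this just delegates; later it becomes the factory that
        # turns the raw inputs into a Pydantic data class's final fields.
        return cls(build_mode, sim_cfg)
```

Tests exercise `CompileSim.new(...)` only, so swapping the constructor for a Pydantic data class later needs no test changes.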
Migrate a dictionary to use sim_cfg names as keys rather than the config
object itself, which assumed the object was hashable and created
unnecessary references.

Signed-off-by: James McCorrie <[email protected]>
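The keying change can be sketched like this (the grouping helper and dict shapes are illustrative): the keys are plain strings, so the config object need not be hashable and the dict holds no extra references to it.

```python
from collections import defaultdict


def group_by_sim_cfg_name(deploys) -> dict[str, list]:
    # Key by the sim_cfg name (a str), not the sim_cfg object itself.
    grouped: dict[str, list] = defaultdict(list)
    for item in deploys:
        grouped[item["sim_cfg_name"]].append(item)
    return dict(grouped)
```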
@machshev machshev requested a review from rswarbrick October 30, 2025 17:47
@rswarbrick (Contributor) left a comment

Whoah! Please could you apply the changes requested by the review to the commits they were talking about? Don't forget that we don't use squash-merging.

Also, it would be really helpful if you could respond to a review comment (or add a thumbs up, or not bother saying anything specifically on the comment) BUT NOT MARK IT RESOLVED. Otherwise the poor reviewer has to manually wade through all the resolved comments and look to figure out whether the requested change had actually been made. It's rather inefficient.

With a big long list of comments on a PR of mine, my approach is usually:

  • Hack away in Emacs, using force push locally and interactive rebase to address the changes the reviewer suggested.
  • If they are talking nonsense, reply to the review comment in question. ("No! That's a terrible idea!" :-D)
  • (Ideally) run the tests again locally to check that I didn't just break compilation with a typo.
  • Force-push the new version.
  • Respond to the reviewer with a "I think your review is beautiful" (or something more honest)
  • Re-request review.

If the changes are small, the reviewer can then hit the "Compare" button next to my force-push to see what I changed. To make that work, it's helpful not to rebase onto master (because then the comparison picks up all of those changes). If I really want to rebase onto master, I'm sometimes really enthusiastic and do it in two stages:

  • Rebase onto master.
  • Force-push and leave a comment on the PR saying "No functional change: I've just rebased."
  • Make whatever tweak I was trying to make.
  • Force-push again, and leave a comment on the PR saying "I did something this time"

(Of course, I normally forget to do it in that order. You can totally do it the other way around too, and make the tweak first :-D)

@machshev (Collaborator, Author)

Thanks @rswarbrick
Sure, I'll try and remember to leave comments unresolved in future. I've tended to use resolving as a checklist to track which issues I've addressed: closing the trivial ones (spelling fixes and other recommendations that were just applied) and leaving open the issues where there is discussion.

Normally I'd fix up the commits where the issues were introduced. In this case, though, the first couple of commits were also in a separate PR with some urgent bug fixes (to fix the nightly runner, which is now using the latest pypi python package; if this passes we can pin the weekly to a known-good version). Unfortunately I'd rebased this PR on that one while debugging earlier in the day; in the meantime the other PR has been merged (the nightly seems to be running okay so far), and then you reviewed before I'd had a chance to rebase on master. Hence the extra commit.

Admittedly I was lazy and didn't check all the changes to see if they needed a new commit or not. I can take another look in the morning and see if I can tidy it up a bit.
