Refactor scheduler and add tests for CompileSim deployment #45
Conversation
Force-pushed from 484cbf3 to 551890f
I'm a big fan! This looks dramatically cleaner. Some small comments, but I'm really enthusiastic about how much more solid things seem to be getting.
Comment on `def __init__(self, deploy: "Deploy") -> None:`
Do we need the quotes here? I'd have thought that it only matters when we're type checking, in which case it's in scope?
I'll generally always go for the unquoted version by default, so if I've quoted them there will be a reason.
There are generally two reasons for using the quoted version:
- Circular references in imports: A --> B --> A, where one of those references is for typing only. The dependency loop can be broken by using a `typing.TYPE_CHECKING` guard on the import and quoting the type.
- Typing of class methods that take an object of the class itself. In this case the class can't be used in the method's argument/return annotations, as the interpreter evaluates those before the class itself exists. Not sure why it's that way; my guess is it's an optimisation thing and making it smarter would require a second pass or something. So in this case the class name would be quoted as well.
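The two cases can be sketched roughly as follows (the module and class names here are hypothetical illustrations, not the real dvsim code):

```python
from typing import TYPE_CHECKING

# Case 1: break an import cycle. This import runs only under the type
# checker, never at runtime, so a launcher <-> deploy import cycle is
# avoided.
if TYPE_CHECKING:
    from deploy import Deploy  # hypothetical module name


class Launcher:
    def __init__(self, deploy: "Deploy") -> None:
        # Quoted: at runtime the name Deploy is not bound in this module.
        self.deploy = deploy


# Case 2: annotate a method with its own class. "Node" must be quoted
# because the class object does not exist yet when the method signature
# is evaluated.
class Node:
    def clone(self) -> "Node":
        return Node()
```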
 
I agree with both of those things. My question was why the quotes are needed in this particular case.
But I think I've realised: even if the type annotation doesn't do anything at runtime, all of its symbols still need to be in scope.
Just checked, it's because the Deploy class module imports the Launcher for typing, creating a circular reference at module level at runtime. The interpreter just errors.
Sort the coverage database paths, while leaving the first path in place. This should mean that the dumped deployment objects are now deterministic. At the moment when comparing dumps often the only difference is the order of these paths, adding unnecessary noise and making it more difficult to check for meaningful differences. Signed-off-by: James McCorrie <[email protected]>
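A minimal sketch of that sorting rule (the helper name is hypothetical; the real change presumably sorts the field in place on the deployment object):

```python
def sort_cov_db_dirs(paths: list[str]) -> list[str]:
    """Sort coverage database paths, leaving the first path in place.

    The first entry is assumed to be significant (e.g. the primary
    database), so only the remainder is sorted, making repeated dumps
    of the deployment object deterministic.
    """
    if len(paths) <= 1:
        return list(paths)
    return [paths[0]] + sorted(paths[1:])
```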
The pydantic model_dump method contains a lot more information, so rename to allow using these in parallel until all the Deploy classes are migrated to pydantic. Signed-off-by: James McCorrie <[email protected]>
Signed-off-by: James McCorrie <[email protected]>
Reduce dependency on sim_cfg by pulling out some of the data and putting it on a new WorkspaceCfg pydantic object. Signed-off-by: James McCorrie <[email protected]>
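A rough sketch of the `WorkspaceCfg` idea, assuming Pydantic is available; the field names and the helper are illustrative, not the real dvsim attributes:

```python
from pydantic import BaseModel


class WorkspaceCfg(BaseModel):
    # Hypothetical fields previously read off sim_cfg / Deploy.
    proj_root: str     # repository root
    scratch_path: str  # scratch area for build/run outputs
    branch: str        # branch name used to namespace scratch dirs


def make_run_dir(ws: WorkspaceCfg, job_name: str) -> str:
    # Consumers take the small WorkspaceCfg instead of the whole
    # Deploy/sim_cfg object, shrinking the interface they depend on.
    return f"{ws.scratch_path}/{ws.branch}/{job_name}"
```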
Add a `new()` static method to allow tests to test against the desired interface. The intention is to convert the deployment classes to Pydantic typed data classes. The deployment classes seem to be builders of fairly simple data classes once the construction process has been completed. The constructors take the data in other formats and then filter and transform it to produce a new data class (deployment object). Pydantic data class constructors take in the final data, so we need a function to take the original data and do the transform. Having a `new()` static method is a fairly standard convention in languages such as Rust. For the moment this method just delegates to the existing constructor. The `CompileSim` unit tests use this new method instead of the constructor directly, which means they will be able to test the new code as well as the old code for A/B testing, without having to continually modify the tests. Signed-off-by: James McCorrie <[email protected]>
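A sketch of the `new()` convention under these assumptions (the class body and parameter names are illustrative, not the real `CompileSim` signature):

```python
class CompileSim:
    def __init__(self, build_mode: str, sim_cfg_name: str) -> None:
        self.build_mode = build_mode
        self.sim_cfg_name = sim_cfg_name

    @staticmethod
    def new(build_mode: str, sim_cfg_name: str) -> "CompileSim":
        # For the moment this simply forwards to the constructor; later
        # it can transform the raw inputs and call a typed Pydantic
        # constructor instead, while tests written against new() keep
        # working unchanged.
        return CompileSim(build_mode=build_mode, sim_cfg_name=sim_cfg_name)
```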
Migrate a dictionary to use sim_cfg names as keys rather than the config object itself, which assumes the object is hashable and creates unnecessary references. Signed-off-by: James McCorrie <[email protected]>
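The key change can be illustrated as follows (class and variable names are hypothetical):

```python
class SimCfg:
    # Stand-in for the config object; note it need not be hashable.
    def __init__(self, name: str) -> None:
        self.name = name


cfgs = [SimCfg("uart_sim"), SimCfg("gpio_sim")]

# Before (assumed shape): {cfg: jobs} -- requires SimCfg to be hashable
# and keeps a reference to the whole object alive via the dict key.
# After: key by the config's name, a plain string.
jobs_by_cfg_name = {cfg.name: [] for cfg in cfgs}
```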
Force-pushed from 551890f to 46e11f4
    Signed-off-by: James McCorrie <[email protected]>
Whoah! Please could you apply the changes requested by the review to the commits they were talking about? Don't forget that we don't use squash-merging.
Also, it would be really helpful if you could respond to a review comment (or add a thumbs up, or not bother saying anything specifically on the comment) BUT NOT MARK IT RESOLVED. Otherwise the poor reviewer has to manually wade through all the resolved comments and look to figure out whether the requested change had actually been made. It's rather inefficient.
With a big long list of comments on a PR of mine, my approach is usually:
- Hack away in Emacs, using force push locally and interactive rebase to address the changes the reviewer suggested.
 - If they are talking nonsense, reply to the review comment in question. ("No! That's a terrible idea!" :-D)
 - (Ideally) run the tests again locally to check that I didn't just break compilation with a typo.
 - Force-push the new version.
 - Respond to the reviewer with an "I think your review is beautiful" (or something more honest)
 - Re-request review.
 
If the changes are small, the reviewer can then hit the "Compare" button next to my force-push to see what I changed. To make that work, it's helpful not to rebase onto master (because then the comparison picks up all of those changes). If I really want to rebase onto master, I'm sometimes really enthusiastic and do it in two stages:
- Rebase onto master.
 - Force-push and leave a comment on the PR saying "No functional change: I've just rebased."
 - Make whatever tweak I was trying to make.
 - Force-push again, and leave a comment on the PR saying "I did something this time"
 
(Of course, I normally forget to do it in that order. You can totally do it the other way around too, and make the tweak first :-D)
Thanks @rswarbrick. Normally I'd fix up the commits where the issues were introduced. In this case though, the first couple of commits were also in a separate PR with some urgent bug fixes (to fix the nightly runner, which is now using the latest PyPI Python package; if this passes we can pin the weekly to a known-good version). Unfortunately I'd rebased this PR on that one while debugging earlier in the day; in the meantime the other PR has been merged (the nightly seems to be running okay so far), and then you reviewed before I'd had a chance to rebase on master. Hence the extra commit. Admittedly I was lazy and didn't check all the changes to see whether they needed a new commit or not. I can take another look in the morning and see if I can tidy it up a bit.
These are partial steps towards the goal of converting all the deployment objects to Pydantic models - which would provide the first complete and clear statically defined interface within the flow. From there it's a lot easier to work backwards and refactor the rest of the flow.
Main steps in this PR:
- Sorted the `cov_db_dirs` field: its order changed semi-randomly, meaning it takes manual review to determine whether the JSON file generated with `DVSIM_DEPLOY_DUMP=true` changed due to reordering or an actual regression. Now a sha1 change is enough to warn of a breaking change.
- Renamed `model_dump` to `dump`, as the pydantic version contains intermediary fields and working data, while the dump script only exports the fields read by the scheduler and so affecting the jobs that run. Once all deployment objects have been refactored into Pydantic, we could probably go back to `model_dump`, or potentially have a single model of a deployable Job to feed to the scheduler, varying only the factory functions that generate them. Fundamentally the output to the scheduler is little more than a "command to run" and environment variables to set in the subshell.
- Added `WorkspaceCfg` to package up the values that were otherwise being read from the `Deploy` object. This means the `Deploy` object doesn't need to be passed around so much. This isn't necessarily the ideal end point and may disappear in later refactoring; however, for the moment it's a step towards clearer interfaces and separation between data and behaviour.