-
-
Notifications
You must be signed in to change notification settings - Fork 25
Mark OptimiserChain as @functor and improve inference for apply!
#115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
… `apply!(o::OptimiserChain, ...)`
ToucheSir
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty reasonable to me. I recall we had some discussion about OptimiserChain's using a vector for state instead of a tuple, but not the conclusion. @darsnack might know.
| julia> st2 = Optimisers.setup(OptimiserChain(ClipGrad(), Adam()), m) | ||
| (vec = Leaf(OptimiserChain(ClipGrad{Float32}(10.0), Adam{Float32}(0.001, (0.9, 0.999), 1.19209f-7)), [nothing, (Float32[0.0, 0.0], Float32[0.0, 0.0], (0.9, 0.999))]), fun = nothing) | ||
| (vec = Leaf(OptimiserChain(ClipGrad{Float32}(10.0), Adam{Float32}(0.001, (0.9, 0.999), 1.19209f-7)), (nothing, (Float32[0.0, 0.0], Float32[0.0, 0.0], (0.9, 0.999)))), fun = ()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't have expected fun = nothing to change to fun = () with this PR. Is this the output you see locally? If so, can you change this codeblock to a jldoctest so that we catch changes like this in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed by #106 I think.
I guess we just need to remember the syntax for naming a doctest block, so that this one uses m, st from the one before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that change does not arise from this PR, sorry for the confusion. I changed nothing to () for consistency with the doctest above it; I can lookup the syntax to make that block into a proper doctest, as well @mcabbott.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed now.
mcabbott
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should do this, I think.
I noticed that
OptimiserChainwas not marked@functor; this PR marks it as such to allow one to modify the rules internal toOptimiserChainviafmap.I also modified the optimiser chain state to be a
Tupleinstead of anAbstractArrayfor better type inference inapply!(o::OptimiserChain, ...). If this is considered breaking/undesired I can remove it from the PR. Example of the improved inference:Before:
After: