@@ -62,8 +62,6 @@ tree formed by the model and update the parameters using the gradients.

There is also [`Optimisers.update!`](@ref) which similarly returns a new model and new state,
but is free to mutate arrays within the old one for efficiency.
- The method of `apply!` for each rule is likewise free to mutate arrays within its state;
- they are defensively copied when this rule is used with `update`.
(The method of `apply!` above is likewise free to mutate arrays within its state;
they are defensively copied when this rule is used with `update`.)
For `Adam()`, there are two momenta per parameter, thus `state` is about twice the size of `model`:
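To make the mutating versus non-mutating distinction concrete, here is a minimal sketch of the setup/update cycle. It assumes a tiny stand-in `model` and a hand-written `∇model` for illustration, rather than the Flux example from the full page:

```julia
using Optimisers

# Tiny stand-in model: any nested structure of arrays works (the field names are illustrative).
model = (weight = rand(3, 3), bias = zeros(3))

# For Adam, each parameter array gets two momentum arrays in its leaf of the state tree.
state = Optimisers.setup(Adam(), model)

# Hand-written gradient with the same structure as the model, standing in for a real one.
∇model = (weight = ones(3, 3), bias = fill(0.1, 3))

# Non-mutating: returns a new state and a new model, leaving the old ones untouched.
state, model = Optimisers.update(state, model, ∇model)

# Mutating variant: free to overwrite arrays inside the old model and state for efficiency.
state, model = Optimisers.update!(state, model, ∇model)
```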
@@ -87,17 +85,18 @@ Yota is another modern automatic differentiation package, an alternative to Zygo

Its main function is `Yota.grad`, which returns the loss as well as the gradient (like `Zygote.withgradient`)
but also returns a gradient component for the loss function.
- To extract what Optimisers.jl needs, you can write `_, (_, ∇model) = Yota.grad(f, model, data)`
- or, for the Flux model above:
+ To extract what Optimisers.jl needs, you can write (for the Flux model above):

```julia
using Yota

loss, (∇function, ∇model, ∇image) = Yota.grad(model, image) do m, x
-     sum(m(x))
+     sum(m(x))
end;
- ```

+ # Or else, this may save computing ∇image:
+ loss, (_, ∇model) = grad(m -> sum(m(image)), model);
+ ```

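Either way, it is the `∇model` component that gets handed to Optimisers.jl. A minimal sketch of that hand-off, assuming `state` was already built with `Optimisers.setup(Adam(), model)`:

```julia
# Assumes `state = Optimisers.setup(Adam(), model)` has been run beforehand.
state, model = Optimisers.update!(state, model, ∇model);
```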
## Usage with [Lux.jl](https://github.com/avik-pal/Lux.jl)
