@@ -10,21 +10,20 @@ In Flux's convention, the target is the last argument:
 loss(ŷ, y)
 ```
 
-All loss functions have a method which takes the model as the first argument, and calculates the prediction `ŷ = model(x)`.
+All loss functions in Flux have a method which takes the model as the first argument, and calculates the prediction `ŷ = model(x)`.
 This is convenient for [`train!`](@ref Flux.train)`(loss, model, [(x,y), (x2,y2), ...], opt)`:
 
 ```julia
 loss(model, x, y) = loss(model(x), y)
 ```
 
-Most loss functions in Flux have an optional argument `agg`, denoting the type of aggregation performed over the
-batch:
+Most loss functions in Flux have an optional keyword argument `agg`, which is the aggregation function used over the batch:
 
 ```julia
-loss(ŷ, y)                         # defaults to `mean`
-loss(ŷ, y, agg=sum)                # use `sum` instead
-loss(ŷ, y, agg=x->mean(w .* x))    # weighted mean
-loss(ŷ, y, agg=x->sum(x, dims=2))  # partial reduction, returns an array
+loss(ŷ, y)                           # defaults to `Statistics.mean`
+loss(ŷ, y; agg = sum)                # use `sum` instead
+loss(ŷ, y; agg = x->mean(w .* x))    # weighted mean
+loss(ŷ, y; agg = x->sum(x, dims=2))  # partial reduction, returns an array
 ```
 
 ### Function listing
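For readers who want to try the `agg` keyword convention from this hunk without installing Flux, here is a minimal sketch using a hand-written loss. The names `my_mse` and `w`, and the sample arrays, are hypothetical and only illustrate the calling convention. Flux's own loss functions may differ in detail.

```julia
using Statistics: mean

# Hypothetical stand-in for a Flux-style loss (not part of Flux):
# elementwise squared error, reduced by the function passed as `agg`.
my_mse(ŷ, y; agg = mean) = agg(abs2.(ŷ .- y))

# Model-first method, mirroring `loss(model, x, y) = loss(model(x), y)`.
my_mse(model, x, y; agg = mean) = my_mse(model(x), y; agg = agg)

ŷ = [1.0 2.0; 3.0 4.0]   # predictions
y = [1.0 0.0; 0.0 4.0]   # targets
w = [0.5 0.5; 1.0 1.0]   # hypothetical per-element weights

my_mse(ŷ, y)                               # default aggregation: mean
my_mse(ŷ, y; agg = sum)                    # use `sum` instead
my_mse(ŷ, y; agg = x -> mean(w .* x))      # weighted mean
my_mse(ŷ, y; agg = x -> sum(x, dims = 2))  # partial reduction, returns a 2×1 array
my_mse(identity, ŷ, y)                     # model-first method, with `identity` as the model
```

Because `agg` is just a function applied to the array of per-element losses, any reduction with that shape of input can be passed in, which is what the revised wording ("the aggregation function used over the batch") describes.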