Error with nested calls to TaylorDiff #99

@landreman

Hi, I have the following nested AD problem that I'd like to get working with TaylorDiff. Start with a vector-valued function f(params, x) of a scalar x, take a high-order derivative with respect to x, and evaluate it at a specific value of x. Then apply a reduction to the result to obtain a scalar-valued function g(params). Finally, I want to evaluate the gradient d g / d params. Example:

using TaylorDiff

# Arbitrary function:
f(params, x) = [params[1] * x^3 + params[2], params[2] * sin(x - params[1]), sqrt(x + params[2])]

function g(params)
    closure(x) = f(params, x)
    some_x = 0.7
    d3f_dx3 = TaylorDiff.derivative(closure, some_x, Val(3))
    return sum(d3f_dx3)
end

some_params = [1.3, 2.1]

@show g(some_params)  # Fine, gives 6.095380076578732
TaylorDiff.derivative(g, some_params, [1.0, 0.0], Val(1))  # First element of the gradient; this call throws

Results, using Julia 1.11.5 and TaylorDiff v0.3.3:

ERROR: MethodError: *(::TaylorScalar{Float64, 1}, ::TaylorScalar{Float64, 3}) is ambiguous.

Candidates:
  *(a::TaylorScalar, b::Number)
    @ TaylorDiff ~/.julia/packages/TaylorDiff/qw5aY/src/primitive.jl:119
  *(a::Number, b::TaylorScalar)
    @ TaylorDiff ~/.julia/packages/TaylorDiff/qw5aY/src/primitive.jl:114

Possible fix, define
  *(::TaylorScalar, ::TaylorScalar)

Stacktrace:
 [1] f(params::Vector{TaylorScalar{Float64, 1}}, x::TaylorScalar{Float64, 3})
   @ Main ./REPL[4]:1
 [2] (::var"#closure#1"{Vector{TaylorScalar{Float64, 1}}})(x::TaylorScalar{Float64, 3})
   @ Main ./REPL[5]:2
 [3] derivatives
   @ ~/.julia/packages/TaylorDiff/qw5aY/src/derivative.jl:41 [inlined]
 [4] derivative
   @ ~/.julia/packages/TaylorDiff/qw5aY/src/derivative.jl:16 [inlined]
 [5] g(params_in::Vector{TaylorScalar{Float64, 1}})
   @ Main ./REPL[5]:4
 [6] derivatives
   @ ~/.julia/packages/TaylorDiff/qw5aY/src/derivative.jl:41 [inlined]
 [7] derivative(f::Function, x::Vector{Float64}, l::Vector{Float64}, p::Val{1})
   @ TaylorDiff ~/.julia/packages/TaylorDiff/qw5aY/src/derivative.jl:17
 [8] top-level scope
   @ REPL[8]:1

Any idea how this could be made to work?

Zygote-over-TaylorDiff does work for this problem, but @btime shows that ForwardDiff-over-ForwardDiff is much faster, probably because of the overhead of reverse mode. I would therefore expect TaylorDiff-over-TaylorDiff (or ForwardDiff-over-TaylorDiff) to be faster still, since the inner derivative is high-order. Thanks.
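
For comparison, a ForwardDiff-over-ForwardDiff version of the same computation could look roughly like the sketch below. It assumes the third derivative is built by nesting ForwardDiff.derivative three times; the helper names d1, d3, and g_fd are made up for illustration and are not part of either package, and this is not necessarily the exact code that was benchmarked.

```julia
using ForwardDiff

# Same arbitrary function as above.
f(params, x) = [params[1] * x^3 + params[2], params[2] * sin(x - params[1]), sqrt(x + params[2])]

# Hypothetical helpers: d1 maps h to its first derivative, and d3 applies it
# three times to get the third derivative of a scalar-argument function.
d1(h) = x -> ForwardDiff.derivative(h, x)
d3(h, x) = d1(d1(d1(h)))(x)

function g_fd(params)
    closure(x) = f(params, x)
    some_x = 0.7
    return sum(d3(closure, some_x))
end

some_params = [1.3, 2.1]

@show g_fd(some_params)                          # should agree with g(some_params)
grad = ForwardDiff.gradient(g_fd, some_params)   # outer derivative with respect to params
@show grad
```

Here the outer dual numbers carried by params are captured in the closure and combined with the inner dual numbers carried by x, which is the same mixing of two differentiation levels that triggers the ambiguity in the TaylorDiff case.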
