VarNamedTuple, with an application for FastLDF #1150
Conversation
* Fast Log Density Function
* Make it work with AD
* Optimise performance for identity VarNames
* Mark `get_range_and_linked` as having zero derivative
* Update comment
* make AD testing / benchmarking use FastLDF
* Fix tests
* Optimise away `make_evaluate_args_and_kwargs`
* const func annotation
* Disable benchmarks on non-typed-Metadata-VarInfo
* Fix `_evaluate!!` correctly to handle submodels
* Actually fix submodel evaluate
* Document thoroughly and organise code
* Support more VarInfos, make it thread-safe (?)
* fix bug in parsing ranges from metadata/VNV
* Fix get_param_eltype for TSVI
* Disable Enzyme benchmark
* Don't override _evaluate!!, that breaks ForwardDiff (sometimes)
* Move FastLDF to experimental for now
* Fix imports, add tests, etc
* More test fixes
* Fix imports / tests
* Remove AbstractFastEvalContext
* Changelog and patch bump
* Add correctness tests, fix imports
* Concretise parameter vector in tests
* Add zero-allocation tests
* Add Chairmarks as test dep
* Disable allocations tests on multi-threaded
* Fast InitContext (#1125)
* Make InitContext work with OnlyAccsVarInfo
* Do not convert NamedTuple to Dict
* remove logging
* Enable InitFromPrior and InitFromUniform too
* Fix `infer_nested_eltype` invocation
* Refactor FastLDF to use InitContext
* note init breaking change
* fix logjac sign
* workaround Mooncake segfault
* fix changelog too
* Fix get_param_eltype for context stacks
* Add a test for threaded observe
* Export init
* Remove dead code
* fix transforms for pathological distributions
* Tidy up loads of things
* fix typed_identity spelling
* fix definition order
* Improve docstrings
* Remove stray comment
* export get_param_eltype (unfortunatley)
* Add more comment
* Update comment
* Remove inlines, fix OAVI docstring
* Improve docstrings
* Simplify InitFromParams constructor
* Replace map(identity, x[:]) with [i for i in x[:]]
* Simplify implementation for InitContext/OAVI
* Add another model to allocation tests

Co-authored-by: Markus Hauru <[email protected]>

* Revert removal of dist argument (oops)
* Format
* Update some outdated bits of FastLDF docstring
* remove underscores

Co-authored-by: Markus Hauru <[email protected]>

* print output
* fix
* reenable
* add more lines to guide the eye
* reorder table
* print tgrad / trel as well
* forgot this type
Benchmark Report

Computer Information | Benchmark Results
Codecov Report

❌ Patch coverage is

Additional details and impacted files

```
@@            Coverage Diff             @@
##           breaking    #1150    +/-   ##
============================================
- Coverage     80.66%   77.61%    -3.05%
============================================
  Files            41       42        +1
  Lines          3878     4154      +276
============================================
+ Hits           3128     3224       +96
- Misses          750      930      +180
============================================
```
It looks to me that the 1.11 perf is only a lot worse on the trivial model. In my experience (I ran into this exact issue with Enzyme once, see also https://github.com/TuringLang/DynamicPPL.jl/pull/877/files), trivial models with 1 variable can be quite susceptible to changes in inlining strategy. It may be that a judicious
… and also `bundle_samples` (#1129)

* Implement `ParamsWithStats` for `FastLDF`
* Add comments
* Implement `bundle_samples` for ParamsWithStats -> MCMCChains
* Remove redundant comment
* don't need Statistics?
* Make FastLDF the default
* Add miscellaneous LogDensityProblems tests
* Use `init!!` instead of `fast_evaluate!!`
* Rename files, rebalance tests
…nked (#1141)

* Improve type stability when all parameters are linked or unlinked
* fix a merge conflict
* fix enzyme gc crash (locally at least)
* Fixes from review
```julia
# TODO(mhauru) Might add another specialisation to _compose_no_identity, where if
# ReshapeTransforms are composed with each other or with an UnwrapSingeltonTransform, only
# the latter one would be kept.
"""
    _compose_no_identity(f, g)

Like `f ∘ g`, but if `f` or `g` is `identity` it is omitted.

This helps avoid trivial cases of `ComposedFunction` that would cause unnecessary type
conflicts.
"""
_compose_no_identity(f, g) = f ∘ g
_compose_no_identity(::typeof(identity), g) = g
_compose_no_identity(f, ::typeof(identity)) = f
_compose_no_identity(::typeof(identity), ::typeof(identity)) = identity
```
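The identity-elision idea is language-agnostic; here is a minimal Python sketch of it (hypothetical stand-in, not the DynamicPPL code — Julia does this with multiple dispatch on `typeof(identity)`, while Python has to compare against the identity function explicitly):

```python
def identity(x):
    return x

def compose_no_identity(f, g):
    """Compose f and g, but drop either side if it is `identity`,
    so trivial wrapper closures never pile up."""
    if f is identity:
        return g  # also covers the identity/identity case: g is returned as-is
    if g is identity:
        return f
    return lambda x: f(g(x))
```

For example, `compose_no_identity(abs, identity)` returns `abs` itself rather than a wrapper, which is exactly the property that keeps the Julia version from accumulating `ComposedFunction` layers.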
This has been moved here from varnamedvector.jl, verbatim.
Performance on v1.11 has been fixed, and many other improvements have been made. Things that remain unfinished:

This probably isn't ready to merge due to the aforementioned limitations, but fixing them will mean adding things rather than modifying things, compared to what is in this PR, so I think now is a good time for a first review.
penelopeysm left a comment
I just happened to be reading the docs, will leave the code for next time - but the design sounds good!
DynamicPPL.jl documentation for PR #1150 is available at:
Co-authored-by: Penelope Yong <[email protected]>
Oh, forgot to mention: I'm up for reconsidering the name. Given the role of
Opinions welcome.
```julia
_haskey(arr::AbstractArray, optic::IndexLens) = _haskey(arr, optic.indices)
_haskey(arr::AbstractArray, inds) = checkbounds(Bool, arr, inds...)
```
Suggested change:

```julia
_haskey(arr::AbstractArray, optic::IndexLens) = _hasindices(arr, optic.indices)
_hasindices(arr::AbstractArray, inds) = checkbounds(Bool, arr, inds...)
```
I would prefer different function names for different signatures!
```julia
data::Array{ElType,num_dims}
mask::Array{Bool,num_dims}

function PartialArray(
    data::Array{ElType,num_dims}, mask::Array{Bool,num_dims}
) where {ElType,num_dims}
    if size(data) != size(mask)
        throw(ArgumentError("Data and mask arrays must have the same size"))
    end
    return new{ElType,num_dims}(data, mask)
end
```
Do we want FixedSizeArrays, or too much faff?
```julia
PartialArray{Int64,2}((1, 2) => 5, (3, 4) => 10)
```

> The optional keywoard argument `min_size` can be used to specify the minimum initial size.
Suggested change:

> The optional keyword argument `min_size` can be used to specify the minimum initial size.
| """Take the minimum size that a dimension of a PartialArray needs to be, and return the size | ||
| we choose it to be. This size will be the smallest possible power of | ||
| PARTIAL_ARRAY_DIM_GROWTH_FACTOR. Growing PartialArrays in big jumps like this helps reduce | ||
| data copying, as resizes aren't needed as often. | ||
| """ | ||
| function _partial_array_dim_size(min_dim) | ||
| factor = PARTIAL_ARRAY_DIM_GROWTH_FACTOR | ||
| return factor^(Int(ceil(log(factor, min_dim)))) | ||
| end |
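The sizing rule is just "round the requested dimension up to the nearest power of the growth factor". A Python sketch (hypothetical stand-in; the factor value of 4 is assumed here for illustration, taken from the discussion below, not from the diff):

```python
import math

def partial_array_dim_size(min_dim, factor=4):
    """Round min_dim up to the smallest power of `factor` that is >= min_dim.
    Mirrors the factor^ceil(log_factor(min_dim)) rule sketched above."""
    return factor ** math.ceil(math.log(min_dim, factor))

# Minimum sizes 1..20 all land on one of the powers 1, 4, 16, 64, ...
```

So a requested dimension of 5 gets a backing dimension of 16, and the array only needs to be reallocated when a request crosses the next power.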
Is this better for performance? Would it also be equally OK to make it min_dim to begin with but still scale by 4 each time, or is it just magically faster whenever the size is always a power of 4?
> resized in exponentially increasing steps. This means that most `setindex!!` calls are very
> fast, but some may incur substantial overhead due to resizing and copying data. It also
although the cost of setindex!! is still O(1) amortised...!
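The amortised-O(1) claim can be checked with a quick standalone simulation (a hypothetical sketch, not DynamicPPL code): with geometric growth, the total number of elements ever copied during resizes stays below the number of writes, so the average cost per write is constant.

```python
def simulate_appends(n, factor=4):
    """Append n items into a geometrically grown buffer and count how many
    elements get copied during resizes along the way."""
    capacity, copied = 1, 0
    for size in range(1, n + 1):
        if size > capacity:
            copied += capacity  # copy the existing elements into a bigger buffer
            while capacity < size:
                capacity *= factor
    return copied
```

For 1000 appends with a growth factor of 4, resizes copy 1 + 4 + 16 + 64 + 256 = 341 elements in total, i.e. well under one copy per write on average, even though individual resizing writes are expensive.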
> heterogeneous data under different indices of the same symbol. That is, if one either
> * sets `a[1]` and `a[2]` to be of different types, or
> * sets `a[1].b` and `a[2].c`, without setting `a[1].c`. or `a[2].b`,
Suggested change:

> * sets `a[1].b` and `a[2].c`, without setting `a[1].c` and `a[2].b`,
Though I think this could be simplified to just
if `a[1]` and `a[2]` both exist, sets `a[1].b` without also setting `a[2].b`
> Convert a `VarName` to an `Accessor` lens, wrapping the first symdol in a `PropertyLens`.
> This is used to simplify method dispatch for `_getindx`, `_setindex!!`, and `_haskey`, by
> considering `VarName`s to just be a special case of lenses.
Suggested change:

> Convert a `VarName` to an `Accessor` lens, wrapping the first symbol in a `PropertyLens`.
> This is used to simplify method dispatch for `_getindex`, `_setindex!!`, and `_haskey`, by
> considering `VarName`s to just be a special case of lenses.
```julia
# return VarNamedTuple(_setindex!!(vnt.data, value, S))
# but that seems to be type unstable. Why? Shouldn't it obviously be the same as the
# below?
```
maybe because the symbol S is no longer represented at the type level?
```julia
val_expr = if name in names1 && name in names2
    :(_merge_recursive(vnt1.data[$(QuoteNode(name))], vnt2.data[$(QuoteNode(name))]))
```
Suggested change:

```julia
val_expr = if name in names1 && name in names2
    :(_merge_recursive(vnt1.data.$name, vnt2.data.$name))
```
Don't Quote(Node) me on this, but I think this should be fine since both vnt1.data and vnt2.data are NamedTuples. Same for the lines below.
```julia
# TODO(mhauru) Should this return tuples, like it does now? That makes sense for
# VarNamedTuple itself, but if there is a nested PartialArray the tuple might get very big.
# Also, this is not very type stable, it fails even in basic cases. A generated function
```
Personally I would be kind of inclined to just let it return a vector. I don't know if `keys` is used in performance-sensitive places.
I decided that rather than take over VarInfo like in #1074, the first use case of VarNamedTuple should be replacing the NamedTuple/Dict combo in FastLDF. That's what this PR does.
This is still work in progress:
* Colons in `VarName`s.

However, tests seem to pass, so I'm putting this up. I ran the familiar FastLDF benchmarks from #1132, adapted a bit. Source code:
Results on Julia v1.12:
Same thing but in Julia v1.11:
So on 1.12 all looks good: this is a bit faster than the old version, and substantially faster when there are a lot of IndexLenses, as it should be. On 1.11 performance is destroyed, probably because type inference fails or gives up, and I need to fix that.
The main point of this PR is not performance, but having a general data structure for storing information keyed by VarNames, so I'm happy as long as performance doesn't degrade. Next up would be using this same data structure for ConditionContext (hoping to fix #1148), ValuesAsInModelAcc, maybe some other Accumulators, InitFromParams, GibbsContext, and finally to implement an AbstractVarInfo type.
I'll update the docs page with more information about what the current design is that I've implemented, but the one sentence summary is that it's nested NamedTuples, and then whenever we meet IndexLenses, it's an Array for the values together with a mask-Array that marks which values are valid values and which are just placeholders.
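That one-sentence design can be illustrated with a small Python stand-in (a hedged sketch of the idea only — the names `MaskedSlots` and `set`/`get` are hypothetical and nothing like this appears in the PR): nested mappings handle property access, and wherever integer indexing appears, values live in a flat array paired with a boolean mask that distinguishes real values from placeholders.

```python
class MaskedSlots:
    """Value storage for indexed access: a data list plus a parallel mask
    recording which slots hold real values and which are placeholders."""

    def __init__(self, size=0):
        self.data = [None] * size
        self.mask = [False] * size

    def set(self, i, value):
        if i >= len(self.data):  # grow, padding with masked-out placeholders
            extra = i + 1 - len(self.data)
            self.data += [None] * extra
            self.mask += [False] * extra
        self.data[i] = value
        self.mask[i] = True

    def get(self, i):
        if not self.mask[i]:
            raise KeyError(f"slot {i} was never set")
        return self.data[i]

# Nested mapping for symbols, MaskedSlots once indexing appears:
store = {"a": MaskedSlots()}
store["a"].set(2, 1.5)  # roughly "set a[3]" (this sketch is 0-based)
```

The mask is what lets the structure hand out dense storage eagerly while still distinguishing "never set" from "set to some value", which is the role the mask-Array plays next to the value Array in the actual design.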
I think I know how to fix all the current shortcomings, except for colons in `VarName`s. Setting a value in a VNT with a `Colon` could be done, but getting seems ill-defined, at least without providing further information about the size the value should be.

cc @penelopeysm, though this isn't ready for reviews yet.