Experimental: Asynchronous kernels #3402

Open
bendudson wants to merge 1 commit into next from lazy-multi-kernels

@bendudson

Contributor

Gather kernels using an `eval_into(result, expression)` builder
pattern. The kernels can be streamed asynchronously or merged
into one large kernel.
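
As a rough illustration of the builder pattern described above, the sketch below queues `eval_into(result, expression)` calls and then either runs each kernel in its own loop or fuses them into a single loop. It is not the implementation in this PR: `KernelBatch`, `run_separately`, and `run_merged` are hypothetical names, plain `std::vector<double>` stands in for the field types, and expressions are simple index-based callables rather than expression templates. Asynchronous streaming is left out, so the separate kernels simply run one after another.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a batch of deferred kernels.
class KernelBatch {
public:
  // Queue "result[i] = expr(i)" without evaluating it yet.
  // Returns *this so that calls can be chained, builder-style.
  KernelBatch& eval_into(std::vector<double>& result,
                         std::function<double(std::size_t)> expr) {
    kernels_.push_back({&result, std::move(expr)});
    return *this;
  }

  // One loop per kernel; each loop could in principle go to its own stream.
  void run_separately() {
    for (auto& k : kernels_) {
      for (std::size_t i = 0; i < k.out->size(); ++i) {
        (*k.out)[i] = k.expr(i);
      }
    }
    kernels_.clear();
  }

  // One loop over the index range, evaluating every kernel at each point.
  // Assumes all results have the same size.
  void run_merged() {
    if (kernels_.empty()) {
      return;
    }
    const std::size_t n = kernels_.front().out->size();
    for (const auto& k : kernels_) {
      assert(k.out->size() == n);
    }
    for (std::size_t i = 0; i < n; ++i) {
      for (auto& k : kernels_) {
        (*k.out)[i] = k.expr(i);
      }
    }
    kernels_.clear();
  }

private:
  struct Kernel {
    std::vector<double>* out;
    std::function<double(std::size_t)> expr;
  };
  std::vector<Kernel> kernels_;
};
```

In this sketch the merged and separate paths give the same results only when no queued expression reads a result that another kernel in the same batch writes.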
@bendudson added the `work in progress` label (Not ready for merging) Jun 22, 2026

@github-actions (bot) left a comment

Contributor

clang-tidy made some suggestions

ddt(vort) = -bracket(phi, vort, bm) + alpha * (nonzonal_phi - nonzonal_n);
// Two kernels can be evaluated asynchronously
eval_into(ddt(n), // Density equation
-bracket(phi, n, bm) + alpha * (nonzonal_phi - nonzonal_n)

Contributor

warning: no header providing "bracket" is directly included [misc-include-cleaner]

              -bracket(phi, n, bm) + alpha * (nonzonal_phi - nonzonal_n)
               ^

Comment thread include/bout/fieldops.hxx
#include <type_traits>
#include <utility>
#include <vector>

Contributor

warning: included header vector is not used directly [misc-include-cleaner]

Comment thread include/bout/fieldops.hxx
Comment on lines +334 to +339
template <typename ExprView>
void launchExprView(BoutReal* out, const ExprView& expr_view
#if BOUT_HAS_CUDA && defined(__CUDACC__)
,
cudaStream_t stream
#endif

Member

Please can the whole function be in the preprocessor guards, rather than splitting the function arguments like this?
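
One way to read this suggestion is to define the function twice, with each complete definition inside its own branch of the guard, so the parameter list is never split. The sketch below reuses the `BOUT_HAS_CUDA && defined(__CUDACC__)` condition from the diff. The function bodies are placeholders, since the real bodies are not shown in this thread, and the `expr_view.size()` / `expr_view(i)` interface is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

using BoutReal = double; // stand-in for the BOUT++ typedef

#ifndef BOUT_HAS_CUDA
#define BOUT_HAS_CUDA 0 // assumption for this standalone sketch: host build
#endif

#if BOUT_HAS_CUDA && defined(__CUDACC__)
#include <cuda_runtime.h>

// Device build: the whole definition, stream argument included, is guarded.
template <typename ExprView>
void launchExprView(BoutReal* out, const ExprView& expr_view, cudaStream_t stream) {
  // Placeholder: the real kernel launch on `stream` is not shown in this thread.
  (void)out;
  (void)expr_view;
  (void)stream;
}

#else

// Host build: same name, no stream argument.
template <typename ExprView>
void launchExprView(BoutReal* out, const ExprView& expr_view) {
  assert(out != nullptr);
  for (std::size_t i = 0; i < expr_view.size(); ++i) {
    out[i] = expr_view(i); // hypothetical ExprView interface
  }
}

#endif
```

The trade-off is some duplication of the signature in exchange for two definitions that each read as ordinary C++.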

Comment thread include/bout/fieldops.hxx
Comment on lines +582 to +588
template <typename Result, typename Expr>
auto eval_into(Result& result, Expr&& expr) && {
using ExprType = std::decay_t<Expr>;
static_assert(bout::detail::is_eval_result_v<Result>,
"eval_into only supports Field2D, Field3D, and FieldPerp results");
static_assert(bout::detail::is_eval_compatible_v<Result, ExprType>,
"eval_into result type does not match the expression family");

Member

We should be able to use concepts here to make this clearer, I think?

Comment thread include/bout/fieldops.hxx
Comment on lines +386 to +401
template <typename T>
inline constexpr bool is_eval_result_v =
std::is_same_v<std::decay_t<T>, Field2D> || std::is_same_v<std::decay_t<T>, Field3D>
|| std::is_same_v<std::decay_t<T>, FieldPerp>;

template <typename Result, typename Expr>
inline constexpr bool is_eval_compatible_v =
(std::is_same_v<std::decay_t<Result>, Field3D> && is_expr_field3d_v<Expr>)
|| (std::is_same_v<std::decay_t<Result>, Field2D> && is_expr_field2d_v<Expr>)
|| (std::is_same_v<std::decay_t<Result>, FieldPerp> && is_expr_fieldperp_v<Expr>);

template <typename Expr>
inline constexpr bool is_materialized_eval_expr_v =
std::is_same_v<std::decay_t<Expr>, Field3D>
|| std::is_same_v<std::decay_t<Expr>, Field2D>
|| std::is_same_v<std::decay_t<Expr>, FieldPerp>;

Member

I think we already have traits like `is_Field` in `traits.hxx`. Are these different?
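
Whether they differ depends on how the existing traits are defined, which is not quoted in this thread. As an illustration only, the sketch below contrasts the exact-match style used in the diff (`std::is_same_v` after `std::decay_t`) with an inheritance-based trait built on `std::is_base_of_v`. The latter is an assumed shape for an `is_Field`-style helper, and all type and trait names here are stand-ins.

```cpp
#include <type_traits>

struct Field {};
struct Field3D : Field {};
struct MyField3D : Field3D {}; // hypothetical derived type

// Exact-match trait in the style of the diff.
template <typename T>
inline constexpr bool is_exactly_field3d_v =
    std::is_same_v<std::decay_t<T>, Field3D>;

// Inheritance-based trait (assumed shape, not taken from traits.hxx).
template <typename T>
inline constexpr bool is_field3d_or_derived_v = std::is_base_of_v<Field3D, T>;

static_assert(is_exactly_field3d_v<Field3D&>);     // references decay away
static_assert(!is_exactly_field3d_v<MyField3D>);   // derived types are rejected
static_assert(is_field3d_or_derived_v<MyField3D>); // derived types are accepted
static_assert(!is_field3d_or_derived_v<Field3D&>); // a reference is not a class type
```

If the existing traits behave like the second form, they could likely be reused by applying `std::decay_t` at the call site, at the cost of also accepting derived types.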

2 participants