Reduce build times in CUDA and C++ for complex processes (split kernels and more) #348

@valassi

This is a follow-up to #346: on ggttggg, build times are clearly becoming very long again (20 minutes or more, mainly in CUDA, although the situation also looks bad in clang/C++).

The issue is clearly related to inlining of FFV functions (hence to their templating in PR #328) and more generally to LTO/RDC/inlining optimizations over very large code bases (#229 et al).

Removing inlining by hand is an option, but the small tests I did in the past showed a really bad impact on performance.

The only viable solution is most likely splitting kernels (#310), not only for CUDA but also for C++. Once we have more than 1000 Feynman diagrams as in ggttggg, it makes no sense to do any optimizations across a single calculate_wavefunctions method with O(1k-10k) FFV calls. It looks better, even just for C++ and for build times, to split this into O(1k) functions, one per diagram.
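To make the proposal concrete, here is a minimal C++ sketch of what "one function per diagram" could look like. It is only an illustration under stated assumptions: `ffv_toy`, `diagram1` to `diagram3`, `calculate_wavefunctions_split`, `calculate_wavefunctions_monolithic` and the `MG_NOINLINE` macro are hypothetical names invented for this example, not the generated code, and the toy arithmetic stands in for the real FFV calls.

```cpp
#include <array>
#include <cassert>
#include <complex>
#include <cstddef>

using cxtype = std::complex<double>;

// Hypothetical macro: ask the compiler not to inline a per-diagram function,
// so that optimizations stay local to one diagram at a time.
#if defined(__GNUC__) || defined(__clang__)
#define MG_NOINLINE __attribute__((noinline))
#else
#define MG_NOINLINE
#endif

// Toy stand-in for an inlined FFV helper (assumption: NOT the real FFV code).
inline cxtype ffv_toy(const cxtype& w1, const cxtype& w2, double coupling)
{
  return coupling * w1 * w2;
}

// One small function per diagram; each adds its contribution to the amplitude.
MG_NOINLINE void diagram1(const cxtype* w, cxtype& amp) { amp += ffv_toy(w[0], w[1], 1.0); }
MG_NOINLINE void diagram2(const cxtype* w, cxtype& amp) { amp += ffv_toy(w[1], w[2], 2.0); }
MG_NOINLINE void diagram3(const cxtype* w, cxtype& amp) { amp += ffv_toy(w[0], w[2], 3.0); }

using diagram_fn = void (*)(const cxtype*, cxtype&);
static const std::array<diagram_fn, 3> diagrams = { diagram1, diagram2, diagram3 };

// Split version: loop over the per-diagram functions.
cxtype calculate_wavefunctions_split(const cxtype* w)
{
  assert(w != nullptr);
  cxtype amp = 0;
  for (diagram_fn f : diagrams) f(w, amp);
  return amp;
}

// Monolithic version: all diagrams in a single function body,
// which is the shape the compiler currently has to optimize as a whole.
cxtype calculate_wavefunctions_monolithic(const cxtype* w)
{
  assert(w != nullptr);
  cxtype amp = 0;
  amp += ffv_toy(w[0], w[1], 1.0);
  amp += ffv_toy(w[1], w[2], 2.0);
  amp += ffv_toy(w[0], w[2], 3.0);
  return amp;
}
```

Both versions compute the same sum; the split one only changes the unit over which the compiler optimizes. Whether the per-diagram functions should also go into separate source files, and what this costs in runtime performance, would need to be measured.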
