Conversation
tommbendall
left a comment
There was a problem hiding this comment.
This is a really good speed-up. I'm happy that this is scientifically equivalent to main and the KGOs are only changing from bit-wise differences in the kernels.
I am generally happy -- the new mixed solver kernels have been literally taken from the existing kernel and split into two parts, making this easy to review.
My main thoughts that aren't captured through my comments on the code:
- Do you prefer to keep the old
apply_mixed_operator_kernel_mod.F90file, which I think is now unused? Or could it be removed? - For the
assemble_w2h_from_w2hb_kernel, there is already a very similar kernel in the core repo: https://github.com/MetOffice/lfric_core/blob/main/components/science/source/kernel/inter_function_space/sci_average_w2b_to_w2_kernel_mod.F90 . I think it could be better to modify the existing kernel to work for both W2H and W2, rather than add a new kernel here? (appreciating that you'd rather not make this a linked ticket, so feel free to refuse) - The timers will conflict with the changes from #68 and #176 -- so I'm suggesting reverting these changes
science/gungho/unit-test/kernel/solver/apply_mixed_wp_operator_kernel_mod_test.pf
Outdated
Show resolved
Hide resolved
science/gungho/unit-test/kernel/solver/apply_mixed_u_operator_kernel_mod_test.pf
Outdated
Show resolved
Hide resolved
| @@ -0,0 +1,86 @@ | |||
| !----------------------------------------------------------------------------- | |||
| ! Copyright (c) 2017, Met Office, on behalf of HMSO and Queen's Printer | |||
There was a problem hiding this comment.
This looks like the old copyright statement
| if ( LPROF ) call start_timing( id, 'mixed_operator' ) | ||
| type(r_solver_field_type) :: y_uv_broken | ||
|
|
||
| if ( subroutine_timers ) call timer('mixed_solver.operator') |
| @@ -418,7 +414,7 @@ contains | |||
| call invoke( inc_X_times_Y(rhs, h_diag) ) | |||
| end if | |||
|
|
|||
| if ( LPROF ) call stop_timing( id, 'mixed_schur_rhs' ) | |||
| if ( subroutine_timers ) call timer('schur_precon.rhs') | |||
|
|
||
| if ( LPROF ) call start_timing( id, 'schur_back_substitute' ) | ||
| type(r_solver_field_type), target :: uvw_norm | ||
| if ( subroutine_timers ) call timer('schur_precon.back_sub') |
| @@ -507,7 +504,7 @@ contains | |||
| state_p => state%get_field_from_position(isol_p) | |||
| call invoke( setval_X(state_p, exner_inc) ) | |||
|
|
|||
| if ( LPROF ) call stop_timing( id, 'schur_back_substitute' ) | |||
| if ( subroutine_timers ) call timer('schur_precon.back_sub') | |||
|
|
||
| if ( LPROF ) call start_timing( id, 'helmholtz_lhs' ) | ||
| if ( subroutine_timers ) call timer('pressure_solver.helmholtz_lhs') |
| @@ -242,7 +241,7 @@ contains | |||
| nullify( w3_mask, w2_mask ) | |||
| end if | |||
| nullify( x_vec, y_vec ) | |||
| if ( LPROF ) call stop_timing( id, 'helmholtz_lhs' ) | |||
| if ( subroutine_timers ) call timer('pressure_solver.helmholtz_lhs') | |||
Co-authored-by: Thomas Bendall <14180399+tommbendall@users.noreply.github.com>
Reverting back to stable version
…_kernel_mod_test.pf Co-authored-by: Thomas Bendall <14180399+tommbendall@users.noreply.github.com>
…kernel_mod_test.pf Co-authored-by: Thomas Bendall <14180399+tommbendall@users.noreply.github.com>
…nel_mod.F90 Co-authored-by: Ricky Wong <141156427+mo-rickywong@users.noreply.github.com>
…rnel_mod.F90 Co-authored-by: Ricky Wong <141156427+mo-rickywong@users.noreply.github.com>
…kernel_mod_test.pf Co-authored-by: Ricky Wong <141156427+mo-rickywong@users.noreply.github.com>
…m_w2hb_kernel_mod_test.pf Co-authored-by: Ricky Wong <141156427+mo-rickywong@users.noreply.github.com>
|
@mo-rickywong back for next code check |
mo-rickywong
left a comment
There was a problem hiding this comment.
Test-suite all green for me
|
@ss421, @DanStoneMO Steve/Dan, so you're aware, I've now approved this PR and it's waiting to get to front of queue, not sure what needs to be done on Jedi side now. |
|
I've done a test run of LFRic-JEDI with this with the aim of generating the new KGOs, but it actually passed without any fails. So there should be no need for a PR from our end. If there are any KGO problems post-merge I'll fix them as needed. |
ss421
left a comment
There was a problem hiding this comment.
Looks like this will have changed results but due to the test tolerance they are not failing presumable because the change is small.
|
Hi All, I tried to generate the kgo for this PR last night but unfortunately I'm seeing a repeatable failure in an lfric_atm comorph test, so I think this will need to go back to Tom to try and solve. Logs are at https://cylchub/services/cylc-review/taskjobs/james.bruten?&suite=t177_apps%2Frun1&no_fuzzy_time=0&tasks=run_lfric_atm_nwp_comorph_dev-C12_azspice_gnu_fast-debug-32bit-crun1 |
I ran this branch with the head of main merge on, seems to be only KGO check failures now, so not sure what was going on with comorph |
|
I've merged in main to the branch and discussed the failing test with @iboutle and we think the best approach is to remove it for now. It appears this test if failing due to bit level changes from this ticket (for example the same test passes on ex1a with cce and gnu compiler and only fails on azspice with gnu). This test and the code causing the failure will be investigated elsewhere. This PR should now be ready for merging onto main |
mo-rickywong
left a comment
There was a problem hiding this comment.
Runs green at head of main after KGO updates made










PR Summary
Sci/Tech Reviewer: @tommbendall
Code Reviewer: @mo-rickywong
Add in a number of solver optimisations.
The principle performance improvement in this pull request is to split the application of the mixed operator into two seperate new kernels.
These changes result in a performance improvement for 2 main reasons
The C224 & C896 lfric atm tests in the test suite were run with these changes giving the following solver times
Code Quality Checklist
Testing
trac.log
These results are from before the KGO update. The failure in the lfric_inputs appears not to be due to this pull request as none of the code changes should be used and is likely one of the occasional lfric_inputs failures that we see
Test Suite Results - lfric_apps - solver_improvements/run6
Suite Information
Task Information
❌ failed tasks - 150
⌛ waiting tasks - 2
Security Considerations
Performance Impact
AI Assistance and Attribution
Documentation
PSyclone Approval
Sci/Tech Review
(Please alert the code reviewer via a tag when you have approved the SR)
Code Review