Skip to content

Comments

Remove j and add blocking for excf_nl_9c #222

Open
MetBenjaminWent wants to merge 18 commits intoMetOffice:mainfrom
MetBenjaminWent:excf_nl_9c_adjusted
Open

Remove j and add blocking for excf_nl_9c #222
MetBenjaminWent wants to merge 18 commits intoMetOffice:mainfrom
MetBenjaminWent:excf_nl_9c_adjusted

Conversation

@MetBenjaminWent
Copy link
Contributor

@MetBenjaminWent MetBenjaminWent commented Feb 4, 2026

PR Summary

Sci/Tech Reviewer: @christophermaynard
Code Reviewer: @Pierre-siddall

Remove the j loop which is adversely affecting performance in the boundary layer, and improve blocking loops. Also parameterise j.

NOTE : Since parametising j, 2x tests are failing KGO. As noted by the CO, this appears to be a sensitive routine with CCE O2. As advised by the listing files, the compiler is picking some different optimations as a result of all of the changes. Updating them shows they hold, confirming it is not a threading issue related to the change.
Updated KGOs for:

  • lfric_atm_nwp_gal9-C12_ex1a_cce_fast-debug-64bit
  • lfric_coupled_nwp_gal9-C48_ex1a_cce_fast-debug-64bit

Code Quality Checklist

  • I have performed a self-review of my own code
  • My code follows the project's style guidelines
  • Comments have been included that aid understanding and enhance the readability of the code
  • My changes generate no new warnings
  • All automated checks in the CI pipeline have completed successfully

Testing

  • I have tested this change locally, using the LFRic Apps rose-stem suite
  • If any tests fail (rose-stem or CI) the reason is understood and acceptable (e.g. kgo changes)
  • I have added tests to cover new functionality as appropriate (e.g. system tests, unit tests, etc.)
  • Any new tests have been assigned an appropriate amount of compute resource and have been allocated to an appropriate testing group (i.e. the developer tests are for jobs which use a small amount of compute resource and complete in a matter of minutes)

trac.log

Post KGO and Parameteration changes:

Test Suite Results - lfric_apps - excf_nl_9c_adjusted/run10

Suite Information

Item Value
Suite Name excf_nl_9c_adjusted/run10
Suite User benjamin.went
Workflow Start 2026-02-17T09:59:52
Groups Run all
Dependency Reference Main Like
casim MetOffice/casim@2025.12.1 True
jules MetOffice/jules@69aaf4d True
lfric_apps MetBenjaminWent/lfric_apps@excf_nl_9c_adjusted False
lfric_core MetOffice/lfric_core@88533c5 True
moci MetOffice/moci@2025.12.1 True
SimSys_Scripts MetOffice/SimSys_Scripts@2025.12.1 True
socrates MetOffice/socrates@2025.12.1 True
socrates-spectral MetOffice/socrates-spectral@2025.12.1 True
ukca MetOffice/ukca@2025.12.1 True

Task Information

❌ failed tasks - 6
Task State
run_gungho_model_held-suarez-C24_MG_azspice_gnu_fast-debug-64bit failed
run_gungho_model_tidally-locked-earth-C24_MG_azspice_gnu_fast-debug-64bit-crun1 failed
run_lfric_atm_clim_gal9-C12_azspice_gnu_fast-debug-32bit-crun1 failed
run_shallow_water_galewsky_vi-C48_azspice_gnu_fast-debug-64bit failed
run_shallow_water_galewsky_vi_koren-C48_azspice_gnu_fast-debug-64bit failed
run_shallow_water_williamson5-C24_azspice_gnu_fast-debug-64bit failed
image All 6x failures are unrelated timeouts.
✅ succeeded tasks - 1456

Trac log pre paramertisation changes.

Test Suite Results - lfric_apps - excf_nl_9c_adjusted/run7

Suite Information

Item Value
Suite Name excf_nl_9c_adjusted/run7
Suite User benjamin.went
Workflow Start 2026-02-09T15:31:41
Groups Run ex1a_omp_developer
Dependency Reference Main Like
casim MetOffice/casim@2025.12.1 True
jules MetOffice/jules@69aaf4d True
lfric_apps MetBenjaminWent/lfric_apps@excf_nl_9c_adjusted False
lfric_core MetOffice/lfric_core@bbb3d8a True
moci MetOffice/moci@2025.12.1 True
SimSys_Scripts MetOffice/SimSys_Scripts@2025.12.1 True
socrates MetOffice/socrates@2025.12.1 True
socrates-spectral MetOffice/socrates-spectral@2025.12.1 True
ukca MetOffice/ukca@2025.12.1 True

Task Information

✅ succeeded tasks - 51

Test Suite Results - lfric_apps - excf_nl_9c_adjusted/run8

Suite Information

Item Value
Suite Name excf_nl_9c_adjusted/run8
Suite User benjamin.went
Workflow Start 2026-02-10T10:07:59
Groups Run developer
Dependency Reference Main Like
casim MetOffice/casim@2025.12.1 True
jules MetOffice/jules@69aaf4d True
lfric_apps MetBenjaminWent/lfric_apps@excf_nl_9c_adjusted False
lfric_core MetOffice/lfric_core@bbb3d8a True
moci MetOffice/moci@2025.12.1 True
SimSys_Scripts MetOffice/SimSys_Scripts@2025.12.1 True
socrates MetOffice/socrates@2025.12.1 True
socrates-spectral MetOffice/socrates-spectral@2025.12.1 True
ukca MetOffice/ukca@2025.12.1 True

Task Information

❌ failed tasks - 2
Task State
run_gungho_model_robert-moist-smag-BiP100x8-10x10_azspice_gnu_fast-debug-64bit failed
run_gungho_model_skamarock_klemp_gw_p1-BiP75x4-4000x2000_azspice_gnu_fast-debug-64bit failed
✅ succeeded tasks - 1098
  • Failures were timeouts unrelated to the changes to the physics source

Security Considerations

  • I have reviewed my changes for potential security issues
  • Sensitive data is properly handled (if applicable)
  • Authentication and authorisation are properly implemented (if applicable)

Performance Impact

  • Performance of the code has been considered and, if applicable, suitable performance measurements have been conducted

AI Assistance and Attribution

  • Some of the content of this change has been produced with the assistance of Generative AI tool name (e.g., Met Office Github Copilot Enterprise, Github Copilot Personal, ChatGPT GPT-4, etc) and I have followed the Simulation Systems AI policy (including attribution labels)

Documentation

  • Where appropriate I have updated documentation related to this change and confirmed that it builds correctly

PSyclone Approval

  • If you have edited any PSyclone-related code (e.g. PSyKAl-lite, Kernel interface, optimisation scripts, LFRic data structure code) then please contact the TCD Team

Sci/Tech Review

  • I understand this area of code and the changes being added
  • The proposed changes correspond to the pull request description
  • Documentation is sufficient (do documentation papers need updating)
  • Sufficient testing has been completed

(Please alert the code reviewer via a tag when you have approved the SR)

Code Review

  • All dependencies have been resolved
  • Related Issues have been properly linked and addressed
  • CLA compliance has been confirmed
  • Code quality standards have been met
  • Tests are adequate and have passed
  • Documentation is complete and accurate
  • Security considerations have been addressed
  • Performance impact is acceptable

@MetBenjaminWent MetBenjaminWent marked this pull request as ready for review February 10, 2026 12:50
Copy link
Collaborator

@christophermaynard christophermaynard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes are removing the j loop. When OpenMP was over j, it is now over i. Where j was blocked, i is blocked. Where the loop is OpenMP schedule is dynamic the i loop is blocked. The OpenMP clauses remain the same. This is a sensible approach. The only change required is for j to be set to 1 as a parameter for performance. There is also a trivial comment change.

There are a lot of whitespace changes but these are necessary as one loop is removed so the indentation should change.

@@ -765,9 +765,9 @@ subroutine excf_nl_9c ( &
!-----------------------------------------------------------------------
! 0. Calculate top-of-b.l. velocity scales and Prandtl number.
!-----------------------------------------------------------------------

j = 1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please can j be a parameter?

end do
end do
!$OMP end do

!$OMP do SCHEDULE(STATIC)

! Note parallelised over j as k isn't independent
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment will need to change to i?

@MetBenjaminWent
Copy link
Contributor Author

Due to KGO failings now, I'm looking at grabbing the listing files to see if they indicate some changes.

@Adrian-Lock
Copy link
Contributor

Due to KGO failings now, I'm looking at grabbing the listing files to see if they indicate some changes.

I've found this subroutine is very sensitive - even just setting a logical switch to a parameter (with the same setting as used to be passed in from um_physics_init_mod, see 1a95f12) broke kgo for me, so I backed off that bit of tidying up - as I wanted my first go at using git-hub to be as simple as possible!

@MetBenjaminWent
Copy link
Contributor Author

Due to KGO failings now, I'm looking at grabbing the listing files to see if they indicate some changes.

I've found this subroutine is very sensitive - even just setting a logical switch to a parameter (with the same setting as used to be passed in from um_physics_init_mod, see 1a95f12) broke kgo for me, so I backed off that bit of tidying up - as I wanted my first go at using git-hub to be as simple as possible!

Yeah that is understandable rolling it back to what does work safely.

I have seen in the past with CCE in other schemes, LFRic's current fast-debug (O2) settings (which are currently slightly harsher than the UM's) over-optimising things, especially when we change the source in ways which may be presenting other optimisations to the compiler, but in turn alters the KGOs.
Blocking over a dynamic loop, which we are expanding in this ticket, has been a very common one for OMP, requiring for example in lsppn, to use only the UM flags instead.

@MetBenjaminWent
Copy link
Contributor Author

MetBenjaminWent commented Feb 16, 2026

Changes are minor, and likely consistent relative to the loss of the loop.
image
image

These however some changes seem alot for nuanced:
image

@MetBenjaminWent
Copy link
Contributor Author

Copy link

@iboutle iboutle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy with the KGO changes, but I think the KGO change for gungho_model_baroclinic must be accidental (since excf_nl wouldn't even be called in this run!)

@MetBenjaminWent
Copy link
Contributor Author

Happy with the KGO changes, but I think the KGO change for gungho_model_baroclinic must be accidental (since excf_nl wouldn't even be called in this run!)

Thanks Ian, yeah that seems to be on HoT given the nightlies:
https://cylchub/services/cylc-review/taskjobs?user=umtest&suite=lfric_apps_set-revisions_nightly_2026-02-17%2Frun1&no_fuzzy_time=0&cycles=&tasks=&task_status=expired&task_status=failed&task_status=preparing&task_status=running&task_status=submitted&task_status=submit-failed&task_status=waiting&job_status=&order=time_desc&per_page=300

@MetBenjaminWent MetBenjaminWent added the KGO This PR contains changes to KGO label Feb 17, 2026
@Adrian-Lock
Copy link
Contributor

I'm happy as boundary_layer code owner but can't find anywhere to formally approve at the moment.

@HengistPodd
Copy link

I'm happy as boundary_layer code owner but can't find anywhere to formally approve at the moment.

Ditto for coupling aspects.

Copy link
Contributor

@Adrian-Lock Adrian-Lock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy with these changes

@HengistPodd HengistPodd requested review from HengistPodd and removed request for HengistPodd February 19, 2026 11:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

KGO This PR contains changes to KGO

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Remove j loop in excf_nl_9c

6 participants