Skip to content

Conversation

@AGonzalezNicolas
Copy link
Contributor

autom_builds.sh will create a matrix of builds with the components and environments specified in "config_autom_builds.in" file.

@kvrigor
Copy link
Member

kvrigor commented Nov 24, 2025

These scripts definitely makes building model combinations much more convenient, but I feel like these aren't within the scope of TSMP2 and more towards the particular CI implementation. Perhaps @s-poll @jjokella and @mvhulten can weigh in here?

@DCaviedesV
Copy link

@kvrigor I see your point. I would nevertheless argue that these script should live where TSMP2 lives. We could host them elsewhere and add complexity and confusion. We could have simply a directory for scripts, or for ci, or both, i.e.,
TSMP2/scripts/ci

Copy link
Member

@s-poll s-poll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the point of creating many model combinations and environments in an easy fashion. Apart from CI, I see few applications where this can be used, so I would suggest to put it in an extra directory. In general, I wonder whether this functionality could be built directly into build_tsmp2 as a function with less code.


section=""

# Robust parser (handles whitespace, casing, CRLF)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe a look to the TSMP2 WFE parser is worth it:
https://github.com/HPSCTerrSys/TSMP2_workflow-engine/blob/master/ctl/utils_tsmp2.sh#L109-L148

Some of the functionalities are already in here, but some are not (e.g. allowing comments in the configure file, passing file name of config, etc)

bin_src="bin/${SYSTEMNAME_UPPER}_${model_id}"

bld_dest="bld/${SYSTEMNAME_UPPER}_${model_id}_${env}"
bin_dest="bin/${SYSTEMNAME_UPPER}_${model_id}_${env}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would pass the bin_dest + bld_dest directly to the build_tsmp2.sh command and just rename in case of failure. This would save some lines of code and IMO would improve readability

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename of folders bin and bld are necessary for both cases (failure and success), since build_tsmp2.sh names files as ${SYSTEMNAME_UPPER}_${model_id} without detailing the environment. Appending the environment on the name folder avoids replacing an old build (with one environment) with the newer build with different environment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant that one could used --build_dir and --install_dir option of build_tsmp2.sh. This would allow to use different environments and in case of rebuilding also work directly in the correct build-directory for the environment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To the docs?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to follow a more commonly used naming convention such as .conf or .cfg

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.in is commonly used as input files, e.g. by automake(1). But rather as a template (and maybe usually read in an automated process), so it makes sense if a process or the user copies this file (with all the combinations) to another one, e.g. cp config_autom_builds.in config_autom_builds and edits the latter (or a script checks for the existence of certain components or environments in this file and copies the valid ones to the actual config file).

I did not check if all this makes sense for the purpose at hand, but it is just to illustrate how .in file should probably be used. If it is just a config file that needs editting, .conf is fine as well.

cmd="./build_tsmp2.sh $combo_flags --env env/$env --force_update"
log_summary "$cmd"

if eval "$cmd"; then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we try to avoid the `eval here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks straightforward. What kind of problem could occur with eval as used here?

Would a function and/or using $? be better?

Temporary set -v to log the command?

I don't immediately see how it can be improved. Not saying that it could not.

@jjokella
Copy link
Contributor

jjokella commented Nov 25, 2025

My general input:

  • I like the functionality! (I would even try use these scripts for the building in my TSMP-PDAF-testcases repo)
  • in my opinion a dedicated directory in TSMP2 is a good place
  • README should go to the docs as Stefan suggested in his detailed review

@kvrigor
Copy link
Member

kvrigor commented Nov 25, 2025

Still having second thoughts about the implementation. It should be possible to support matrix builds without introducing extra machinery. If build_tsmp2.sh could distinguish arguments from switches (see item#4 here), this becomes possible:

# Alternative way to build a model through xargs
echo "eCLM ICON ParFlow" | xargs ./build_tsmp2.sh --env jsc.2025.gnu.openmpi

# model_combinations.txt contains a list of combinations (one per line)
cat model_combinations.txt | xargs ./build_tsmp2.sh --env jsc.2025.intel.psmpi
cat model_combinations.txt | xargs ./build_tsmp2.sh --env jsc.2025.gnu.openmpi

I think improving the build_tsmp2.sh interface such that it composes well with xargs or parallel solves this problem more elegantly. As a bonus, build_tsmp2.sh gets better in the process and nothing new is introduced 😉.

@AGonzalezNicolas
Copy link
Contributor Author

AGonzalezNicolas commented Nov 25, 2025

@kvrigor That's a good suggestion. If the approach works reliably, it’s fine for me. My main requirement is to have a build tool that consistently supports all component and environment combinations. Until this functionality is fully integrated into build_tsmp2.sh, I will continue using this script, as it already provides that coverage. Looking forward to having this integrated into build_tsmp2.sh for better consistency. :)

@jjokella
Copy link
Contributor

@kvrigor I like your idea in general (update ./build_tsmp2.sh instead of new script).

However, I have also some drawbacks

  • as far as I understand the directory renaming is still needed (as environment is not reflected)
  • the dedicated script provides a good start for further discussion (as we are currently having)

In total, if there is always a clear distinction of tasks, such that the build-script provides the only solution for a single build and the automatic-build-script concentrates only on the extras needed to handle a matrix of builds, then the two-script solution is fine in my opinion.

@kvrigor
Copy link
Member

kvrigor commented Nov 26, 2025

then the two-script solution is fine in my opinion.

That unfortunately adds maintenance overhead especially if one script can perfectly do the same job... for example, by doing the minor change of supporting both --<model> or <model>,

while [[ "$#" -gt 0 ]]; do
  case "${1,,}" in
    -h|--help) help_tsmp2;;
    ...
    --icon|icon) icon=y;;
    --eclm|eclm) eclm=y;;
    --parflow|parflow) parflow=y parflowGPU=n;;
    --parflowgpu|parflowgpu) parflow=y parflowGPU=y;;
    --pdaf|pdaf) pdaf=y;;
    --cosmo|cosmo) cosmo=y;;
    --clm35|clm35) clm35=y;;
    ...
    *) echo "Unknown parameter passed: $1"; exit 1 ;;
  esac
  shift
done

This already becomes possible:

echo "eCLM ICON ParFlow" | xargs ./build_tsmp2.sh --env env/ubuntu.gnu.openmpi --no_update
cat model_combinations.txt | xargs -n 3 ./build_tsmp2.sh --env env/ubuntu.gnu.openmpi --no_update

@s-poll
Copy link
Member

s-poll commented Nov 26, 2025

I like the suggestion by @kvrigor .

This would potentially also allow for more fancy consturctions (but still kind of one-liner) for the solution :)

e.g.

xargs -I {} bash -c 'while read -r models; do
    modelsstring=$(echo $models | tr " " "")
    ./build_tsmp2.sh $models --env "{}" --no_update --build_dir "$sysname_${modelsstring}_{} --install_dir "$sysname_${modelsstring}_{}"
done < models.txt' < envs.txt

(not tested)

@jjokella
Copy link
Contributor

Hm, yeah, in any case, it sounds like Paul's adaption is sensible and could be added! And then if there is good documentation of how the matrix build can be achieved (half of it is already in the comments above I guess) - I am all aboard!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants