Continuation script for Arc4 needed #50

@cemac-ccs

For PBS systems (and, in the forthcoming ARCHER2 changes, for Slurm systems too), the monc/misc folder contains a continuation script for running continuation jobs. It uses dependency chains so that each job starts only after the previous job has completed, restarting from that job's checkpoint.

A similar script can be written for the ARC4 SGE system, using the qsub -hold_jid option (single dash), which does the same job as the Slurm --dependency flag.
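
As a rough illustration of the dependency chain described above, the sketch below prints the qsub commands for a chain of jobs in which each job is held on the name of the one before it. The function name build_chain, the job script name monc_job.sh, and the monc_ job name prefix are placeholders chosen for this example, not names from the MONC repository; only the qsub options -N (job name) and -hold_jid (which accepts job IDs or job names) are real SGE options.

```shell
#!/bin/sh
# Sketch only: print the qsub commands for a chain of SGE jobs.
# Assumptions (hypothetical, not from the MONC repository):
#   monc_job.sh  - the job script to run at each link of the chain
#   monc_<n>     - job names used to express the dependencies

# Usage: build_chain <script> <number of jobs> [job name prefix]
# The body runs in a subshell ( ... ) so its variables do not leak.
build_chain() (
  script="$1"
  n_jobs="$2"
  prefix="${3:-monc}"
  i=1
  prev=""
  while [ "${i}" -le "${n_jobs}" ]; do
    name="${prefix}_${i}"
    if [ -z "${prev}" ]; then
      # First job in the chain: no dependency
      echo "qsub -N ${name} ${script}"
    else
      # Later jobs are held until the previous named job has finished
      echo "qsub -N ${name} -hold_jid ${prev} ${script}"
    fi
    prev="${name}"
    i=$((i + 1))
  done
)

# Dry run: show the commands without submitting anything
build_chain monc_job.sh 3
```

To submit the chain on a system where qsub is available, the output could be piped to a shell, for example `build_chain monc_job.sh 3 | sh`. Holding on job names avoids having to parse the job ID out of the qsub output. The job script still needs its own completion check, because the hold only orders the jobs and does not tell a later job whether the run has already finished.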

The job submission script used by @craigpoku includes a function that checks whether the run has already completed, which provides part of this functionality. That script and monc/misc/continuation.sh would be a good place to start. The relevant code is below:

# --- Checks:

# Check for run completion message in MONC output file:
function check_complete() {
  if [ -r "${MONC_OUT}" ] ; then
    # grep -q exits with status 0 only if the completion message is present
    if grep -q 'Model run complete due to model time' "${MONC_OUT}" 2> /dev/null ; then
      echo 'MONC run appears to have completed (exceeded termination time)'
      # Display end time:
      echo "END TIME: $(date)"
      exit 0
    fi
  fi
}
check_complete

# Check for previous checkpoint file:
if [ -r "${MONC_OUT}" ] ; then
  # Use the last match in case the output file records several restarts
  PREV_CKPT_FILE=$(basename "$(grep \
                     'Restarted configuration from checkpoint file' \
                     "${MONC_OUT}" | grep -Eo '[0-9a-zA-Z_/-]+\.nc' | tail -n 1)" \
                     2> /dev/null)
fi
# Check for most recent existing checkpoint file:
CKPT_FILE=$(basename "$(\ls -1v "${CKPT_DIR}" 2> /dev/null | tail -n 1)" 2> /dev/null)
# If current checkpoint file is same as previous, give up:
if [ ! -z "${PREV_CKPT_FILE}" ] && [ ! -z "${CKPT_FILE}" ] ; then
  if [ "${PREV_CKPT_FILE}" = "${CKPT_FILE}" ] ; then
    echo "Previous checkpoint file is same as current (${CKPT_FILE})"
    # Display end time:
    echo "END TIME: $(date)"
    exit 1
  fi
fi

# If we have a checkpoint file, restart MONC, else, start from config:
if [ ! -z "${CKPT_FILE}" ] ; then
  MONC_ARGS="--checkpoint=${CKPT_DIR}/${CKPT_FILE}"
else
  MONC_ARGS="--config=${MONC_CONFIG}"
fi
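
The restart decision above relies on `ls -1v` (GNU ls version sort) placing the highest-numbered checkpoint last. A minimal sketch for trying that logic outside the scheduler follows. The function name choose_monc_args, the config name testcase.mcf, and the checkpoint names dump_9.nc and dump_10.nc are made up for the example, and GNU ls is assumed.

```shell
#!/bin/sh
# Sketch only: wrap the checkpoint/config choice in a function so it can be
# tried against a scratch directory. All file names here are hypothetical.

# Usage: choose_monc_args <checkpoint dir> <config file>
# Prints the argument that would be passed to MONC.
# The body runs in a subshell ( ... ) so its variables do not leak.
choose_monc_args() (
  ckpt_dir="$1"
  config="$2"
  # Version sort (-v, GNU ls) orders dump_9.nc before dump_10.nc
  ckpt_file=$(\ls -1v "${ckpt_dir}" 2> /dev/null | tail -n 1)
  if [ -n "${ckpt_file}" ]; then
    echo "--checkpoint=${ckpt_dir}/${ckpt_file}"
  else
    echo "--config=${config}"
  fi
)

tmp=$(mktemp -d)
# Empty directory: start from the config file
choose_monc_args "${tmp}" testcase.mcf
# With checkpoints present: restart from the highest-numbered one (dump_10.nc)
touch "${tmp}/dump_9.nc" "${tmp}/dump_10.nc"
choose_monc_args "${tmp}" testcase.mcf
rm -rf "${tmp}"
```

A plain lexical sort would put dump_10.nc before dump_9.nc and pick the older checkpoint, which is presumably why the original uses the -v option.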
