-
Notifications
You must be signed in to change notification settings - Fork 11
Description
In a benchmark there are heavy computations that for example takes a few minutes for each module instance; and light computations where each instance takes a fraction of seconds. Currently we have a mechanism to specify it such that heavy computations are submitted as jobs on the cluster and lighter applications will run directly on the node where jobs are submitted.
However here the limitation is that the smaller jobs still have to run on a single node eg the login node and there are limited control over the resource it uses, eg, number of CPU threads, memory (at least some control over memory) and walltime. It would is not very good to run computations on a login node anyways. A possible way out would be to parse the benchmark and use a dedicated compute node for these light jobs where resource usages are still under control; but without the per job queue and thus avoiding most of the interaction (overhead) with the queue system.