While jobs are running, they write outputs to the data_tmp directory; those outputs are copied over to data when the job finishes. This is usually fine, but ideally it would be possible to write cmds that resume another command. That would probably require being able to check that another command has started and is not currently running.
Ideally, that would look something like this:
    cmd('python', 'my_experiment.py', Out(log_dir),
        priority=base_priority + (0,))
    cmd('python', 'my_experiment.py', Out(log_dir),
        '--resume-dir', TmpIn(log_dir),
        priority=base_priority + (1,))
When #5 is fixed, doexp will only start the resume command (and not the first command) if log_dir exists in the temporary directory. In theory, the above example could break if log_dir is empty (and thus cannot be resumed from), but I think we should leave that up to the user's script to handle.
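To illustrate what "leaving it up to the user's script" might mean, here is a rough sketch of how my_experiment.py could tolerate an empty resume directory. Everything in it is an assumption made for illustration: the checkpoint file name, its contents (a single step number), and the helper names are hypothetical and not part of doexp.

```python
import argparse
import os

# Hypothetical checkpoint file name; a real experiment would use its own format.
CHECKPOINT_NAME = "checkpoint.txt"


def parse_args(argv=None):
    """Parse the arguments used in the cmd() example: a log dir and --resume-dir."""
    parser = argparse.ArgumentParser()
    parser.add_argument("log_dir")
    parser.add_argument("--resume-dir", default=None)
    return parser.parse_args(argv)


def load_start_step(resume_dir):
    """Return the step to resume from, or 0 if there is nothing usable to resume."""
    if resume_dir is None or not os.path.isdir(resume_dir):
        return 0
    path = os.path.join(resume_dir, CHECKPOINT_NAME)
    if not os.path.isfile(path):
        # The directory exists but is empty: fall back to a fresh start.
        return 0
    with open(path) as f:
        text = f.read().strip()
    try:
        return int(text)
    except ValueError:
        # Unreadable or partially written checkpoint: also start fresh.
        return 0


def main(argv=None):
    args = parse_args(argv)
    start_step = load_start_step(args.resume_dir)
    print(f"starting at step {start_step}, writing to {args.log_dir}")
    return start_step
```

With something like this, the resume command behaves the same as the first command whenever the temporary log_dir has no usable checkpoint, so doexp would not need to inspect the directory contents itself.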