Remote executors
Note: This is documentation for an experimental feature that is under active development. It should not be used in production environments.
dvc machine provides a set of DVC commands for provisioning and managing remote machines which will eventually be used for executing DVC experiments.
Currently, the dvc machine implementation uses https://github.com/iterative/terraform-provider-iterative and requires the terraform client to be installed and available in your PATH.
- (Optional) Download and install the terraform client for your platform.
- (Optional) Install the latest tpi from master (pip install -e).
- Install DVC deps (preferably using pip install -e from master): pip install dvc[terraform]
    - This will install tpi from PyPI if you did not already install it from source.

Note: If you do not install a terraform client yourself, it will be downloaded and installed for you (via tpi).
- Enable the dvc machine feature (either per-repo or globally):
dvc config [--global] feature.machine true
Machines are configured similarly to DVC remotes, and configuration usage generally mirrors dvc remote add/modify/remove.
- dvc machine add - adds a machine to your repo configuration (note that no machine instance will actually be created until dvc machine create is run).
- dvc machine modify - modifies the configuration of an existing machine. For a full list of available options, refer to the documentation at https://github.com/iterative/terraform-provider-iterative#machine
- dvc machine list - lists the configuration of one or all machines.
- dvc machine remove - removes a machine from your repo configuration (note that any running machine instances should be destroyed with dvc machine destroy before removing the machine from your repo configuration).
- dvc machine rename - renames a machine. This also affects the instances related to this machine.
- dvc machine create - creates and starts an instance of a configured machine.
- dvc machine status - lists the running status of the instances of one specified machine or of all machines.
- dvc machine destroy - stops and destroys a previously created machine instance.
- dvc machine ssh - connects to a machine via SSH.
    - Your default ssh client will be used if available in your PATH.
    - Otherwise a limited-functionality client session will be provided via asyncssh.
        - Note that interactive programs (particularly line editors like vi) may not work as expected when run in this shell session.
- Very basic exp execution can be done over SSH via dvc exp run --machine <machine_name> (see also: https://github.com/iterative/dvc/pull/7173).
- The runtime execution environment for the remote machine can be configured via the setup_script machine configuration option.
    - setup_script should be a shell script, and will be sourced from the root of the user's Git repository prior to running an experiment (i.e. it is sourced before executing dvc exp run).
    - Note that this is separate from the startup_script terraform configuration, which is executed at boot time and is meant for installing system packages.
- Detached/unattended execution is not currently supported: killing or interrupting the dvc exp run --machine command will also terminate the exp execution on the remote machine.
- Also note that the default iterative-machine image uses Ubuntu 18.04 with Python 3.6 as the system python, which is not supported by DVC. The default startup_script also installs DVC from the latest .deb package, which will not include the latest changes/fixes related to dvc machine and dvc exp run --machine. It is recommended to override the default startup_script to install a more recent Python and to install DVC from source, rather than from the .deb package.
    - Overridden startup scripts should end by generating the file /var/log/dvc-machine-init.log (it can be an empty file). This is used by DVC as a signal that the startup script has completed execution (since iterative-machine does not provide a built-in way to do this).
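When debugging an overridden startup script (for example from a dvc machine ssh session), it can help to check by hand whether the completion signal file has been written yet. The following is a minimal sketch of such a check, not DVC's own implementation: the helper name wait_for_init, the default timeout, and the 5-second poll interval are all assumptions made for illustration.

```shell
#!/bin/bash
# Sketch: wait until a startup script has written its completion signal.
# `wait_for_init`, the timeout, and the poll interval are illustrative
# assumptions; this is not how DVC itself is implemented.

wait_for_init() {
    local signal_file="${1:-/var/log/dvc-machine-init.log}"
    local timeout="${2:-600}"   # seconds to wait before giving up
    local waited=0

    # The file may be empty; only its existence matters.
    while [ ! -e "$signal_file" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "startup script has not finished after ${timeout}s" >&2
            return 1
        fi
        sleep 5
        waited=$((waited + 5))
    done
    return 0
}
```

Calling wait_for_init with no arguments checks the default path with a 10-minute limit. A non-zero return means the file did not appear in time.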
Example .dvc/config:
['machine "aws-test"']
    cloud = aws
    startup_script = ../startup.sh
    setup_script = ../env-setup.sh
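A configuration like the one above could also be built from the command line instead of editing .dvc/config by hand. The sketch below assumes that dvc machine add takes a name followed by a cloud, and that dvc machine modify takes a name, an option, and a value, in the same way as dvc remote add/modify. The helper name configure_machine is hypothetical. Check dvc machine add --help for the exact syntax in your DVC version.

```shell
#!/bin/bash
# Sketch: recreate a machine entry such as "aws-test" from the command line.
# ASSUMPTION: the argument order (name, cloud / name, option, value) mirrors
# `dvc remote add/modify`; `configure_machine` is a hypothetical helper.

configure_machine() {
    local name="$1" cloud="$2" startup="$3" setup="$4"

    # Register the machine in the repo config (no instance is created yet).
    dvc machine add "$name" "$cloud" || return 1

    # Point the machine at the boot-time and experiment-time scripts.
    dvc machine modify "$name" startup_script "$startup" || return 1
    dvc machine modify "$name" setup_script "$setup" || return 1

    # Print the resulting configuration for review.
    dvc machine list
}
```

For example, configure_machine aws-test aws ../startup.sh ../env-setup.sh would issue the four commands in order. Compare the resulting .dvc/config with the example above, since how script paths are stored may differ between DVC versions.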
Example startup.sh (run at machine boot time):
#!/bin/bash
# Install latest python3.9 + pip from deadsnakes PPA
# NOTE: deadsnakes PPA python requires debian/ubuntu system python3-pip (rather than separate PPA python3.x-pip)
sudo add-apt-repository --yes ppa:deadsnakes/ppa
sudo apt-get update
sudo apt-get install --yes python3.9 python3.9-dev python3.9-venv python3-pip
sudo -u ubuntu python3.9 -m pip install --upgrade pip --user
sudo -u ubuntu python3.9 -m pip install --upgrade setuptools --user
# Install DVC from source
sudo -u ubuntu python3.9 -m pip install "git+https://github.com/iterative/dvc.git#egg=dvc[all]" --user
# Write signal/log file
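# tee runs under sudo, so the write succeeds even if this script is not run as root
echo "OK" | sudo tee /var/log/dvc-machine-init.log > /dev/null
Example env-setup.sh (sourced at exp runtime):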
#!/bin/bash
python3.9 -m venv .venv
source .venv/bin/activate
pip install -U pip
pip install -r src/requirements.txt
To run on a remote machine:
$ dvc machine create aws-test
$ dvc exp run --machine aws-test
$ dvc machine destroy aws-test
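Because interrupting dvc exp run --machine also ends the remote execution, and running instances should be destroyed when no longer needed, it may be convenient to wrap the three commands so that the destroy step always runs. The following is a minimal sketch under that assumption. The helper name run_on_machine is hypothetical and not part of DVC, and cleanup after an interrupt is best-effort only.

```shell
#!/bin/bash
# Sketch: create a machine, run one experiment on it, and always destroy it.
# `run_on_machine` is a hypothetical helper, not a DVC command.

run_on_machine() (
    # The parentheses run the body in a subshell, so the traps below
    # stay local to this function call.
    machine="$1"

    # Stop early if the instance could not be created.
    dvc machine create "$machine" || exit 1

    # From here on, destroy the instance whenever the subshell exits.
    trap 'dvc machine destroy "$machine"' EXIT
    # Turn Ctrl-C / termination into a normal exit so the EXIT trap runs.
    trap 'exit 130' INT TERM

    # Run the experiment and pass its exit code back to the caller.
    dvc exp run --machine "$machine"
    status=$?
    exit "$status"
)
```

A call such as run_on_machine aws-test returns the exit code of the experiment run. It is still worth confirming with dvc machine status that no instance was left running.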