leonardo serves as a way to launch compute within the Terra security boundary.
It does so through several cloud hardware virtualization mechanisms, currently on Google Cloud Platform (GCP) and Azure.
leonardo supports launching the following services for compute:
- Spark clusters through Google Dataproc
- Virtual machines through Google Compute Engine
- Kubernetes 'apps' through Google Kubernetes Engine
Currently, leonardo supports launching custom docker images for Jupyter and RStudio in virtual machines and
Dataproc. It also supports launching applications in Kubernetes, with a spotlight on Galaxy.
- For more information on APIs, see swagger
- For more information on custom docker images, see the terra-docker repo
- For more information on applications we support in Kubernetes, see the terra-apps repo
- For more information on Galaxy, see the Galaxy Project
It is recommended to consume these APIs and functionality via the Terra UI.
We use JIRA instead of the issues page on GitHub. If you would like to see what we are working on, you can visit our active sprint or our backlog on JIRA. You will need to set up an account for access, but it is open to the public.
Add the leonardo-client to your build. An example for sbt is below:
libraryDependencies += "org.broadinstitute.dsde.workbench" %% "leonardo-client" % "1.3.6-<git hash>"
Please be sure to replace the <git hash> with the first 7 characters of the commit hash of the HEAD of develop.
You can find a list of available releases and <git hash>-es from Google Artifact Registry.
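The version string is the release number followed by the first seven characters of a commit hash. The sketch below shows one way to assemble it. The helper name `leo_client_version` is hypothetical, the base version `1.3.6` is copied from the example above, and the usage comment assumes a local clone with an `origin/develop` ref.

```sh
# Hypothetical helper: build the leonardo-client version string from a
# base version and a full commit hash (only the first 7 characters are kept).
leo_client_version() {
  base_version=$1
  full_hash=$2
  short_hash=$(printf '%s' "$full_hash" | cut -c1-7)
  printf '%s-%s\n' "$base_version" "$short_hash"
}

# Example usage inside a clone of the repository (assumes origin/develop exists):
#   leo_client_version 1.3.6 "$(git rev-parse origin/develop)"
```

Check the result against the versions actually published in the registry before pinning it in a build.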
Example Scala Usage:
import org.broadinstitute.dsde.workbench.client.leonardo.api.RuntimesApi
import org.broadinstitute.dsde.workbench.client.leonardo.ApiClient
import org.broadinstitute.dsde.workbench.client.leonardo.model.GetRuntimeResponse
class LeonardoClient(leonardoBasePath: String) {
  private def leonardoApi(accessToken: String): RuntimesApi = {
    val apiClient = new ApiClient()
    apiClient.setAccessToken(accessToken)
    apiClient.setBasePath(leonardoBasePath)
    new RuntimesApi(apiClient)
  }

  def getAzureRuntimeDetails(token: String, workspaceId: String, runtimeName: String): GetRuntimeResponse = {
    // Named `api` so the local value does not shadow the `leonardoApi` method.
    val api = leonardoApi(token)
    api.getAzureRuntime(workspaceId, runtimeName)
  }
}
To run leonardo locally, you are going to need the following:
- The leonardo codebase
- The necessary dependencies installed
- A connection to a leonardo database
The following sections take you through those steps in a logical order.
The first step is to get the code.
This lets you follow this README locally, set up the environment dependencies,
and build leonardo locally.
git clone https://github.com/databiosphere/leonardo.git
cd leonardo
As an aside, this repository uses git submodules. You will need to execute the following command as well:
git submodule update --init --recursive
The following tools are required to run leonardo:
- vault
- docker
- google-cloud-sdk
- cloud-sql-proxy
- gke-gcloud-auth-plugin
- java
- sbt
- go
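To see which of these tools are already installed, a small check like the one below can help. It is a sketch only: `missing_tools` is a hypothetical helper, and the command names passed to it are assumptions (for example, google-cloud-sdk is checked through its `gcloud` command).

```sh
# Print each of the given command names that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Command names are assumptions: google-cloud-sdk is checked via `gcloud`.
missing_tools vault docker gcloud cloud-sql-proxy gke-gcloud-auth-plugin java sbt go
```

An empty result means every listed command was found.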
Please feel free to install each tool individually as you see fit for your environment, or
you can follow along with this process to get your environment set up.
Tool setup is facilitated through the use of brew. This allows us to have a little consistency
across environments thanks to the Brewfile.lock.json
- Install brew.
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
# be sure to update your .[X]profile or .[X]shrc file with the following
# this assumes a default install location for `brew`
echo 'eval "$(/opt/homebrew/bin/brew shellenv)"' >> .zprofile
After the brew install, validate that you have a working brew installation by executing brew help. Feel free to turn off Anonymous Analytics with the following command:
brew analytics off
- Following the brew install, you should be able to install the necessary tools to set up the remaining dependencies.
brew bundle
This will install the following resources for you:
- git
- vault
- mysql-client
- docker
- azure-cli
- google-cloud-sdk
- cloud-sql-proxy
- sdkman (to support java and sbt environment management)
- go
NOTE: If you already have docker Desktop installed, you may run into some collisions. You have a couple of options:
- Uninstall docker. The instructions are here.
- Accept the failure and the fact that docker Desktop will not be managed through brew
Both are fully acceptable - 🙂
NOTE: Please note that we lean on sdkman to manage our java-based SDKs - specifically java and sbt. If you are managing your own java-based environments in another manner, please feel free to comment out sdkman before executing the command above.
NOTE: Ensure that you are running go>=1.20. This is needed to compile helm.
- Following the brew bundle update, you need to update your dot-files. The default macOS shell is zsh (short explanation here). If you are using another shell, you will need to update the appropriate dot-files for that shell environment.
echo '
# make sure the google-cloud-sdk cli is on your path
export PATH=$(brew --prefix)/share/google-cloud-sdk/bin:$PATH
export PATH=$(brew --prefix)/opt/mysql-client/bin:$PATH
# https://broadworkbench.atlassian.net/wiki/spaces/IA/pages/2848063491/Dev+Environment+Setup
export SBT_OPTS="-Xmx2G -Xms1G -Dmysql.host=localhost -Dmysql.port=3311 -Duser.timezone=GMT"
# various `leonardo` tool-setup variables
export VAULT_ADDR="https://clotho.broadinstitute.org:8200"
# NOTE: from local/depends.sh - double check variable name: HELM_BUILD_DIR
#       or is this even needed
export HELM_SCALA_SDK_DIR="/Users/pate/workbench/helm-scala-sdk"
# cloud-sql-proxy environment variables
# feel free to override the defaults
export GOOGLE_PROJECT=broad-dsde-dev
export CLOUDSQL_ZONE=us-central1
#
# also used by `leonardo`
export CLOUDSQL_INSTANCE=[INSERT-YOUR-CLOUD-SQL-CONNECTION-NAME-HERE]
# `leonardo`-specific environment variables
export DB_USER=leonardo
export DB_PASSWORD=password
#THIS MUST BE AT THE END OF THE FILE FOR SDKMAN TO WORK!!!
export SDKMAN_DIR=$(brew --prefix sdkman-cli)/libexec
[[ -s "${SDKMAN_DIR}/bin/sdkman-init.sh" ]] && source "${SDKMAN_DIR}/bin/sdkman-init.sh"' >> ~/.zprofile
- After adding those values to your environment's dot-file, please ensure they are loaded into your environment by either restarting your terminal or source-ing them into your current session.
If you have java and sbt installed already, you can skip this step. Otherwise, run the following command to install the versions of java and sbt we are currently supporting (see .sdkmanrc in the leonardo/ directory for version info).
sdk env install
At this point, sdkman will have set up your JAVA_HOME and SBT_HOME environment variables accordingly. You will have to run sdk env each time your current working directory is leonardo/. To always use the correct JAVA_HOME and SBT_HOME every time you drop into the leonardo/ directory, you can turn on sdkman_auto_env. To do so, please execute sdk config and change the configured value of sdkman_auto_env from false to true.
- We need to install one more thing - gke-gcloud-auth-plugin. This will also validate that our gcloud-cli is installed and running appropriately.
gcloud auth login
gcloud components install gke-gcloud-auth-plugin
To make sure the gke-gcloud-auth-plugin is installed correctly, please give the following command a try.
gke-gcloud-auth-plugin --version
At this point all the third party dependencies have been installed, and the environment variables necessary to support those tools have been set up.
Next up, interacting with the leonardo-repository! - 🤓
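Before moving on, it can be worth confirming that the dot-file changes took effect and that Go meets the go>=1.20 requirement. The sketch below is one way to do that. `missing_vars` and `go_version_ok` are hypothetical helpers, and the variable names are the ones exported in the dot-file snippet above.

```sh
# Print the names of the given environment variables that are unset or empty.
missing_vars() {
  for name in "$@"; do
    eval "value=\${$name:-}"
    [ -n "$value" ] || printf '%s\n' "$name"
  done
}

# Succeed when a Go version such as "1.21.3" is at least 1.20.
go_version_ok() {
  major=${1%%.*}
  rest=${1#*.}
  minor=${rest%%.*}
  [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 20 ]; }
}

# Lists any variable from the dot-file snippet that is not set in this session.
missing_vars VAULT_ADDR GOOGLE_PROJECT CLOUDSQL_ZONE CLOUDSQL_INSTANCE DB_USER DB_PASSWORD

# `go version` prints e.g. "go version go1.21.3 darwin/arm64"; the third
# field minus its "go" prefix is the version number:
#   v=$(go version | awk '{print $3}'); go_version_ok "${v#go}" || echo "go >= 1.20 required"
```

If `missing_vars` prints anything, restart the terminal or source the dot-file again.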
- Run the following command to set up your gcloud-cli:
gcloud auth application-default login
NOTE: You may need to run gcloud config set project <PROJECT_ID> if your environment is set up to use a different Google Cloud Project.
- Navigate your browser to the Cloud SQL dashboard,
  - Select your database's Instance Overview screen by clicking on its Instance ID, and then
  - In the Connect to this instance section, copy the Connection name
- In the Cloud SQL dashboard for your instance, reset the passwords for the users
  - Select your database's Instance Overview screen by clicking on its Instance ID,
  - Select the Users option from the menu on the left,
  - Select the three vertical dots for the user, and then Change password
NOTE: You will want to update your environment (.zprofile - see above or locally) with the correct username and password
export CLOUDSQL_INSTANCE=<your cloned db name> # for Leo and CloudSQL proxy
export DB_USER=<db username> # for Leo only, not CloudSQL proxy
export DB_PASSWORD=<db password> # for Leo only, not CloudSQL proxy
-
Execute the following command in a terminal window to establish a local connection to the database. Mind that you will need to be connected to the VPN.
cloud-sql-proxy [CLOUD-SQL-CONNECTION-NAME-HERE]
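Cloud SQL connection names take the form project:region:instance. If you prefer to assemble the name from the environment variables rather than copy it from the dashboard, a sketch like the following may help. `cloudsql_connection_name` is a hypothetical helper, and it assumes `CLOUDSQL_INSTANCE` holds only the instance name, which may not match how you have set it.

```sh
# Join project, region and instance into a Cloud SQL connection name.
cloudsql_connection_name() {
  printf '%s:%s:%s\n' "$1" "$2" "$3"
}

# Example (commented out because it opens a proxy connection):
#   cloud-sql-proxy "$(cloudsql_connection_name "$GOOGLE_PROJECT" "$CLOUDSQL_ZONE" "$CLOUDSQL_INSTANCE")"
```

Compare the result with the Connection name shown in the dashboard before relying on it.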
You can add more vars for the CloudSQL proxy container by editing ./local/sqlproxy.env.
You must be connected to the VPN to complete the rest of this process.
Leo needs a copy of the Go Helm library and secrets, files, and env vars stored in k8s.
-
To build the Go Helm library and get k8s resources, run:
./local/depends.sh -y
-
To only build the Go Helm library, run:
./local/depends.sh helm
-
To only get k8s resources, run:
./local/depends.sh configs
By adding entries to ./local/overrides.env, you can override the value of any variable from k8s for Leo.
By adding entries to ./local/unset.env, you can remove variables from k8s for Leo. Removals are applied after
retrieving variables from k8s and before applying overrides.
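The order of application matters: a variable listed in both files ends up with its override value, because removals run first. The sketch below illustrates that order only and is not the logic in ./local/depends.sh. `merge_env` is a hypothetical helper, and it assumes KEY=VALUE lines in the base and overrides files and one variable name per line in the unset file.

```sh
# Merge env files in the documented order: base values first, then removals,
# then overrides. File formats are assumptions (see the paragraph above).
merge_env() {
  awk -F= '
    /^[ \t]*#/ || /^[ \t]*$/ { next }        # skip comments and blank lines
    stage == 2 { delete vals[$1]; next }     # unset file: drop the variable
    {
      if (!($1 in seen)) { seen[$1] = 1; order[++n] = $1 }
      vals[$1] = substr($0, length($1) + 2)  # everything after the first "="
    }
    END {
      for (i = 1; i <= n; i++)
        if (order[i] in vals) print order[i] "=" vals[order[i]]
    }
  ' stage=1 "$1" stage=2 "$2" stage=3 "$3"
}

# Usage (k8s.env is a hypothetical file holding the values fetched from k8s):
#   merge_env k8s.env ./local/unset.env ./local/overrides.env
```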
If you haven't already, add 127.0.0.1 local.dsde-dev.broadinstitute.org to /etc/hosts:
sudo sh -c "echo '127.0.0.1 local.dsde-dev.broadinstitute.org' >> /etc/hosts"
-
To run the CloudSQL and Apache proxies, run:
./local/proxies.sh start
-
You can also stop them:
./local/proxies.sh stop
-
Or restart them:
./local/proxies.sh restart
-
If the CloudSQL proxy fails to start with an error like:
Bind for 0.0.0.0:3306 failed: port is already allocated
Run this to find the PID of the process using that port:
sudo lsof -i tcp:3306
And then kill that process:
sudo kill -TERM <pid>
-
If the CloudSQL proxy starts but keeps restarting with an error like:
googleapi: Error 404: The Cloud SQL instance does not exist., instanceDoesNotExist
then add the environment variables directly to local/sqlproxy.env like this:
GOOGLE_PROJECT=broad-dsde-dev
CLOUDSQL_ZONE=us-central1
CLOUDSQL_INSTANCE=leonardo-mysql-101-0f6e882310af57ac-clone-XXX
DB_USER=leonardo
DB_PASSWORD=<db password>
Or, figure out why the docker container is not picking up the environment variables from .zprofile.
-
Export required env vars as created by ./local/depends.sh:
. ./http/src/main/resources/rendered/sbt.env.sh
-
Call the sbt http/run target:
sbt http/run
... or start an sbt shell and go from there:
sbt
-
If you receive the following error while starting up leonardo
Caused by: com.typesafe.config.ConfigException$Missing: merge of leo.conf @ jar:file:/Users/.../leonardo/target/bg-jobs/sbt_10c9d98e/job-1/target/ae6bfc74/85357ed8/http_2.13-da789a092.jar!/leo.conf: 13,reference.conf @ jar:file:/Users/.../leonardo/target/bg-jobs/sbt_10c9d98e/job-1/target/ae6bfc74/85357ed8/http_2.13-da789a092.jar!/reference.conf: 606: No configuration setting found for key 'subEmail'
Then you need to make sure your environment is set up with the necessary leonardo environment variables. Please be sure that you have run the following command in your current terminal session and retry your leonardo startup.
. ./http/src/main/resources/rendered/sbt.env.sh
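Because the variables only live in the session where the file was sourced, a small wrapper can make a forgotten or missing file obvious. This is a sketch: `load_leo_env` is a hypothetical helper, and its default path is the rendered file named above.

```sh
# Source an env file only if it is readable; otherwise say what is missing.
load_leo_env() {
  env_file=${1:-./http/src/main/resources/rendered/sbt.env.sh}
  if [ ! -r "$env_file" ]; then
    echo "Missing $env_file; run ./local/depends.sh first" >&2
    return 1
  fi
  . "$env_file"
}

# Usage from the repository root, in the same terminal session as sbt:
#   load_leo_env && sbt http/run
```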
If you get an error like
Exception in thread "io-compute-6" java.lang.UnsatisfiedLinkError: Unable to load library 'helm':
...
(mach-o file, but is an incompatible architecture (have 'arm64', need 'x86_64')),
...
You are probably on an M1 (arm64) running an amd64 (x86_64) version of Java. You can verify by first finding and setting your JAVA_HOME (e.g. with which java or jenv if present) and then checking the output of
file "${JAVA_HOME}/bin/java"
It should read something like
/Library/Java/JavaVirtualMachines/temurin-17.jdk/Contents/Home/bin/java: Mach-O 64-bit executable arm64
Note the Mach-O 64-bit executable arm64. If the output shows a different architecture, install an arm64 version of Java and try again. Adoptium should work fine.
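The same check can be scripted by classifying the text that `file` prints. The helper below is a hypothetical sketch; it only looks for the architecture names in the output and reports a universal binary as arm64 because that pattern is tested first.

```sh
# Classify the architecture named in a line of `file` output.
java_arch() {
  case $1 in
    *arm64*)  echo arm64 ;;
    *x86_64*) echo x86_64 ;;
    *)        echo unknown ;;
  esac
}

# Usage (requires JAVA_HOME to be set):
#   java_arch "$(file "${JAVA_HOME}/bin/java")"
```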
Status endpoint: https://local.dsde-dev.broadinstitute.org/status
Swagger page: https://local.dsde-dev.broadinstitute.org
- Install the EnvFile plugin
- Install the Scala plugin
- Set up a new Application run configuration in Run > Edit Configurations:
(You may need to use the "Modify options" dropdown to unlock options like "Environment variables", "EnvFile", and "Add VM options")
4. Determine your Java home
The above configuration will fail to run properly due to missing JAVA_HOME in the environment. Unfortunately, IntelliJ doesn't propagate this to the running app. To figure out what it is, first run the new configuration, and scroll back up to the top of the output. The first line should look like:
Which means that JAVA_HOME should be set to /Library/Java/JavaVirtualMachines/temurin-17.jdk/Contents/Home.
Now you can go back into the run configuration and add it to the "Environment variables" section:
5. Run it!
In order to use the GUI elements to run tests, some runtime configuration template changes are needed:
- Set default ScalaTest runtime configuration options in Run > Edit Configurations
First, open the template settings:
Then, go to ScalaTest:
Open VM Options (labeled "1" above) and add the JAVA_OPTS from Run Leonardo unit tests, which should end up looking like:
Open Environment variables (labeled "2" above) and uncheck Include system environment variables:
2. Change Scala compiler options in IntelliJ settings
IntelliJ isn't smart enough to set compiler flags differently between the source and test targets. To hack around this, open Settings > Build, Execution, Deployment > Compiler > Scala Compiler and select each module. Then uncheck Enable warnings.
NOTE: These changes may revert when you reload the sbt project! Repeat this step to fix tests complaining about warnings that have been turned into errors.
If you get errors after compilation but before the tests run, try deleting your test Runtime Configuration, running git clean -xfd -e .idea to clean project files, redoing dependencies/configs, restarting IntelliJ, and redoing the above steps before rerunning tests.
3. Make sure the local MySQL server is running by following the instructions in Run Leonardo unit tests.
4. Find a test to run and click on the green arrow next to the test to run it normally or using the debugger:
5. Run it!
Once you've rendered the configs, started the CloudSQL proxy, and sourced the env vars required to run Leo, you can connect to your database with:
./local/proxies.sh dbconnect
When you're done, stop sbt (e.g. using Ctrl+C) and stop the proxies:
./local/proxies.sh stop
Ensure docker is running. Spin up MySQL locally:
$ ./docker/run-mysql.sh start leonardo
Note: if you see an error like
Warning: Using a password on the command line interface can be insecure.
ERROR 2003 (HY000): Can't connect to MySQL server on 'mysql' (113)
Warning: Using a password on the command line interface can be insecure.
ERROR 2003 (HY000): Can't connect to MySQL server on 'mysql' (113)
Warning: Using a password on the command line interface can be insecure.
ERROR 2003 (HY000): Can't connect to MySQL server on 'mysql' (113)
Run docker system prune -a. If the error persists, try restarting your laptop.
Build Leonardo and run all unit tests.
export JAVA_OPTS="-Dheadless=false -Duser.timezone=UTC -Xmx4g -Xss2M -Xms4G"
sbt clean compile "project http" test
You can also run a particular test suite, e.g.
sbt "testOnly *LeoAuthProviderHelperSpec"
or a particular test within a suite, e.g.
sbt "testOnly *LeoPubsubMessageSubscriberSpec -- -z \"handle Azure StopRuntimeMessage and stop runtime\""
where the string after -z is a substring within the test name.
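The test-name substring contains spaces, so it needs its own pair of quotes inside the single argument handed to sbt. One way to avoid escaping by hand is to build that argument with a helper. `sbt_test_only_arg` is a hypothetical name, and the sketch assumes the suite is matched with a leading `*` wildcard as in the examples above.

```sh
# Build the single argument passed to sbt so that a test-name substring
# containing spaces stays quoted when sbt parses the command.
sbt_test_only_arg() {
  printf 'testOnly *%s -- -z "%s"\n' "$1" "$2"
}

# Usage:
#   sbt "$(sbt_test_only_arg LeoPubsubMessageSubscriberSpec 'handle Azure StopRuntimeMessage and stop runtime')"
```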
If you made a change to the leonardo Db by adding a changeset xml file, and then adding that file path to the changelog
file, you have to set initWithLiquibase = true in the leonardo.conf file for these changes to be reflected in the unit
tests. Once you are done testing your changes, make sure to switch it back to initWithLiquibase = false, as this can do
some damage if you are running local Leo against Dev!
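Since a leftover initWithLiquibase = true is easy to miss, a quick check before running local Leo against Dev may be useful. This is a sketch: `liquibase_init_enabled` is a hypothetical helper, and the path to leonardo.conf in the usage comment is a placeholder you would need to fill in.

```sh
# Succeed if the config file has a line that sets initWithLiquibase to true.
# Lines where the setting is commented out do not match.
liquibase_init_enabled() {
  grep -Eq '^[[:space:]]*initWithLiquibase[[:space:]]*=[[:space:]]*true' "$1"
}

# Usage (replace the placeholder path with your leonardo.conf):
#   if liquibase_init_enabled path/to/leonardo.conf; then
#     echo "initWithLiquibase is still true; set it back to false" >&2
#   fi
```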
Once you're done, tear down MySQL.
./docker/run-mysql.sh stop leonardo
Run docker restart leonardo-mysql if you see a java.sql.SQLNonTransientConnectionException: Too many connections error.
- Running tests against FIAB
Checking FIAB mysql (find the password in /etc/leonardo.conf in the firecloud_leonardo-app_1 container)
docker exec -it firecloud_leonardo-mysql_1 bash
root@2f5efbd4f138:/# mysql -u leonardo -p
Learn more about scalafmt
sbt scalafmtAll
To install git-secrets
brew install git-secrets
To ensure git hooks are run
cp -r hooks/ .git/hooks/
chmod 755 .git/hooks/apply-git-secrets.sh
To build jar and leonardo docker image
./docker/build.sh jar -d build
To build jar and leonardo docker image and push to repos broadinstitute/leonardo tagged with git hash
./docker/build.sh jar -d push
Leonardo has custom runners for GitHub Actions, as its workflows require more than the default 30GB provisioned by the ubuntu-latest GitHub runners.
There are 3 nodes; you can view them here: https://github.com/DataBiosphere/leonardo/settings/actions/runners. They currently have 100GB each. DevOps can be contacted to increase the size if needed, but we only need ~60GB at time of writing.
