forked from andypetrella/pipeline
Setup Cloud Environment
Chris Fregly edited this page Aug 30, 2016 · 55 revisions
- We no longer support the local laptop installation due to the large memory and disk footprint of this real-world environment
- A cloud instance is included in the workshop
- You do not need to set up an instance on your own
- The instructions below are provided for people who are setting this up on their own
- While not required, I would recommend choosing Ubuntu 14.04
- Typically, we use either the Amazon Web Services `r3.2xlarge` EC2 or Google Cloud Platform `n1-highmem-8` GCE Cloud Instance types
- 8 Cores
- 50+ GB RAM
- 100 GB Root Volume (MUST BE ROOT VOLUME)
- Currently, these cloud instance types cost around $8-10 per day
- Later, we show how to save money by Stopping your instance - allowing you to resume your work at a later date
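If you are setting up your own instance, a quick check against the sizing guidance above can catch an undersized machine early. The following is a minimal sketch, assuming a Linux instance with `/proc/meminfo`; `check_minimum` is a hypothetical helper, not part of the workshop tooling, and the 90 GB root threshold is an assumption that leaves room for filesystem overhead on a 100 GB volume.

```shell
#!/bin/bash
# Sketch: compare this instance against the sizing guidance above.
# check_minimum is a hypothetical helper, not part of the workshop tooling.
check_minimum() {
  local label="$1" actual="$2" required="$3"
  if [ "$actual" -ge "$required" ] 2>/dev/null; then
    echo "OK: $label = $actual (need >= $required)"
  else
    echo "LOW: $label = $actual (need >= $required)"
  fi
}

# Number of online CPU cores
cores=$(getconf _NPROCESSORS_ONLN)
# MemTotal is reported in kB on Linux; convert to whole GB
ram_gb=$(awk '/^MemTotal:/ {print int($2 / 1024 / 1024)}' /proc/meminfo)
# Size of the filesystem mounted at /, in 1K blocks, converted to whole GB
root_gb=$(df -Pk / | awk 'NR == 2 {print int($2 / 1024 / 1024)}')

check_minimum "CPU cores" "$cores" 8
check_minimum "RAM (GB)" "$ram_gb" 50
# Assumption: accept roughly 10% below 100 GB to allow for filesystem overhead
check_minimum "Root volume (GB)" "$root_gb" 90
```

Any line starting with `LOW` suggests the instance type or root volume size should be revisited before continuing.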
- Make sure all ports are open on your cloud instance
- While not secure in any way, we open all ports to make connectivity easier
- For production environments, definitely lock down these ports to the bare minimum
- Below is a screen shot of the FIREWALL RULES CHANGES required to allow all traffic into your instance
- In this example, my instances are using the "default" network which is the Google default
- You must modify these rules or you will only be able to connect to your instance on port 80 (and 443 if selected)
- Below is a screen shot of the SECURITY GROUP CHANGES required to allow all traffic into your instance
- In this example, the instance is using a security group named `fluxcapacitor`
- You must modify the security group or you will only be able to connect to your instance on port 80 (and 443 if selected)
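The same firewall and security group changes can also be made from the command line instead of the web console. The following is a rough sketch, assuming the `gcloud` and `aws` CLIs are installed and authenticated; the rule name `allow-all-workshop` and the `run` dry-run wrapper are hypothetical, and `--group-name` assumes the security group lives in your default VPC (otherwise `--group-id` would be needed). As noted above, opening everything is for workshop convenience only.

```shell
#!/bin/bash
# Sketch: CLI equivalents of the console changes described above.
# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0 to execute.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# GCE: allow all inbound TCP, UDP, and ICMP traffic on the "default" network.
# "allow-all-workshop" is a hypothetical rule name.
run gcloud compute firewall-rules create allow-all-workshop \
  --network default \
  --allow tcp,udp,icmp \
  --source-ranges 0.0.0.0/0

# AWS: allow all inbound traffic (IpProtocol -1 means every protocol)
# on the "fluxcapacitor" security group.
run aws ec2 authorize-security-group-ingress \
  --group-name fluxcapacitor \
  --ip-permissions '[{"IpProtocol": "-1", "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}]'
```

Run only the command for the cloud you are using. Keeping the dry-run default makes it easy to review the commands before anything is changed.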
- Create SSH Key Pair
- Result of Associating Key Pair to a Cloud Instance at Creation Time
- NOTE: USE `-gce` or `-aws` accordingly
- Download the `.pem` file
- Copy the downloaded `.pem` file to the `/Users/<username>/.ssh` directory
mkdir -p ~/.ssh
- Change the permissions on the `.pem` file so that `ssh` doesn't complain
chmod 600 ~/.ssh/pipeline-training-gce.pem
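The copy and permission steps above can be combined into one small function. This is a minimal sketch: `install_key` is a hypothetical helper, and the `~/Downloads` location in the usage line is an assumption about where your browser saved the key.

```shell
#!/bin/bash
# Sketch: copy a downloaded key into an SSH directory and restrict its permissions.
# install_key is a hypothetical helper; the paths in the usage line are assumptions.
install_key() {
  local src="$1" ssh_dir="$2"
  local dest="$ssh_dir/$(basename "$src")"
  mkdir -p "$ssh_dir" || return 1
  cp "$src" "$dest" || return 1
  chmod 600 "$dest" || return 1
  # Print the resulting mode string so the permissions can be confirmed
  ls -l "$dest" | cut -c1-10
}

# Usage (assumes the key was saved to ~/Downloads):
# install_key ~/Downloads/pipeline-training-gce.pem ~/.ssh
```

A mode string of `-rw-------` means only your user can read or write the key, which is what `ssh` expects.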
- Download the `.ppk` file
- Copy the downloaded file to the `\Users\<username>\.ssh` directory
# TODO: Insert the Windows/DOS Commands to `mkdir` and `copy` the key
# from `\Users\<username>\Downloads` to `\Users\<username>\.ssh`
- Username: pipeline-training
- Password: password9 if asked for a password
- Use SSH to log in to your Cloud Instance using the `.pem` file created from the previous step
- You may have to enter the password you used when you created the key pair in an earlier step
- NOTE: USE `-gce` or `-aws` accordingly
# MacOS + Linux ONLY
ssh -i ~/.ssh/pipeline-training-gce.pem pipeline-training@<your-cloud-instance-public-ip>
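If you reconnect often, an entry in your SSH config file saves retyping the key path, username, and IP address. The following is a sketch only: `add_ssh_alias` and the alias name `pipeline` are hypothetical, while `Host`, `HostName`, `User`, and `IdentityFile` are standard OpenSSH client options.

```shell
#!/bin/bash
# Sketch: append a host alias to an SSH config file.
# add_ssh_alias and the alias name "pipeline" are hypothetical.
add_ssh_alias() {
  local config_file="$1" alias_name="$2" public_ip="$3" key_file="$4"
  cat >> "$config_file" <<EOF
Host $alias_name
    HostName $public_ip
    User pipeline-training
    IdentityFile $key_file
EOF
}

# Usage: substitute your instance's public IP, then connect with `ssh pipeline`
# add_ssh_alias ~/.ssh/config pipeline <your-cloud-instance-public-ip> ~/.ssh/pipeline-training-gce.pem
```

Note that the public IP usually changes when an instance is stopped and started again, so the `HostName` line may need updating.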
- Username: pipeline-training
- Password: password9 if asked for a password
- Download PuTTY
- Use `putty` to connect to your Cloud Instance using the downloaded `.ppk` file
- Do not rely on the Docker installation that comes with your Operating System
- You will need the latest Docker per the script below which should be run at instance creation time
- GCE calls these `Init Scripts` for GCE Images
- AWS calls these `User Data` for AWS AMIs
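#!/bin/bash
# Refresh the package index and install curl, used to fetch keys and the Docker installer
sudo apt-get update
sudo apt-get install -y curl
# Add the PCP (Performance Co-Pilot) signing key and apt repository
curl 'https://bintray.com/user/downloadSubjectPublicKey?username=pcp' | sudo apt-key add -
echo "deb https://dl.bintray.com/pcp/trusty trusty main" | sudo tee -a /etc/apt/sources.list
sudo apt-get update
sudo apt-get install -y pcp pcp-webapi
# Install the latest Docker and add its signing key
curl -fsSL https://get.docker.com/ | sh
curl -fsSL https://get.docker.com/gpg | sudo apt-key add -
# Pull the workshop image
sudo docker pull fluxcapacitor/pipeline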
- If you see the following error
Warning: failed to get default registry endpoint from daemon (Cannot
connect to the Docker daemon. Is the docker daemon running on this
host?). Using system default: https://index.docker.io/v1/
Cannot connect to the Docker daemon. Is the docker daemon running on
this host?
- You are likely calling `docker <something>` instead of `sudo docker <something>`
- More info in the Docker docs
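To tell a missing `sudo` apart from a daemon that is not running at all, a quick diagnostic along these lines may help. It is a sketch, not part of the workshop scripts: `diagnose_docker` is a hypothetical helper, and it relies on `sudo -n`, which fails instead of prompting when a password would be required.

```shell
#!/bin/bash
# Sketch: report whether the Docker daemon is reachable with or without sudo.
# diagnose_docker is a hypothetical helper.
diagnose_docker() {
  if ! command -v docker >/dev/null 2>&1; then
    echo "docker is not installed"
  elif docker info >/dev/null 2>&1; then
    echo "docker works without sudo"
  elif sudo -n docker info >/dev/null 2>&1; then
    echo "docker requires sudo"
  else
    echo "docker daemon is not reachable (or sudo needs a password)"
  fi
}

diagnose_docker
```

If the result is `docker requires sudo`, prefix your Docker commands with `sudo` as described above.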