terraform README updates

This commit is contained in:
Rob Genova
2017-05-15 21:17:48 -07:00
parent 9f1933e2c4
commit 0f61923fc4
5 changed files with 110 additions and 35 deletions

View File

@@ -1,6 +1,9 @@
# Provision a Nomad cluster with Terraform
Provision a fully functional Nomad cluster in the cloud with [Packer](https://packer.io) and [Terraform](https://terraform.io). The goal is to allow easy exploration of Nomad, including its integrations with Consul and Vault. To get started, use one of the provider-specific links below:
- [AWS](aws/README.md)
- Google Cloud (coming soon)
- Microsoft Azure (coming soon)
A number of [examples](examples/README.md) and guides are also included.

View File

@@ -20,15 +20,15 @@ You will need the following:
- [API access keys](http://aws.amazon.com/developers/access-keys/)
- [SSH key pair](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html)
If you provisioned a Vagrant environment using the included Vagrantfile, you will need to copy your private key to it. If not, you will need to [install Terraform](https://www.terraform.io/intro/getting-started/install.html).
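To copy the key, one approach (a minimal sketch, assuming the key lives at `~/.ssh/KEY.pem` on your host and Vagrant's default `/vagrant` synced folder is available) is:
```bash
# On the host, from the directory containing the Vagrantfile:
$ cp ~/.ssh/KEY.pem .
$ vagrant ssh

# Inside the VM:
$ cp /vagrant/KEY.pem ~/.ssh/
$ chmod 600 ~/.ssh/KEY.pem
```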
Set environment variables for your AWS credentials:
```bash
$ export AWS_ACCESS_KEY_ID=[ACCESS_KEY_ID]
$ export AWS_SECRET_ACCESS_KEY=[SECRET_ACCESS_KEY]
```
### Provision
`cd` to one of the environment subdirectories:
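For example, to use the US East environment referenced throughout this guide (assuming you are in the `aws` directory):
```bash
$ cd env/us-east
```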
@@ -41,23 +41,23 @@ Update terraform.tfvars with your SSH key name:
```bash
region = "us-east-1"
ami = "ami-62a60374"
instance_type = "t2.small"
ami = "ami-28a1dd3e"
instance_type = "t2.medium"
key_name = "KEY"
key_file = "/home/vagrant/.ssh/KEY.pem"
server_count = "3"
client_count = "4"
```
For example:
```bash
region = "us-east-1"
ami = "ami-62a60374"
ami = "ami-28a1dd3e"
instance_type = "t2.medium"
key_name = "hashi-us-east-1"
key_file = "/home/vagrant/.ssh/hashi-us-east-1.pem"
server_count = "3"
client_count = "4"
```
Provision:
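The standard Terraform workflow applies here; a minimal sketch, assuming Terraform is installed and you are still in the environment subdirectory:
```bash
$ terraform plan    # preview the servers and clients that will be created
$ terraform apply   # provision the cluster
```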

View File

@@ -0,0 +1,18 @@
# Build an Amazon Machine Image
See the prerequisites listed [here](../aws/README.md). If you did not provision a Vagrant environment using the included Vagrantfile, you will need to [install Packer](https://www.packer.io/intro/getting-started/install.html).
Set environment variables for your AWS credentials:
```bash
$ export AWS_ACCESS_KEY_ID=[ACCESS_KEY_ID]
$ export AWS_SECRET_ACCESS_KEY=[SECRET_ACCESS_KEY]
```
Build your AMI:
```bash
packer build packer.json
```
Don't forget to copy the AMI ID to your [terraform.tfvars file](../env/us-east/terraform.tfvars).
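For example, if the Packer run ends with an artifact line like `us-east-1: ami-28a1dd3e`, you could update the variable in place (the AMI ID here is illustrative, and this assumes GNU sed):
```bash
# Point the environment at the freshly built image
$ sed -i 's/^ami = .*/ami = "ami-28a1dd3e"/' ../env/us-east/terraform.tfvars
```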

View File

@@ -1,6 +1,5 @@
## Examples
The examples included here are designed to introduce specific features and provide a basic learning experience. The examples subdirectory is automatically provisioned into the home directory of the VMs in your cloud environment.
- Nomad
- [Spark Integration](spark/README.md)

View File

@@ -1,8 +1,12 @@
# Nomad / Spark integration
Spark supports using a Nomad cluster to run Spark applications. When running on Nomad, the Spark executors that run Spark tasks for your application, and optionally the application driver itself, run as Nomad tasks in a Nomad job. See the [usage guide](./RunningSparkOnNomad.pdf) for more details.
To give the Spark integration a test drive, `cd` to `examples/spark/spark` on one of the servers (the `examples/spark/spark` subdirectory is created when the cluster is provisioned).
A number of sample Spark commands are listed below. They run several of the official Spark examples and demonstrate features such as `spark-sql`, `spark-shell`, and DataFrames.
You can monitor Nomad status simultaneously with:
```bash
$ nomad status
@@ -10,45 +14,69 @@ $ nomad status [JOB_ID]
$ nomad alloc-status [ALLOC_ID]
```
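All of the sample commands below share the same general shape: point `--master` at Nomad and tell Spark where to fetch the Nomad-enabled distribution. A sketch of the common form, with bracketed placeholders standing in for the application-specific pieces (omit `--deploy-mode cluster` for client mode, and `--class` for Python applications):
```bash
$ ./bin/spark-submit \
    --class [MAIN_CLASS] \
    --master nomad \
    --deploy-mode cluster \
    --conf spark.executor.instances=4 \
    --conf spark.nomad.sparkDistribution=[SPARK_DISTRIBUTION_URL] \
    [APPLICATION_JAR_OR_SCRIPT] [ARGS]
```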
## Sample Spark commands
### SparkPi
Java (client mode)
```bash
$ ./bin/spark-submit --class org.apache.spark.examples.JavaSparkPi --master nomad --conf spark.executor.instances=8 --conf spark.nomad.sparkDistribution=https://s3.amazonaws.com/rcgenova-nomad-spark/spark-2.1.0-bin-nomad-preview-6.tgz examples/jars/spark-examples*.jar 100
```
Java (cluster mode)
```bash
$ ./bin/spark-submit --class org.apache.spark.examples.JavaSparkPi --master nomad --deploy-mode cluster --conf spark.executor.instances=4 --conf spark.nomad.cluster.monitorUntil=complete --conf spark.nomad.sparkDistribution=https://s3.amazonaws.com/rcgenova-nomad-spark/spark-2.1.0-bin-nomad-preview-6.tgz https://s3.amazonaws.com/rcgenova-nomad-spark/spark-examples_2.11-2.1.0-SNAPSHOT.jar 100
```
Python (client mode)
```bash
$ ./bin/spark-submit --master nomad --conf spark.executor.instances=8 --conf spark.nomad.sparkDistribution=https://s3.amazonaws.com/rcgenova-nomad-spark/spark-2.1.0-bin-nomad-preview-6.tgz examples/src/main/python/pi.py 100
```
Python (cluster mode)
```bash
$ ./bin/spark-submit --master nomad --deploy-mode cluster --conf spark.executor.instances=4 --conf spark.nomad.cluster.monitorUntil=complete --conf spark.nomad.sparkDistribution=https://s3.amazonaws.com/rcgenova-nomad-spark/spark-2.1.0-bin-nomad-preview-6.tgz examples/src/main/python/pi.py 100
```
Scala (client mode)
```bash
$ ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master nomad --conf spark.executor.instances=8 --conf spark.nomad.sparkDistribution=https://s3.amazonaws.com/rcgenova-nomad-spark/spark-2.1.0-bin-nomad-preview-6.tgz examples/jars/spark-examples*.jar 100
```
### Machine Learning
Python (client mode)
```bash
$ ./bin/spark-submit --master nomad --conf spark.executor.instances=8 --conf spark.nomad.sparkDistribution=https://s3.amazonaws.com/rcgenova-nomad-spark/spark-2.1.0-bin-nomad-preview-6.tgz examples/src/main/python/ml/logistic_regression_with_elastic_net.py
```
Scala (client mode)
```bash
$ ./bin/spark-submit --class org.apache.spark.examples.SparkLR --master nomad --conf spark.executor.instances=8 --conf spark.nomad.sparkDistribution=https://s3.amazonaws.com/rcgenova-nomad-spark/spark-2.1.0-bin-nomad-preview-6.tgz examples/jars/spark-examples*.jar
```
### Streaming
Run the following two commands simultaneously; the page view generator feeds the stream processor:
```bash
bin/spark-submit --class org.apache.spark.examples.streaming.clickstream.PageViewGenerator --master nomad --deploy-mode cluster --conf spark.executor.instances=4 --conf spark.nomad.cluster.monitorUntil=complete --conf spark.nomad.sparkDistribution=https://s3.amazonaws.com/rcgenova-nomad-spark/spark-2.1.0-bin-nomad-preview-6.tgz https://s3.amazonaws.com/rcgenova-nomad-spark/spark-examples_2.11-2.1.0-SNAPSHOT.jar 44444 10
```
```bash
bin/spark-submit --class org.apache.spark.examples.streaming.clickstream.PageViewStream --master nomad --deploy-mode cluster --conf spark.executor.instances=4 --conf spark.nomad.cluster.monitorUntil=complete --conf spark.nomad.sparkDistribution=https://s3.amazonaws.com/rcgenova-nomad-spark/spark-2.1.0-bin-nomad-preview-6.tgz https://s3.amazonaws.com/rcgenova-nomad-spark/spark-examples_2.11-2.1.0-SNAPSHOT.jar errorRatePerZipCode localhost 44444
```
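Because both jobs run in cluster mode, the driver output goes to the Nomad allocation logs rather than to your terminal. Assuming you have grabbed an allocation ID from `nomad status`, you can follow the output with something like:
```bash
# Add the task name as a final argument if the allocation runs more than one task
$ nomad logs -f [ALLOC_ID]
```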
### pyspark
```bash
$ ./bin/pyspark --master nomad --conf spark.executor.instances=8 --conf spark.nomad.sparkDistribution=https://s3.amazonaws.com/rcgenova-nomad-spark/spark-2.1.0-bin-nomad-preview-6.tgz
```
```python
df = spark.read.json("examples/src/main/resources/people.json")
df.show()
df.printSchema()
@@ -57,11 +85,15 @@ sqlDF = spark.sql("SELECT * FROM people")
sqlDF.show()
```
### spark-shell
```bash
$ ./bin/spark-shell --master nomad --conf spark.executor.instances=8 --conf spark.nomad.sparkDistribution=https://s3.amazonaws.com/rcgenova-nomad-spark/spark-2.1.0-bin-nomad-preview-6.tgz
```
From spark-shell:
```scala
:type spark
spark.version
@@ -70,10 +102,33 @@ val distData = sc.parallelize(data)
distData.filter(_ < 10).collect()
```
### spark-sql
```bash
bin/spark-sql --master nomad --conf spark.executor.instances=8 --conf spark.nomad.sparkDistribution=https://s3.amazonaws.com/rcgenova-nomad-spark/spark-2.1.0-bin-nomad-preview-6.tgz jars/spark-sql_2.11-2.1.0-SNAPSHOT.jar
```
From spark-sql:
```sql
CREATE TEMPORARY VIEW usersTable
USING org.apache.spark.sql.parquet
OPTIONS (
path "examples/src/main/resources/users.parquet"
);
SELECT * FROM usersTable;
```
### DataFrames
```bash
bin/spark-shell --master nomad --conf spark.executor.instances=8 --conf spark.nomad.sparkDistribution=https://s3.amazonaws.com/rcgenova-nomad-spark/spark-2.1.0-bin-nomad-preview-6.tgz
```
From spark-shell:
```scala
val usersDF = spark.read.load("examples/src/main/resources/users.parquet")
usersDF.select("name", "favorite_color").write.save("/tmp/namesAndFavColors.parquet")
```