update READMEs

Rob Genova
2017-06-24 16:50:11 -07:00
parent 34825af250
commit e1429c680c
2 changed files with 11 additions and 18 deletions

View File

@@ -29,9 +29,9 @@ $ export AWS_ACCESS_KEY_ID=[ACCESS_KEY_ID]
$ export AWS_SECRET_ACCESS_KEY=[SECRET_ACCESS_KEY]
```
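Optionally, verify that the credentials work before provisioning (this assumes the AWS CLI is installed; it is not required elsewhere in the walkthrough):
```bash
$ aws sts get-caller-identity
```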
## Provision
## Provision a cluster
`cd` to one of the environment subdirectories:
`cd` to an environment subdirectory:
```bash
$ cd env/us-east
@@ -41,9 +41,9 @@ Update terraform.tfvars with your SSH key name:
```bash
region = "us-east-1"
ami = "ami-28a1dd3e"
ami = "ami-feac99e8"
instance_type = "t2.medium"
key_name = "KEY"
key_name = "KEY_NAME"
server_count = "3"
client_count = "4"
```
@@ -51,7 +51,7 @@ For example:
```bash
region = "us-east-1"
ami = "ami-28a1dd3e"
ami = "ami-feac99e8"
instance_type = "t2.medium"
key_name = "hashi-us-east-1"
server_count = "3"
@@ -70,10 +70,10 @@ terraform apply
## Access the cluster
SSH to a server using its public IP. For example:
SSH to any client or server using its public IP. For example:
```bash
$ ssh -i /home/vagrant/.ssh/KEY.pem ubuntu@SERVER_PUBLIC_IP
$ ssh -i /home/vagrant/.ssh/KEY.pem ubuntu@PUBLIC_IP
```
By default, the AWS security group allows inbound traffic on port 22 from any source address. This is not recommended for production deployments.
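For a tighter setup, one option (a sketch, assuming the AWS CLI is installed and `SG_ID` is the ID of the security group Terraform created) is to replace the open rule with one scoped to your own address:
```bash
# Revoke the world-open SSH rule, then allow SSH only from your current IP.
$ aws ec2 revoke-security-group-ingress --group-id SG_ID \
    --protocol tcp --port 22 --cidr 0.0.0.0/0
$ aws ec2 authorize-security-group-ingress --group-id SG_ID \
    --protocol tcp --port 22 --cidr YOUR_IP/32
```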
@@ -107,10 +107,4 @@ See:
## Apache Spark integration
Nomad is well-suited for analytical workloads, given its performance characteristics and first-class support for batch scheduling. Apache Spark is a popular data processing engine/framework that has been architected to use third-party schedulers. We maintain a fork that natively integrates Nomad with Spark. Sample job files and documentation are included [here](examples/spark/README.md) and also provisioned into the cluster itself under the `HOME` directory.
Nomad is well-suited for analytical workloads, given its performance characteristics and first-class support for batch scheduling. Apache Spark is a popular data processing engine/framework that has been architected to use third-party schedulers. The Nomad ecosystem includes a fork that natively integrates Nomad with Spark. A detailed walkthrough of the integration is included [here](examples/spark/README.md).

View File

@@ -38,21 +38,20 @@ $ hdfs dfs -ls /
Finally, deploy the Spark history server:
```bash
$ cd $HOME/examples/spark
$ nomad run spark-history-server-hdfs.nomad
```
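You can confirm that the job was scheduled (assuming the job is named `spark-history-server` in the job file; adjust if it differs):
```bash
$ nomad status spark-history-server
```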
You can find the private IP for the service with a Consul DNS lookup:
You can get the private IP for the history server with a Consul DNS lookup:
```bash
$ dig spark-history.service.consul
```
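If you only want the address itself, dig's standard `+short` option prints just the record data:
```bash
$ dig +short spark-history.service.consul
```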
Cross-reference the private IP with the `terraforom apply` output to get the corresponding public IP. You can access the history server at http://PUBLIC_IP:18080
Cross-reference the private IP with the `terraform apply` output to get the corresponding public IP. You can access the history server at `http://PUBLIC_IP:18080`.
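As a quick reachability check from your workstation (`PUBLIC_IP` is a placeholder for the address found above):
```bash
$ curl -sI http://PUBLIC_IP:18080 | head -n 1
```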
## Sample Spark jobs
A number of sample spark-submit commands are listed below that demonstrate several of the official Spark examples. Features like `spark-sql`, `spark-shell` and pyspark are included as well. The commands can be executed from any client or server.
The sample `spark-submit` commands listed below demonstrate several of the official Spark examples. Features like `spark-sql`, `spark-shell`, and `pyspark` are included. The commands can be executed from any client or server.
### SparkPi (Java)
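A representative `spark-submit` invocation for SparkPi (a sketch assuming the Nomad Spark fork's `--master nomad` scheme; `SPARK_DIST_URL` and `EXAMPLES_JAR_URL` are placeholders for the Spark distribution and examples jar locations):
```bash
$ spark-submit \
    --class org.apache.spark.examples.SparkPi \
    --master nomad \
    --conf spark.executor.instances=4 \
    --conf spark.nomad.sparkDistribution=https://SPARK_DIST_URL/spark.tgz \
    https://EXAMPLES_JAR_URL/spark-examples.jar 100
```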