Remove unlinked getting started pages. These are all on Learn now
@@ -1,221 +0,0 @@
---
layout: intro
page_title: Clustering
sidebar_title: Clustering
description: Join another Nomad client to create your first cluster.
---

# Clustering

We have started our first agent and run a job against it in development mode.
This demonstrated the ease of use and the workflow of Nomad, but did not show how
it could be extended to a scalable, production-grade configuration. In this step,
we will create our first real cluster with multiple nodes.

## Starting the Server

The first step is to create the config file for the server. Either download the
[file from the repository][server.hcl], or paste this into a file called
`server.hcl`:

```hcl
# Increase log verbosity
log_level = "DEBUG"

# Setup data dir
data_dir = "/tmp/server1"

# Enable the server
server {
  enabled = true

  # Self-elect, should be 3 or 5 for production
  bootstrap_expect = 1
}
```

This is a fairly minimal server configuration file, but it is enough to start
an agent in server-only mode and have it elected as a leader. The major change
that should be made for production is to run more than one server, and to
change the corresponding `bootstrap_expect` value.

Once the file is created, start the agent in a new tab:

```shell-session
$ nomad agent -config server.hcl
==> WARNING: Bootstrap mode enabled! Potentially unsafe operation.
==> Starting Nomad agent...
==> Nomad agent configuration:

                Client: false
             Log Level: DEBUG
                Region: global (DC: dc1)
                Server: true
               Version: 0.7.0

==> Nomad agent started! Log data will stream in below:

    [INFO] serf: EventMemberJoin: nomad.global 127.0.0.1
    [INFO] nomad: starting 4 scheduling worker(s) for [service batch _core]
    [INFO] raft: Node at 127.0.0.1:4647 [Follower] entering Follower state
    [INFO] nomad: adding server nomad.global (Addr: 127.0.0.1:4647) (DC: dc1)
    [WARN] raft: Heartbeat timeout reached, starting election
    [INFO] raft: Node at 127.0.0.1:4647 [Candidate] entering Candidate state
    [DEBUG] raft: Votes needed: 1
    [DEBUG] raft: Vote granted. Tally: 1
    [INFO] raft: Election won. Tally: 1
    [INFO] raft: Node at 127.0.0.1:4647 [Leader] entering Leader state
    [INFO] nomad: cluster leadership acquired
    [INFO] raft: Disabling EnableSingleNode (bootstrap)
    [DEBUG] raft: Node 127.0.0.1:4647 updated peer set (2): [127.0.0.1:4647]
```

We can see above that client mode is disabled, and that we are only running as
the server. This means that this server will manage state and make scheduling
decisions but will not run any tasks. Now we need some agents to run tasks!

## Starting the Clients

Similar to the server, we must first configure the clients. Either download
the configuration for `client1` and `client2` from the
[repository here](https://github.com/hashicorp/nomad/tree/master/demo/vagrant), or
paste the following into `client1.hcl`:

```hcl
# Increase log verbosity
log_level = "DEBUG"

# Setup data dir
data_dir = "/tmp/client1"

# Give the agent a unique name. Defaults to hostname
name = "client1"

# Enable the client
client {
  enabled = true

  # For demo assume we are talking to server1. For production,
  # this should be like "nomad.service.consul:4647" and a system
  # like Consul used for service discovery.
  servers = ["127.0.0.1:4647"]
}

# Modify our port to avoid a collision with server1
ports {
  http = 5656
}
```

Copy that file to `client2.hcl`. Change the `data_dir` to `/tmp/client2`, the
`name` to `client2`, and the `http` port to 5657.
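
A minimal sketch of the resulting `client2.hcl`, assuming you start from the
`client1.hcl` above and apply only those three changes:

```hcl
# Increase log verbosity
log_level = "DEBUG"

# Setup data dir
data_dir = "/tmp/client2"

# Give the agent a unique name. Defaults to hostname
name = "client2"

# Enable the client
client {
  enabled = true

  # For demo assume we are talking to server1. For production,
  # this should be like "nomad.service.consul:4647" and a system
  # like Consul used for service discovery.
  servers = ["127.0.0.1:4647"]
}

# Modify our port to avoid a collision with server1 and client1
ports {
  http = 5657
}
```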

Once you have created both `client1.hcl` and `client2.hcl`, open a tab for each
and start the agents:

```shell-session
$ sudo nomad agent -config client1.hcl
==> Starting Nomad agent...
==> Nomad agent configuration:

                Client: true
             Log Level: DEBUG
                Region: global (DC: dc1)
                Server: false
               Version: 0.7.0

==> Nomad agent started! Log data will stream in below:

    [DEBUG] client: applied fingerprints [host memory storage arch cpu]
    [DEBUG] client: available drivers [docker exec]
    [DEBUG] client: node registration complete
    ...
```

In the output we can see the agent is running in client mode only. This agent
will be available to run tasks but will not participate in managing the
cluster or making scheduling decisions.

Using the [`node status` command](/docs/commands/node/status)
we should see both nodes in the `ready` state:

```shell-session
$ nomad node status
ID        DC   Name     Class   Drain  Eligibility  Status
fca62612  dc1  client1  <none>  false  eligible     ready
c887deef  dc1  client2  <none>  false  eligible     ready
```

We now have a simple three-node cluster running. The only difference between a
demo and a full production cluster is that we are running a single server
instead of three or five.

## Submit a Job

Now that we have a simple cluster, we can use it to schedule a job. We should
still have the `example.nomad` job file from before, but verify that the
`count` is still set to 3.
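
If you want to double-check, the relevant fragment sits inside the `cache`
group of `example.nomad`; a sketch of what it should look like (comment
paraphrased):

```hcl
group "cache" {
  # Schedule three instances of this group across the cluster
  count = 3
}
```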

Then, use the [`job run` command](/docs/commands/job/run) to submit the job:

```shell-session
$ nomad job run example.nomad
==> Monitoring evaluation "8e0a7cf9"
    Evaluation triggered by job "example"
    Evaluation within deployment: "0917b771"
    Allocation "501154ac" created: node "c887deef", group "cache"
    Allocation "7e2b3900" created: node "fca62612", group "cache"
    Allocation "9c66fcaf" created: node "c887deef", group "cache"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "8e0a7cf9" finished with status "complete"
```

We can see in the output that the scheduler assigned two of the tasks to one
of the client nodes and the remaining task to the second client.

We can again use the [`status` command](/docs/commands/status) to verify:

```shell-session
$ nomad status example
ID            = example
Name          = example
Submit Date   = 07/26/17 16:34:58 UTC
Type          = service
Priority      = 50
Datacenters   = dc1
Status        = running
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost
cache       0       0         3        0       0         0

Latest Deployment
ID          = fc49bd6c
Status      = running
Description = Deployment is running

Deployed
Task Group  Desired  Placed  Healthy  Unhealthy
cache       3        3       0        0

Allocations
ID        Eval ID   Node ID   Task Group  Desired  Status   Created At
501154ac  8e0a7cf9  c887deef  cache       run      running  08/08/16 21:03:19 CDT
7e2b3900  8e0a7cf9  fca62612  cache       run      running  08/08/16 21:03:19 CDT
9c66fcaf  8e0a7cf9  c887deef  cache       run      running  08/08/16 21:03:19 CDT
```

We can see that all our tasks have been allocated and are running. Once we are
satisfied that our job is happily running, we can tear it down with
`nomad job stop`.
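
For example, using the job name from this guide:

```shell-session
$ nomad job stop example
```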

## Next Steps

Nomad is now up and running. The cluster can be entirely managed from the command line,
but Nomad also comes with a web interface that is hosted alongside the HTTP API.
Next, we'll [visit the UI in the browser](/intro/getting-started/ui).

[server.hcl]: https://raw.githubusercontent.com/hashicorp/nomad/master/demo/vagrant/server.hcl
@@ -1,91 +0,0 @@
---
layout: intro
page_title: Install Nomad
sidebar_title: Getting Started
description: The first step to using Nomad is to get it installed.
---

# Install Nomad

To simplify the getting started experience, you can download the precompiled
binary and run it directly (see the instructions for installing Nomad
[here][nomad-install]), or you can optionally work in a Vagrant environment
(detailed in the following section).

## Vagrant Setup (Optional)

Note: To use the Vagrant setup, first install Vagrant following these
[instructions][install-instructions].

Create a new directory, and download [this
`Vagrantfile`](https://raw.githubusercontent.com/hashicorp/nomad/master/demo/vagrant/Vagrantfile).
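
A minimal sketch of those two steps from a terminal (the `nomad-demo` directory
name is just an example):

```shell-session
$ mkdir nomad-demo && cd nomad-demo
$ curl -O https://raw.githubusercontent.com/hashicorp/nomad/master/demo/vagrant/Vagrantfile
```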

Once you have created a new directory and downloaded the `Vagrantfile`, you must
create the virtual machine:

```shell-session
$ vagrant up
```

This will take a few minutes as the base Ubuntu box must be downloaded
and provisioned with both Docker and Nomad. Once this completes, you should
see output similar to:

```text
Bringing machine 'default' up with 'virtualbox' provider...
==> default: Importing base box 'bento/ubuntu-16.04'...
...
==> default: Running provisioner: docker...
```

At this point the Vagrant box is running and ready to go.

## Verifying the Installation

After starting the Vagrant box, verify the installation worked by connecting
to the box using SSH and checking that `nomad` is available. By executing
`nomad`, you should see help output similar to the following:

```shell-session
$ vagrant ssh
...

vagrant@nomad:~$ nomad
Usage: nomad [-version] [-help] [-autocomplete-(un)install] <command> [args]

Common commands:
    run         Run a new job or update an existing job
    stop        Stop a running job
    status      Display the status output for a resource
    alloc       Interact with allocations
    job         Interact with jobs
    node        Interact with nodes
    agent       Runs a Nomad agent

Other commands:
    acl             Interact with ACL policies and tokens
    agent-info      Display status information about the local agent
    deployment      Interact with deployments
    eval            Interact with evaluations
    namespace       Interact with namespaces
    operator        Provides cluster-level tools for Nomad operators
    quota           Interact with quotas
    sentinel        Interact with Sentinel policies
    server          Interact with servers
    ui              Open the Nomad Web UI
    version         Prints the Nomad version
```

If you get an error that Nomad could not be found, then your Vagrant box
may not have provisioned correctly. Check for any error messages that may have
been emitted during `vagrant up`. You can always [destroy the box][destroy] and
re-create it.
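
A quick sketch of that recovery path:

```shell-session
$ vagrant destroy -f
$ vagrant up
```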

## Next Steps

Nomad is installed. Let's [start Nomad](/intro/getting-started/running)!

[nomad-install]: /docs/install#installing-nomad
[destroy]: https://www.vagrantup.com/docs/cli/destroy
[install-instructions]: https://www.vagrantup.com/docs/installation
@@ -1,345 +0,0 @@
---
layout: intro
page_title: Jobs
sidebar_title: Jobs
description: 'Learn how to submit, modify and stop jobs in Nomad.'
---

# Jobs

Jobs are the primary configuration that users interact with when using
Nomad. A job is a declarative specification of tasks that Nomad should run.
Jobs have a globally unique name and one or many task groups, which are
themselves collections of one or many tasks.

The format of jobs is documented in the [job specification][jobspec]. Jobs can
be specified in either [HashiCorp Configuration Language][hcl] or JSON;
however, we recommend using JSON only when the configuration is generated by a
machine.
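
To make the job, task group, and task hierarchy concrete, here is a minimal HCL
sketch in the shape of the `example` job generated below (the values are
illustrative, not the exact generated file):

```hcl
job "example" {
  # Where the job may be placed
  datacenters = ["dc1"]

  # A job contains one or many task groups
  group "cache" {
    count = 1

    # A task group contains one or many tasks
    task "redis" {
      driver = "docker"

      config {
        image = "redis:3.2"
      }
    }
  }
}
```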

## Running a Job

To get started, we will use the [`job init` command](/docs/commands/job/init),
which generates a skeleton job file:

```shell-session
$ nomad job init
Example job file written to example.nomad
```

You can view the contents of this file by running `cat example.nomad`. In this
example job file, we have declared a single task, 'redis', which uses the
Docker driver. The primary way you interact with Nomad is with the
[`job run` command](/docs/commands/job/run). The `run` command takes a job
file and registers it with Nomad. This is used both to register new jobs and
to update existing jobs.

We can register our example job now:

```shell-session
$ nomad job run example.nomad
==> Monitoring evaluation "13ebb66d"
    Evaluation triggered by job "example"
    Allocation "883269bf" created: node "e42d6f19", group "cache"
    Evaluation within deployment: "b0a84e74"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "13ebb66d" finished with status "complete"
```

Anytime a job is updated, Nomad creates an evaluation to determine what
actions need to take place. In this case, because this is a new job, Nomad has
determined that an allocation should be created and has scheduled it on our
local agent.

To inspect the status of our job we use the [`status` command](/docs/commands/status):

```shell-session
$ nomad status example
ID            = example
Name          = example
Submit Date   = 10/31/17 22:58:40 UTC
Type          = service
Priority      = 50
Datacenters   = dc1
Status        = running
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost
cache       0       0         1        0       0         0

Latest Deployment
ID          = b0a84e74
Status      = successful
Description = Deployment completed successfully

Deployed
Task Group  Desired  Placed  Healthy  Unhealthy
cache       1        1       1        0

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created  Modified
8ba85cef  171a583b  cache       0        run      running  5m ago   5m ago
```

Here we can see that the result of our evaluation was the creation of an
allocation that is now running on the local node.

An allocation represents an instance of a task group placed on a node. To
inspect an allocation we use the [`alloc status` command](/docs/commands/alloc/status):

```shell-session
$ nomad alloc status 8ba85cef
ID                  = 8ba85cef
Eval ID             = 13ebb66d
Name                = example.cache[0]
Node ID             = e42d6f19
Job ID              = example
Job Version         = 0
Client Status       = running
Client Description  = <none>
Desired Status      = run
Desired Description = <none>
Created             = 5m ago
Modified            = 5m ago
Deployment ID       = fa882a5b
Deployment Health   = healthy

Task "redis" is "running"
Task Resources
CPU        Memory           Disk     Addresses
8/500 MHz  6.3 MiB/256 MiB  300 MiB  db: 127.0.0.1:22672

Task Events:
Started At     = 10/31/17 22:58:49 UTC
Finished At    = N/A
Total Restarts = 0
Last Restart   = N/A

Recent Events:
Time                   Type        Description
10/31/17 22:58:49 UTC  Started     Task started by client
10/31/17 22:58:40 UTC  Driver      Downloading image redis:3.2
10/31/17 22:58:40 UTC  Task Setup  Building Task Directory
10/31/17 22:58:40 UTC  Received    Task received by client
```

We can see that Nomad reports the state of the allocation as well as its
current resource usage. By supplying the `-stats` flag, more detailed resource
usage statistics will be reported.
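
For example, reusing the allocation ID from above (output omitted here):

```shell-session
$ nomad alloc status -stats 8ba85cef
```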

To see the logs of a task, we can use the [`logs` command](/docs/commands/alloc/logs):

````shell-session
$ nomad alloc logs 8ba85cef redis
                _._
           _.-``__ ''-._
      _.-``    `.  `_.  ''-._           Redis 3.2.1 (00000000/0) 64 bit
  .-`` .-```.  ```\/    _.,_ ''-._
 (    '      ,       .-`  | `,    )     Running in standalone mode
 |`-._`-...-` __...-.``-._|'` _.-'|     Port: 6379
 |    `-._   `._    /     _.-'    |     PID: 1
  `-._    `-._  `-./  _.-'    _.-'
 |`-._`-._    `-.__.-'    _.-'_.-'|
 |    `-._`-._        _.-'_.-'    |           http://redis.io
  `-._    `-._`-.__.-'_.-'    _.-'
 |`-._`-._    `-.__.-'    _.-'_.-'|
 |    `-._`-._        _.-'_.-'    |
  `-._    `-._`-.__.-'_.-'    _.-'
      `-._    `-.__.-'    _.-'
          `-._        _.-'
              `-.__.-'
...
````

## Modifying a Job

The definition of a job is not static, and is meant to be updated over time.
You may update a job to change the Docker container, to update the application
version, or to change the count of a task group to scale with load.

For now, edit the `example.nomad` file to update the count and set it to 3:

```hcl
# The "count" parameter specifies the number of the task groups that should
# be running under this group. This value must be non-negative and defaults
# to 1.
count = 3
```

Once you have finished modifying the job specification, use the [`job plan`
command](/docs/commands/job/plan) to invoke a dry-run of the scheduler to see
what would happen if you ran the updated job:

```shell-session
$ nomad job plan example.nomad
+/- Job: "example"
+/- Task Group: "cache" (2 create, 1 in-place update)
  +/- Count: "1" => "3" (forces create)
      Task: "redis"

Scheduler dry-run:
- All tasks successfully allocated.

Job Modify Index: 7
To submit the job with version verification run:

nomad job run -check-index 7 example.nomad

When running the job with the check-index flag, the job will only be run if the
job modify index given matches the server-side version. If the index has
changed, another user has modified the job and the plan's results are
potentially invalid.
```

We can see that the scheduler detected the change in count and informs us that
it will cause 2 new instances to be created. The in-place update that will
occur is to push the updated job specification to the existing allocation and
will not cause any service interruption. We can then run the job with the
`run` command that `plan` emitted.

By running with the `-check-index` flag, Nomad checks that the job has not
been modified since the plan was run. This is useful if multiple people are
interacting with the job at the same time to ensure the job hasn't changed
before you apply your modifications.

```shell-session
$ nomad job run -check-index 7 example.nomad
==> Monitoring evaluation "93d16471"
    Evaluation triggered by job "example"
    Evaluation within deployment: "0d06e1b6"
    Allocation "3249e320" created: node "e42d6f19", group "cache"
    Allocation "453b210f" created: node "e42d6f19", group "cache"
    Allocation "883269bf" modified: node "e42d6f19", group "cache"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "93d16471" finished with status "complete"
```

Because we set the count of the task group to three, Nomad created two
additional allocations to get to the desired state. It is idempotent to run
the same job specification again; no new allocations will be created.

Now, let's try to do an application update. In this case, we will simply change
the version of redis we want to run. Edit the `example.nomad` file and change
the Docker image from "redis:3.2" to "redis:4.0":

```hcl
# Configure Docker driver with the image
config {
  image = "redis:4.0"
}
```

We can run `plan` again to see what will happen if we submit this change:

```text
+/- Job: "example"
+/- Task Group: "cache" (1 create/destroy update, 2 ignore)
  +/- Task: "redis" (forces create/destroy update)
    +/- Config {
      +/- image:           "redis:3.2" => "redis:4.0"
          port_map[0][db]: "6379"
        }

Scheduler dry-run:
- All tasks successfully allocated.

Job Modify Index: 1127
To submit the job with version verification run:

nomad job run -check-index 1127 example.nomad

When running the job with the check-index flag, the job will only be run if the
job modify index given matches the server-side version. If the index has
changed, another user has modified the job and the plan's results are
potentially invalid.
```

The plan output shows us that one allocation will be updated and that the other
two will be ignored. This is due to the `max_parallel` setting in the `update`
stanza, which is set to 1 to instruct Nomad to perform only a single change at
a time.

Once ready, use `run` to push the updated specification:

```shell-session
$ nomad job run example.nomad
==> Monitoring evaluation "293b313a"
    Evaluation triggered by job "example"
    Evaluation within deployment: "f4047b3a"
    Allocation "27bd4a41" created: node "e42d6f19", group "cache"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "293b313a" finished with status "complete"
```

After running, the rolling upgrade can be followed by running `nomad status`
and watching the deployed count.

We can see that Nomad handled the update in three phases, only updating a single
allocation in each phase and waiting for it to be healthy for the
`min_healthy_time` of 10 seconds before moving on to the next. The update
strategy can be configured, but rolling updates make it easy to upgrade an
application at large scale.

## Stopping a Job

So far we've created, run, and modified a job. The final step in a job lifecycle
is stopping the job. This is done with the [`job stop` command](/docs/commands/job/stop):

```shell-session
$ nomad job stop example
==> Monitoring evaluation "6d4cd6ca"
    Evaluation triggered by job "example"
    Evaluation within deployment: "f4047b3a"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "6d4cd6ca" finished with status "complete"
```

When we stop a job, it creates an evaluation which is used to stop all
the existing allocations. If we now query the job status, we can see it is
now marked as `dead (stopped)`, indicating that the job has been stopped and
Nomad is no longer running it:

```shell-session
$ nomad status example
ID            = example
Name          = example
Submit Date   = 11/01/17 17:30:40 UTC
Type          = service
Priority      = 50
Datacenters   = dc1
Status        = dead (stopped)
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost
cache       0       0         0        0       6         0

Latest Deployment
ID          = f4047b3a
Status      = successful
Description = Deployment completed successfully

Deployed
Task Group  Desired  Placed  Healthy  Unhealthy
cache       3        3       3        0

Allocations
ID        Node ID   Task Group  Version  Desired  Status    Created  Modified
8ace140d  2cfe061e  cache       2        stop     complete  5m ago   5m ago
8af5330a  2cfe061e  cache       2        stop     complete  6m ago   6m ago
df50c3ae  2cfe061e  cache       2        stop     complete  6m ago   6m ago
```

If we wanted to start the job again, we could simply `run` it again.

## Next Steps

Users of Nomad primarily interact with jobs, and we've now seen how to create
and scale our job, perform an application update, and tear a job down. Next we
will add another Nomad client to
[create our first cluster](/intro/getting-started/cluster).

[jobspec]: /docs/job-specification 'Nomad Job Specification'
[hcl]: https://github.com/hashicorp/hcl 'HashiCorp Configuration Language'
@@ -1,32 +0,0 @@
---
layout: intro
page_title: Next Steps
sidebar_title: Next Steps
description: >-
  After completing the getting started guide, learn about what to do next with
  Nomad.
---

# Next Steps

That concludes the getting started guide for Nomad. Hopefully you are
excited about the possibilities of Nomad and ready to put this knowledge
to use to improve your environment.

We've covered the basics of all the core features of Nomad in this guide.
We recommend exploring the following resources as next steps.

- [HashiCorp Learn Guides](http://learn.hashicorp.com/nomad) - The Guides provide best practices and
  guidance for using and operating Nomad in a real-world production setting.

- [Docs](/docs) - The Docs provide detailed reference information for
  all available features and options of Nomad.

- [Job Lifecycle](https://learn.hashicorp.com/collections/nomad/manage-jobs) - Additional details
  specific to running a job in production.

- [Creating a Cluster](https://learn.hashicorp.com/tutorials/nomad/clustering) - Additional
  details on joining nodes to create a multi-node Nomad cluster.

- [Example Terraform configuration](https://github.com/hashicorp/nomad/tree/master/terraform) -
  Use Terraform to automatically provision a cluster in AWS.
@@ -1,145 +0,0 @@
---
layout: intro
page_title: Running Nomad
sidebar_title: Running Nomad
description: 'Learn about the Nomad agent, and the lifecycle of running and stopping.'
---

# Running Nomad

Nomad relies on a long-running agent on every machine in the cluster.
The agent can run either in server or client mode. Each region must
have at least one server, though a cluster of 3 or 5 servers is recommended.
A single server deployment is _**highly**_ discouraged as data loss is inevitable
in a failure scenario.

All other agents run in client mode. A Nomad client is a very lightweight
process that registers the host machine, performs heartbeating, and runs the tasks
that are assigned to it by the servers. The agent must be run on every node that
is part of the cluster so that the servers can assign work to those machines.

## Starting the Agent

For simplicity, we will run a single Nomad agent in development mode. This mode
is used to quickly start an agent that is acting as a client and server to test
job configurations or prototype interactions. It should _**not**_ be used in
production as it does not persist state.

```shell-session
$ sudo nomad agent -dev

==> Starting Nomad agent...
==> Nomad agent configuration:

                Client: true
             Log Level: DEBUG
                Region: global (DC: dc1)
                Server: true

==> Nomad agent started! Log data will stream in below:

    [INFO] serf: EventMemberJoin: nomad.global 127.0.0.1
    [INFO] nomad: starting 4 scheduling worker(s) for [service batch _core]
    [INFO] client: using alloc directory /tmp/NomadClient599911093
    [INFO] raft: Node at 127.0.0.1:4647 [Follower] entering Follower state
    [INFO] nomad: adding server nomad.global (Addr: 127.0.0.1:4647) (DC: dc1)
    [WARN] fingerprint.network: Ethtool not found, checking /sys/net speed file
    [WARN] raft: Heartbeat timeout reached, starting election
    [INFO] raft: Node at 127.0.0.1:4647 [Candidate] entering Candidate state
    [DEBUG] raft: Votes needed: 1
    [DEBUG] raft: Vote granted. Tally: 1
    [INFO] raft: Election won. Tally: 1
    [INFO] raft: Node at 127.0.0.1:4647 [Leader] entering Leader state
    [INFO] raft: Disabling EnableSingleNode (bootstrap)
    [DEBUG] raft: Node 127.0.0.1:4647 updated peer set (2): [127.0.0.1:4647]
    [INFO] nomad: cluster leadership acquired
    [DEBUG] client: applied fingerprints [arch cpu host memory storage network]
    [DEBUG] client: available drivers [docker exec java]
    [DEBUG] client: node registration complete
    [DEBUG] client: updated allocations at index 1 (0 allocs)
    [DEBUG] client: allocs: (added 0) (removed 0) (updated 0) (ignore 0)
    [DEBUG] client: state updated to ready
```

As you can see, the Nomad agent has started and has output some log
data. From the log data, you can see that our agent is running in both
client and server mode, and has claimed leadership of the cluster.
Additionally, the local client has been registered and marked as ready.

-> **Note:** Typically any agent running in client mode must be run with root-level
privilege. Nomad makes use of operating system primitives for resource isolation
which require elevated permissions. The agent will function as non-root, but
certain task drivers will not be available.

## Cluster Nodes

If you run [`nomad node status`](/docs/commands/node/status) in another
terminal, you can see the registered nodes of the Nomad cluster:

```shell-session
$ nomad node status
ID        DC   Name   Class   Drain  Eligibility  Status
171a583b  dc1  nomad  <none>  false  eligible     ready
```

The output shows our Node ID, which is a randomly generated UUID,
its datacenter, node name, node class, drain mode, and current status.
We can see that our node is in the ready state, and task draining is
currently off.

The agent is also running in server mode, which means it is part of
the [gossip protocol](/docs/internals/gossip) used to connect all
the server instances together. We can view the members of the gossip
ring using the [`server members`](/docs/commands/server/members) command:

```shell-session
$ nomad server members
Name          Address    Port  Status  Leader  Protocol  Build  Datacenter  Region
nomad.global  127.0.0.1  4648  alive   true    2         0.7.0  dc1         global
```

The output shows our own agent, the address it is running on, its
health state, some version information, and the datacenter and region.
Additional metadata can be viewed by providing the `-detailed` flag.

## Stopping the Agent ((#stopping))

You can use `Ctrl-C` (the interrupt signal) to halt the agent.
By default, all signals will cause the agent to forcefully shut down.
The agent [can be configured](/docs/configuration#leave_on_terminate) to
gracefully leave on either the interrupt or terminate signals.
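
For example, a minimal agent configuration sketch that opts into graceful
leave on both signals (`leave_on_interrupt` and `leave_on_terminate` are the
agent options referenced here and below):

```hcl
# Gracefully leave the cluster when the agent receives SIGINT (Ctrl-C)
leave_on_interrupt = true

# Gracefully leave the cluster when the agent receives SIGTERM
leave_on_terminate = true
```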

After interrupting the agent, you should see it leave the cluster
and shut down:

```
^C==> Caught signal: interrupt
    [DEBUG] http: Shutting down http server
    [INFO] agent: requesting shutdown
    [INFO] client: shutting down
    [INFO] nomad: shutting down server
    [WARN] serf: Shutdown without a Leave
    [INFO] agent: shutdown complete
```

By gracefully leaving, Nomad clients update their status to prevent
further tasks from being scheduled and to start migrating any tasks that are
already assigned. Nomad servers notify their peers they intend to leave.
When a server leaves, replication to that server stops. If a server fails,
replication continues to be attempted until the node recovers. Nomad will
automatically try to reconnect to _failed_ nodes, allowing it to recover from
certain network conditions, while _left_ nodes are no longer contacted.

If an agent is operating as a server, [`leave_on_terminate`](/docs/configuration#leave_on_terminate) should only
be set if the server will never rejoin the cluster again. The default value of `false` for `leave_on_terminate` and `leave_on_interrupt`
works well for most scenarios. If Nomad servers are part of an auto scaling group where new servers are brought up to replace
failed servers, using graceful leave avoids causing a potential availability outage affecting the [consensus protocol](/docs/internals/consensus).
As of Nomad 0.8, Nomad includes Autopilot, which automatically removes failed or dead servers. This allows the operator to skip setting `leave_on_terminate`.

If a server does forcefully exit and will not be returning into service, the
[`server force-leave` command](/docs/commands/server/force-leave) should
be used to force the server from a _failed_ to a _left_ state.

## Next Steps

If you shut down the development Nomad agent as instructed above, ensure that it is back up and running again, and let's try to [run a job](/intro/getting-started/jobs)!
@@ -1,63 +0,0 @@
---
layout: intro
page_title: Web UI
sidebar_title: Web UI
description: 'Visit the Nomad Web UI to inspect jobs, allocations, and more.'
---

# Web UI

At this point we have a fully functioning cluster with a job running in it. We have
learned how to inspect a job using `nomad status`; next, we'll learn how to inspect
a job in the web UI.

## Opening the Web UI

As long as Nomad is running, the Nomad UI is also running. It is hosted at the same address
and port as the Nomad HTTP API, under the `/ui` namespace.

With Nomad running, visit `http://localhost:4646` to open the Nomad UI.
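
If you are running Nomad directly on your workstation (rather than inside the
Vagrant box), the `ui` subcommand listed in the CLI help output earlier should
open the same page in your default browser:

```shell-session
$ nomad ui
```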

[![Nomad UI Jobs List][img-jobs-list]][img-jobs-list]

If you can't connect, it's possible that Vagrant was unable to properly map the
port from your host to the VM. Your `vagrant up` output will contain the new
port mapping:

```text
==> default: Fixed port collision for 4646 => 4646. Now on port 2200.
```

In the case above you would connect to `http://localhost:2200` instead.

## Inspecting a Job

You should be automatically redirected to `/ui/jobs` upon visiting the UI in your browser. This
page lists all jobs known to Nomad, regardless of status. Click the `example` job to inspect it.

[![Nomad UI Job Detail][img-job-detail]][img-job-detail]

The job detail page shows pertinent information about the job, including overall status as well as
allocation statuses broken down by task group. It is similar to the `nomad status` CLI command.

Click on the `cache` task group to drill into the task group detail page. This page lists each allocation
for the task group.

[![Nomad UI Task Group Detail][img-task-group-detail]][img-task-group-detail]

Click on the allocation in the allocations table. This page lists all tasks for an allocation as well
as the recent events for each task. It is similar to the `nomad alloc status` command.

[![Nomad UI Alloc Status][img-alloc-status]][img-alloc-status]

The Nomad UI offers a friendly and visual alternative experience to the CLI.

## Next Steps

We've now concluded the getting started guide; however, there are a number
of [next steps](/intro/getting-started/next-steps) to get started with Nomad.

[img-jobs-list]: /img/intro-ui-jobs-list.png
[img-job-detail]: /img/intro-ui-job-detail.png
[img-task-group-detail]: /img/intro-ui-task-group-detail.png
[img-alloc-status]: /img/intro-ui-alloc-status.png