From 8e8975bf9f38d52056af1178eb2e18cd46a3f42f Mon Sep 17 00:00:00 2001 From: Clint Shryock Date: Thu, 24 Sep 2015 10:50:20 -0500 Subject: [PATCH] doc updates --- .../docs/internals/architecture.html.md | 51 ++++++++++--------- website/source/docs/internals/index.html.md | 2 +- 2 files changed, 28 insertions(+), 25 deletions(-) diff --git a/website/source/docs/internals/architecture.html.md b/website/source/docs/internals/architecture.html.md index ffefe87c1..94311de24 100644 --- a/website/source/docs/internals/architecture.html.md +++ b/website/source/docs/internals/architecture.html.md @@ -23,28 +23,40 @@ Before describing the architecture, we provide a glossary of terms to help clarify what is being discussed: * **Job** - A Job is a specification provided by users that declares a workload for - Nomad. A Job is a form of _desired state_, the user is expressing that the job should + Nomad. A Job is a form of _desired state_; the user is expressing that the job should be running, but not where it should be run. The responsibility of Nomad is to make sure the _actual state_ matches the user desired state. A Job is composed of one or more task groups. * **Task Group** - A Task Group is a set of tasks that must be run together. For example, a web server may require that a log shipping co-process is always running as well. A task - group is the unit of scheduling, meaning the entire group must run on a client node and + group is the unit of scheduling, meaning the entire group must run on the same client node and cannot be split. +* **Driver** – A Driver represents the basic means of executing your **Tasks**. + Example Drivers include Docker, Qemu, Java, and static binaries. + * **Task** - A Task is the smallest unit of work in Nomad. Tasks are executed by drivers, - which allow Nomad to be flexible in the types of tasks it supports. Examples include Docker, - Qemu, Java and static binaries. Tasks specify their driver, configuration for the driver, - constraints, and resources required. + which allow Nomad to be flexible in the types of tasks it supports. Tasks + specify their driver, configuration for the driver, constraints, and resources required. * **Client** - A Client of Nomad is a machine that tasks can be run on. All clients run the Nomad agent. The agent is responsible for registering with the servers, watching for any work to be assigned and executing tasks. The Nomad agent is a long lived process which interfaces with the servers. +* **Allocation** - An Allocation is a mapping between a task group in a job and a client + node. A single job may have hundreds or thousands of task groups, meaning an equivalent + number of allocations must exist to map the work to client machines. Allocations are created + by the Nomad servers as part of scheduling decisions made during an evaluation. + +* **Evaluation** - Evaluations are the mechanism by which Nomad makes scheduling decisions. + When either the _desired state_ (jobs) or _actual state_ (clients) changes, Nomad creates + a new evaluation to determine if any actions must be taken. An evaluation may result + in changes to allocations if necessary. + * **Server** - Nomad servers are the brains of the cluster. There is a cluster of servers - per region and they manage all jobs and clients, run evalutations and create task allocations. + per region and they manage all jobs and clients, run evaluations, and create task allocations. The servers replicate data between each other and perform leader election to ensure high availability. Servers federate across regions to make Nomad globally aware. @@ -55,16 +67,6 @@ clarify what is being discussed: have a "US" region with the "us-east-1" and "us-west-1" datacenters, connected to the "EU" region with the "eu-fr-1" and "eu-uk-1" datacenters. -* **Evaluation** - Evaluations are the mechanism by which Nomad makes scheduling decisions. - When either the _desired state_ (jobs) or _actual state_ (clients) changes, Nomad creates - a new evaluation to determine if any actions must be taken. An evaluation may result - in changes to allocations if necessary. - -* **Allocation** - An Allocation is a mapping between a task group in a job, and a client - node. A single job may have hundreds or thousands of task groups, meaning an equivalent - number of allocations must exist to map the work to client machines. Allocations are created - by the Nomad servers as part of scheduling decisions made during an evaluation. - * **Bin Packing** - Bin Packing is the process of filling bins with items in a way that maximizes the utilization of bins. This extends to Nomad, where the clients are "bins" and the items are task groups. Nomad optimizes resources by efficiently bin packing @@ -72,7 +74,7 @@ clarify what is being discussed: # High-Level Overview -Looking at only a single region, at a high level Nomad looks like: +Looking at only a single region, at a high level Nomad looks like this: [![Regional Architecture](/assets/images/nomad-architecture-region.png)](/assets/images/nomad-architecture-region.png) @@ -83,18 +85,18 @@ to handle very large clusters. In some cases, for either availability or scalability, you may need to run multiple regions. Nomad supports federating multiple regions together into a single cluster. -At a high level, this looks like: +At a high level, this setup looks like this: [![Global Architecture](/assets/images/nomad-architecture-global.png)](/assets/images/nomad-architecture-global.png) -Regions are fully independent from each other, and do not share jobs, clients or +Regions are fully independent from each other, and do not share jobs, clients, or state. They are loosely-coupled using a gossip protocol, which allows users to submit jobs to any region or query the state of any region transparently. Requests are forwarded to the appropriate server to be processed and the results returned. The servers in each datacenter are all part of a single consensus group. This means that they work together to elect a single leader which has extra duties. The leader -is responsible for processing all queries and transactions. Nomad is an optimistically +is responsible for processing all queries and transactions. Nomad is optimistically concurrent, meaning all servers participate in making scheduling decisions in parallel. The leader provides the additional coordination necessary to do this safely and to ensure clients are not oversubscribed. @@ -105,19 +107,19 @@ progressively slower as more servers are added. However, there is no limit to th of clients per region. Clients are configured to communicate with their regional servers and communicate -using remote procedure calls (RPC) to register themselves, heartbeat for liveness, +using remote procedure calls (RPC) to register themselves, send heartbeats for liveness, wait for new allocations, and update the status of allocations. A client registers with the servers to provide the resources available, attributes, and installed drivers. Servers use this information for scheduling decisions and create allocations to assign work to clients. Users make use of the Nomad CLI or API to submit jobs to the servers. A job represents -a desired state, providing the set of tasks that should be run. The servers are -responsible for scheduling the tasks, which is done by finding an optimial placement for +a desired state and provides the set of tasks that should be run. The servers are +responsible for scheduling the tasks, which is done by finding an optimal placement for each task such that resource utilization is maximized while satisfying all constraints specified by the job. Resource utilization is maximized by bin packing, in which the scheduling tries to make use of all the resources of a machine without -exhausing any dimension. Job constraints can be used to ensure an application is +exhausting any dimension. Job constraints can be used to ensure an application is running in an appropriate environment. Constraints can be technical requirements based on hardware features such as architecture, availability of GPUs, or software features like operating system and kernel version, or they can be business constraints like @@ -131,3 +133,4 @@ are more details available for each of the sub-systems. The [scheduler design](/ are all documented in more detail. For other details, either consult the code, ask in IRC or reach out to the mailing list. + diff --git a/website/source/docs/internals/index.html.md b/website/source/docs/internals/index.html.md index bbaf136c3..6386bd057 100644 --- a/website/source/docs/internals/index.html.md +++ b/website/source/docs/internals/index.html.md @@ -9,7 +9,7 @@ description: |- # Nomad Internals This section covers the internals of Nomad and explains the technical -details of how Nomad functions, its architecture and sub-systems. +details of how Nomad functions, its architecture, and sub-systems. -> **Note:** Knowledge of Nomad internals is not required to use Nomad. If you aren't interested in the internals