---
layout: docs
page_title: Architecture
description: |-
  Nomad's system architecture supports a "run any workload anywhere" approach to scheduling and orchestration. Learn how regions, servers, and clients interact, and how Nomad clusters replicate data and run workloads.
---

# Architecture

This page provides conceptual information on Nomad architecture. Learn how regions, servers, and clients interact, and how Nomad clusters replicate data and run workloads.

Nomad is a complex system that has many different pieces. To help both users and
developers of Nomad build a mental model of how it works, this page documents
the system architecture.

Refer to the [glossary][] for more details on some of the terms discussed here.

~> **Advanced Topic!** This page covers technical details
of Nomad. You do not need to understand these details to
effectively use Nomad. The details are documented here for
those who wish to learn about them without having to go
spelunking through the source code.

## High-Level Overview

At a high level, a single Nomad region looks like this:

[![Regional Architecture](/img/nomad-architecture-region.png)](/img/nomad-architecture-region.png)

Within each region, there are both clients and servers. Servers are responsible for
accepting jobs from users, managing clients, and [computing task placements](/nomad/docs/concepts/scheduling/how-scheduling-works).
Each region may have clients from multiple datacenters, allowing a small number of servers
to handle very large clusters.

In some cases, for either availability or scalability, you may need to run multiple
regions. Nomad supports federating multiple regions together into a single cluster.
At a high level, this setup looks like this:

[![Global Architecture](/img/nomad-architecture-global.png)](/img/nomad-architecture-global.png)

Regions are fully independent of each other and do not share jobs, clients, or
state. They are loosely coupled using a gossip protocol, which allows users to
submit jobs to any region or query the state of any region transparently. Requests
are forwarded to the appropriate server to be processed, and the results are returned.
Data is _not_ replicated between regions.
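
The forwarding behavior described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not Nomad's implementation: the `Region` class, its `peers` map, and the `handle` method are hypothetical names, and the gossip layer is reduced to a plain dictionary of known regions.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """Hypothetical model of one region: its own jobs plus the peers it knows about."""
    name: str
    jobs: dict = field(default_factory=dict)   # job state is local to this region
    peers: dict = field(default_factory=dict)  # region name -> Region (stands in for gossip)

    def join(self, other: "Region") -> None:
        # Federation only exchanges membership; no job state is copied.
        self.peers[other.name] = other
        other.peers[self.name] = self

    def handle(self, target_region: str, job_id: str, spec=None):
        """Submit (spec given) or query (spec omitted) a job in target_region."""
        if target_region != self.name:
            if target_region not in self.peers:
                raise LookupError(f"unknown region: {target_region}")
            # Forward to the owning region and relay its answer.
            return self.peers[target_region].handle(target_region, job_id, spec)
        if spec is not None:
            self.jobs[job_id] = spec
        return self.jobs.get(job_id)


us = Region("us-east")
eu = Region("eu-west")
us.join(eu)

# Submit to eu-west through us-east, then read the job back the same way.
us.handle("eu-west", "web", {"count": 3})
print(us.handle("eu-west", "web"))  # {'count': 3}
print(us.jobs)                      # {} because nothing was replicated into us-east
```

The caller only ever talks to `us-east`, yet the job lives solely in `eu-west`, which mirrors the "forwarded, not replicated" distinction.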

The servers in each region are all part of a single consensus group. This means
that they work together to elect a single leader, which has extra duties. The leader
is responsible for processing all queries and transactions. Nomad is optimistically
concurrent, meaning all servers participate in making scheduling decisions in parallel.
The leader provides the additional coordination necessary to do this safely and
to ensure clients are not oversubscribed.
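
One way to picture optimistic concurrency is the sketch below: schedulers plan in parallel against a snapshot of cluster state, and the leader re-checks each plan against current state before committing it. The `Leader` class, `make_plan` function, and the single CPU dimension are assumptions made for illustration; they are not Nomad's internal types.

```python
class Leader:
    """Hypothetical leader that serializes plan application for one region."""

    def __init__(self, capacity):
        self.capacity = dict(capacity)        # client -> total CPU (MHz)
        self.used = {c: 0 for c in capacity}  # client -> CPU already allocated

    def snapshot(self):
        # Schedulers plan against a copy, so they never block each other.
        return dict(self.used)

    def apply(self, plan):
        """plan = (client, cpu). Re-check against current state before committing."""
        client, cpu = plan
        if self.used[client] + cpu > self.capacity[client]:
            return False  # stale plan: the scheduler must retry with fresh state
        self.used[client] += cpu
        return True


def make_plan(snapshot, capacity, cpu):
    """Pick the first client that appears to have room in the snapshot."""
    for client, used in snapshot.items():
        if used + cpu <= capacity[client]:
            return (client, cpu)
    return None


leader = Leader({"client-1": 1000, "client-2": 1000})

# Two schedulers take the same snapshot and independently choose client-1.
snap_a, snap_b = leader.snapshot(), leader.snapshot()
plan_a = make_plan(snap_a, leader.capacity, 700)
plan_b = make_plan(snap_b, leader.capacity, 700)

print(leader.apply(plan_a))  # True: 700 of 1000 MHz is now allocated on client-1
print(leader.apply(plan_b))  # False: committing it would oversubscribe client-1

# The rejected scheduler replans against fresh state and lands on client-2.
retry = make_plan(leader.snapshot(), leader.capacity, 700)
print(retry, leader.apply(retry))  # ('client-2', 700) True
```

Planning is cheap to parallelize because conflicts are detected at commit time rather than prevented with locks.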

Each region is expected to have either three or five servers. This strikes a balance
between availability in the case of failure and performance, as consensus gets
progressively slower as more servers are added. However, there is no limit to the number
of clients per region.
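
The three-or-five guidance follows from majority-quorum arithmetic, assuming a consensus protocol in which a majority of servers must agree before a change commits. The helper names below are illustrative only.

```python
def quorum(servers: int) -> int:
    """Smallest majority of a consensus group with `servers` members."""
    return servers // 2 + 1


def failure_tolerance(servers: int) -> int:
    """How many servers can be lost while a majority remains."""
    return servers - quorum(servers)


for n in range(1, 8):
    print(f"{n} servers: quorum {quorum(n)}, tolerates {failure_tolerance(n)} failure(s)")
```

Three servers tolerate one failure and five tolerate two. Adding a fourth or sixth server raises the quorum size without raising the failure tolerance, which is why odd group sizes are the usual choice.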

Clients are configured to communicate with their regional servers, using remote
procedure calls (RPC) to register themselves, send heartbeats for liveness,
wait for new allocations, and update the status of allocations. A client registers
with the servers to report its available resources, attributes, and installed drivers.
Servers use this information for scheduling decisions and create allocations to assign
work to clients.
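
A minimal sketch of the registration and heartbeat bookkeeping described above, from the server's point of view. The `HeartbeatTracker` class, the 30-second TTL, and the sample attribute values are assumptions for illustration; real clients speak RPC and the actual timing is managed by the servers.

```python
class HeartbeatTracker:
    """Hypothetical server-side view of client liveness."""

    def __init__(self, ttl: float):
        self.ttl = ttl        # seconds a client may stay silent (assumed value)
        self.last_seen = {}   # client ID -> time of last heartbeat
        self.nodes = {}       # client ID -> registration payload

    def register(self, client_id, resources, attributes, drivers, now):
        # Registration carries what the scheduler needs to know about the client.
        self.nodes[client_id] = {
            "resources": resources,
            "attributes": attributes,
            "drivers": drivers,
        }
        self.last_seen[client_id] = now

    def heartbeat(self, client_id, now):
        if client_id not in self.nodes:
            raise KeyError(f"{client_id} must register first")
        self.last_seen[client_id] = now

    def alive(self, now):
        """Clients whose last heartbeat is within the TTL."""
        return sorted(c for c, t in self.last_seen.items() if now - t <= self.ttl)


tracker = HeartbeatTracker(ttl=30.0)
tracker.register("client-1", {"cpu": 4000, "memory": 8192}, {"os": "linux"}, ["docker"], now=0.0)
tracker.register("client-2", {"cpu": 2000, "memory": 4096}, {"os": "linux"}, ["exec"], now=0.0)

tracker.heartbeat("client-1", now=25.0)
print(tracker.alive(now=40.0))  # ['client-1']
```

Timestamps are passed in explicitly so the example is deterministic; `client-2` drops out because it has been silent for longer than the TTL.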

Users make use of the Nomad CLI or API to submit jobs to the servers. A job represents
a desired state and provides the set of tasks that should be run. The servers are
responsible for scheduling the tasks, which is done by finding an optimal placement for
each task such that resource utilization is maximized while satisfying all constraints
specified by the job. Resource utilization is maximized by bin packing, in which
the scheduler tries to make use of all the resources of a machine without
exhausting any dimension. Job constraints can be used to ensure an application is
running in an appropriate environment. Constraints can be technical requirements based
on hardware features such as architecture and availability of GPUs, or software features
like operating system and kernel version, or they can be business constraints like
ensuring PCI-compliant workloads run on appropriate machines.
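
The interplay of constraints and bin packing can be sketched as a filter step followed by a scoring step. This is a deliberately simplified model: the `Node` type, the attribute keys, and the mean-utilization score are assumptions for illustration and do not reproduce Nomad's actual ranking function.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    attrs: dict        # e.g. {"arch": "amd64", "os": "linux"}
    cpu: int           # total CPU in MHz
    mem: int           # total memory in MB
    cpu_used: int = 0
    mem_used: int = 0


def feasible(node, constraints, cpu, mem):
    """A node qualifies if it matches every constraint and has room in both dimensions."""
    matches = all(node.attrs.get(k) == v for k, v in constraints.items())
    fits = node.cpu_used + cpu <= node.cpu and node.mem_used + mem <= node.mem
    return matches and fits


def score(node, cpu, mem):
    """Mean utilization after placement; higher means a tighter pack (simplified)."""
    cpu_util = (node.cpu_used + cpu) / node.cpu
    mem_util = (node.mem_used + mem) / node.mem
    return (cpu_util + mem_util) / 2


def place(nodes, constraints, cpu, mem):
    """Place one task on the best-scoring feasible node, or return None."""
    candidates = [n for n in nodes if feasible(n, constraints, cpu, mem)]
    if not candidates:
        return None
    best = max(candidates, key=lambda n: score(n, cpu, mem))
    best.cpu_used += cpu
    best.mem_used += mem
    return best.name


nodes = [
    Node("idle", {"arch": "amd64", "os": "linux"}, cpu=4000, mem=8192),
    Node("busy", {"arch": "amd64", "os": "linux"}, cpu=4000, mem=8192, cpu_used=2000, mem_used=4096),
    Node("arm", {"arch": "arm64", "os": "linux"}, cpu=4000, mem=8192, cpu_used=3000, mem_used=6144),
]

print(place(nodes, {"arch": "amd64"}, cpu=1000, mem=2048))  # busy
print(place(nodes, {"arch": "amd64"}, cpu=1500, mem=1024))  # idle
```

The first task goes to `busy` because packing it there leaves `idle` entirely free, and `arm` is excluded by the constraint even though it is the fullest node. The second task no longer fits on `busy`, so it falls through to `idle`.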

## Getting in Depth

This has been a brief high-level overview of the architecture of Nomad. There
are more details available for each of the sub-systems. The [consensus protocol](/nomad/docs/architecture/cluster/consensus),
[gossip protocol](/nomad/docs/architecture/security/gossip), and [scheduler design](/nomad/docs/concepts/scheduling/how-scheduling-works)
are all documented in more detail.

For other details, either consult the code, [open an issue on
GitHub][gh_issue], or ask a question in the [community forum][forum].

[gh_issue]: https://github.com/hashicorp/nomad/issues/new/choose
[forum]: https://discuss.hashicorp.com/c/nomad
[glossary]: /nomad/docs/glossary