Always wait 200ms before calling the Node.UpdateAlloc RPC to send allocation updates to servers.

Prior to this change we only reset the update ticker when an error was encountered. This meant the 200ms ticker was running while the RPC was being performed. If the RPC was slow due to network latency or server load and took >=200ms, the ticker would tick during the RPC.

Then on the next loop the select would randomly choose between the two viable cases: receive an update or fire the RPC again.

If the RPC case won, it would immediately loop again because there were no updates to send.

When the update chan receive is selected, a single update is added to the slice. The odds are then 50/50 that the subsequent loop will send the single update instead of receiving any more updates.

This could cause a couple of problems:

1. Since only a small number of updates are sent, the chan buffer may fill, applying backpressure and slowing down other client operations.
2. The small number of updates sent may already be stale and not represent the current state of the allocation locally.

A risk here is that it's hard to reason about how this will interact with the 50ms batches on servers when the servers are under load.

A further improvement would be to completely remove the alloc update chan and instead use a mutex to build a map of alloc updates. I wanted to test the lowest-risk change possible on loaded servers first before making more drastic changes.
Nomad

Overview
Nomad is an easy-to-use, flexible, and performant workload orchestrator.
Nomad enables developers to use declarative infrastructure-as-code for deploying their applications (jobs). Nomad uses bin packing to efficiently schedule jobs and optimize for resource utilization. Nomad is supported on macOS, Windows, and Linux.
Nomad is widely adopted and used in production by PagerDuty, Cloudflare, Roblox, Pandora, and more.
- Deploy Containers and Legacy Applications: Nomad’s flexibility as an orchestrator enables an organization to run containers, legacy, and batch applications together on the same infrastructure. Through pluggable task drivers, Nomad brings core orchestration benefits to legacy applications without requiring them to be containerized.
- Simple & Reliable: Nomad runs as a single binary and is entirely self-contained, combining resource management and scheduling into a single system. Nomad does not require any external services for storage or coordination. Nomad automatically handles application, node, and driver failures. Nomad is distributed and resilient, using leader election and state replication to provide high availability in the event of failures.
- Device Plugins & GPU Support: Nomad offers built-in support for GPU workloads such as machine learning (ML) and artificial intelligence (AI). Nomad uses device plugins to automatically detect and utilize resources from hardware devices such as GPUs, FPGAs, and TPUs.
- Federation for Multi-Region, Multi-Cloud: Nomad was designed to support infrastructure at a global scale. Nomad supports federation out-of-the-box and can deploy applications across multiple regions and clouds.
- Proven Scalability: Nomad is optimistically concurrent, which increases throughput and reduces latency for workloads. Nomad has been proven to scale to clusters of 10K+ nodes in real-world production environments.
- HashiCorp Ecosystem: Nomad integrates seamlessly with Terraform, Consul, and Vault for provisioning, service discovery, and secrets management.
Getting Started
Get started with Nomad quickly in a sandbox environment on the public cloud or on your computer.
- Local
- AWS
- Azure
- GCP
These methods are not meant for production.
Documentation & Guides
Documentation is available on the Nomad website. Guides are available on the HashiCorp Learn website.
Resources
- Website
- Mailing List
- Gitter
Who Uses Nomad
- Roblox
- Cloudflare
- BetterHelp
- Navi Capital
- Trivago
- Reaktor
- Pandora
- CircleCI
- Q2
- Citadel
- Deluxe Entertainment
- Jet.com (Walmart)
- PagerDuty
- SAP Ariba
- Target
- Oscar Health
- eBay
- Dutch National Police
- N26
- Elsevier
- Graymeta
- NIH NCBI
- imgix
...and more!
Contributing
See the contributing directory for more developer documentation.
Developing with Vagrant
A development environment is supplied via Vagrant to make getting started easier.
- Install Vagrant
- Install VirtualBox
- Bring up the Vagrant project
$ git clone https://github.com/hashicorp/nomad.git
$ cd nomad
$ vagrant up
The virtual machine will launch, and a provisioning script will install the needed dependencies within the VM.
Developing without Vagrant
- Install Go 1.15.5+ (Note: gcc-go is not supported)
- Clone this repo
$ git clone https://github.com/hashicorp/nomad.git
$ cd nomad
- Bootstrap your environment
$ make bootstrap
- (Optionally) Set a higher ulimit, as Nomad creates many file handles during normal operations
$ [ "$(ulimit -n)" -lt 1024 ] && ulimit -n 1024
- Verify you can run tests
$ make test
Running a development build
- Compile a development binary (see the UI README to include the web UI in the binary)
$ make dev # find the built binary at ./bin/nomad
- Start the agent in dev mode
$ sudo bin/nomad agent -dev
- (Optionally) Run Consul to enable service discovery and health checks
  - Download Consul
  - Start Consul in dev mode
  $ consul agent -dev
Compiling Protobufs
If in the course of your development you change a Protobuf file (those ending in .proto), you'll need to recompile the protos.
- Install Buf
- Compile Protobufs
$ make proto
Building the Web UI
See the UI README for instructions.
Create a release binary
To create a release binary:
$ make prerelease
$ make release
$ ls ./pkg
This will generate all the static assets, compile Nomad for multiple
platforms and place the resulting binaries into the ./pkg directory.
API Compatibility
Only the api/ and plugins/ packages are intended to be imported by other projects. The root Nomad module does not follow semver and is not intended to be imported directly by other projects.