If a Nomad job is started with a large number of instances (e.g. 4 billion),
then the Nomad servers that attempt to schedule it will run out of memory and
crash. While it's unlikely that anyone would intentionally schedule a job with 4
billion instances, we have occasionally run into issues with bugs in external
automation. For example, an automated deployment system running on a test
environment had an off-by-one error, and deployed a job with count = uint32(-1),
causing the Nomad servers for that environment to run out of memory and crash.
To prevent this, this PR introduces a job_max_count Nomad server configuration
parameter. job_max_count limits the number of allocs that may be created from a
job. The default value is 50000 - this is low enough that a job with the maximum
possible number of allocs will not require much memory on the server, but is
still much higher than the number of allocs in the largest Nomad job we have
ever run.
* Add preserve-resources flag when registering a job
* Add preserve-resources flag to website docs
* Add changelog
* Update tests, docs
* Preserve counts & resources in fsm
* Update doc
* Update preservation of resources/count to happen in StateStore
ACL policies are parsed when creating, updating, or compiling the
resulting ACL object when used. This parsing was silently ignoring
duplicate singleton keys, or invalid keys which does not grant any
additional access, but is a poor UX and can be unexpected.
This change parses all new policy writes and updates, so that
duplicate or invalid keys return an error to the caller. This is
called strict parsing. In order to correctly handle upgrades of
clusters which have existing policies that would fall foul of the
change, a lenient parsing mode is also available. This allows
the policy to continue to be parsed and compiled after an upgrade
without the need for an operator to correct the policy document
prior to further use.
Co-authored-by: Tim Gross <tgross@hashicorp.com>
Expand on the documentation of allocation garbage collection:
* Explain that server-side GC of allocations is tied to the GC of the
evaluation that spawned the allocation.
* Explain that server-side GC of allocations will force them to be immediately
GC'd on the client regardless of the client-side configurations.
Ref: https://github.com/hashicorp/nomad/issues/26765
Co-authored-by: Aimee Ukasick <Aimee.Ukasick@ibm.com>
Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
The metrics on the eval broker include labels for the job ID, but under a high
volume of dispatch workloads, this results in excessive heap usage on the
leader. Dispatch workloads should use their parent ID rather than their child ID
for any metrics we collect.
Also, eliminate an extra copy of the labels. And remove the extremely high
cardinality `"eval_id"` label from the `nomad.broker.eval_waiting` metric.
Fixes: https://github.com/hashicorp/nomad/issues/26657
When configuring Nomad Enterprise with Consul Enterprise and multiple
namespaces, you need to include the `consul_namespace` mapping in the auth
method configuration. Otherwise you'll see an error like "unknown variable
accessed: value.consul_namespace". There's no example of the updated auth method
configuration you need, which makes this detail unclear when we're showing the
claim being used in the following `consul acl auth-method create` command.
Typically the `LOGNAME` environment variable should be set according
to the values within `/etc/passwd` and represents the name of the
logged in user. This should be set, where possible, alongside the
USER and HOME variables for all drivers that use the shared
executor and do not use a sub-shell.
don't require "bridge" network mode when using connect{}
we document this as "at your own risk" because CNI configuration
is so flexible that we can't guarantee a user's network will work,
but Nomad's "bridge" CNI config may be used as a reference.
Adds a new `windows` command which is available when running on
a Windows hosts. The command includes two new subcommands:
* `service install`
* `service uninstall`
The `service install` command will install the called binary into
the Windows program files directory, create a new Windows service,
setup configuration and data directories, and register the service
with the Window eventlog. If the service and/or binary already
exist, the service will be stopped, service and eventlog updated
if needed, binary replaced, and the service started again.
The `service uninstall` command will stop the service, remove the
Windows service, and deregister the service with the eventlog. It
will not remove the configuration/data directory nor will it remove
the installed binary.
This adds artifact inspection after download to detect any issues
with the content fetched. Currently this means checking for any
symlinks within the artifact that resolve outside the task or
allocation directories. On platforms where lockdown is available
(some Linux) this inspection is not performed.
The inspection can be disabled with the DisableArtifactInspection
option. A dedicated option for disabling this behavior allows
the DisableFilesystemIsolation option to be enabled but still
have artifacts inspected after download.
* docs: revert to labels={"foo.bar": "baz"} style
Back in #24074 I thought it was necessary to wrap labels in a list to
support quoted keys in hcl2. This... doesn't appear to be true at all?
The simpler `labels={...}` syntax appears to work just fine.
I updated the docs and a test (and modernized it a bit). I also switched
some other examples to the `labels = {}` format from the old `labels{}`
format.
* copywronged
* fmtd
The go-metrics library retains Prometheus metrics in memory until expiration,
but the expiration logic requires that the metrics are being regularly
scraped. If you don't have a Prometheus server scraping, this leads to
ever-increasing memory usage. In particular, high volume dispatch workloads emit
a large set of label values and if these are not eventually aged out the bulk of
Nomad server memory can end up consumed by metrics.
* fix(doc): fix links for task driver plugins
host URL was wrong, changed from develoepr to developer
* Update stateful-workloads.mdx
Fix link for Nomad event stream page
In https://github.com/hashicorp/nomad/issues/15459 we've had a bit of
back-and-forth as a result of applying Nomad environment variables where they
typically should not be used. Clarify that the env vars are for the CLI and
mostly not for the agent. Also move the `NOMAD_CLI_SHOW_HINTS` description into
the correct section.
The docs for the `template` block accurately describe the template configuration
default function denylist in the body but the default parameters are missing
values. The equivalent docs in the `client` configuration are missing
`executeTemplate` as well.
* Add -log-file-export and -log-lookback commands to add historical log to
debug capture
* use monitor.PrepFile() helper for other historical log tests
* Add MonitorExport command and handlers
* Implement autocomplete
* Require nomad in serviceName
* Fix race in StreamReader.Read
* Add and use framer.Flush() to coordinate function exit
* Add LogFile to client/Server config and read NomadLogPath in rpcHandler instead of HTTPServer
* Parameterize StreamFixed stream size
Affinities and contraints use similar feasibility checking logic to determine if
a given node matches (although affinities don't support all the same
operators). Most operators don't allow `value` to be unset. Update the docs to
reflect this.
Fixes: https://github.com/hashicorp/nomad/issues/24983
During the big docs rearchitecture, we split up the task driver pages into
separate job declaration and driver configuration pages. The link for the
`raw_exec` driver to the configuration page is a self-reference.
The documentation for CSI and DHV has a list of the available access modes, but
doesn't explain what they mean in terms of what jobs can request, the scheduler
behavior, or the CSI plugin behavior. Expand on the information available in the
CSI specification and provide a description of DHV's behavior as well.
Ref: https://github.com/container-storage-interface/spec/blob/master/spec.md#createvolume
The current autoscaler docs implies that it has minimal or non-working support
for Nomad namespaces. Whereas in fact the namespace support works fine but just
doesn't allow configuring multiple namespaces without using a wildcard (for
now). Make this more clear and fix the reference to the configuration "below",
which is no longer on that same page.
Ref: https://github.com/hashicorp/nomad-autoscaler/issues/65