If a Nomad job is started with a large number of instances (e.g. 4 billion),
then the Nomad servers that attempt to schedule it will run out of memory and
crash. While it's unlikely that anyone would intentionally schedule a job with 4
billion instances, we have occasionally run into issues with bugs in external
automation. For example, an automated deployment system running on a test
environment had an off-by-one error, and deployed a job with count = uint32(-1),
causing the Nomad servers for that environment to run out of memory and crash.
To prevent this, this PR introduces a job_max_count Nomad server configuration
parameter. job_max_count limits the number of allocs that may be created from a
job. The default value is 50000 - this is low enough that a job with the maximum
possible number of allocs will not require much memory on the server, but is
still much higher than the number of allocs in the largest Nomad job we have
ever run.
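As a rough illustration of the guard this enables (the `JobMaxCount` field and `validateJobCount` helper are invented names, not Nomad's actual code):

```go
package nomad

import "fmt"

// ServerConfig is a pared-down stand-in for the real server configuration;
// only the field relevant here is shown.
type ServerConfig struct {
	// JobMaxCount caps the total number of allocations a single job may
	// request across all of its task groups (default 50000 per the change
	// described above).
	JobMaxCount int
}

// validateJobCount is a hypothetical helper that rejects oversized jobs at
// registration time, before the scheduler ever tries to place them.
func validateJobCount(cfg *ServerConfig, groupCounts []int) error {
	total := 0
	for _, count := range groupCounts {
		total += count
	}
	if total > cfg.JobMaxCount {
		return fmt.Errorf("job requires %d allocations, exceeding job_max_count (%d)",
			total, cfg.JobMaxCount)
	}
	return nil
}
```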
This change implements the client -> server workflow for Nomad
node introduction. A Nomad node can optionally be started with an
introduction token, which is a signed JWT containing claims for
the node registration. The server handles this according to the
enforcement configuration.
The introduction token can be provided by environment variable, CLI
flag, or by placing it within a default filesystem location. The latter
option does not override the CLI flag or environment variable.
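A sketch of that precedence, under the assumption of hypothetical flag, environment variable, and file path names:

```go
package agent

import (
	"errors"
	"os"
	"strings"
)

// resolveIntroToken applies the precedence described above: an explicit CLI
// flag value wins, then the environment, then a default file on disk. The
// names NOMAD_CLIENT_INTRO_TOKEN and /etc/nomad.d/intro.jwt are hypothetical.
func resolveIntroToken(flagValue string) (string, error) {
	if flagValue != "" {
		return flagValue, nil
	}
	if v := os.Getenv("NOMAD_CLIENT_INTRO_TOKEN"); v != "" {
		return v, nil
	}
	b, err := os.ReadFile("/etc/nomad.d/intro.jwt")
	if errors.Is(err, os.ErrNotExist) {
		return "", nil // no token supplied; server enforcement decides the outcome
	}
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(string(b)), nil
}
```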
The region claim has been removed from the initial claims set of
the intro identity. This boundary is guarded by mTLS and aligns
with the node identity.
The new configuration block exposes some key options which allow
cluster administrators to control certain client introduction
behaviours.
This change introduces the new block and plumbing, so that it is
exposed in the Nomad server for consumption via internal processes.
When a Nomad client registers or re-registers, the RPC handler will
generate and return a node identity if required. When an identity
is generated, the signing key ID will be stored within the node
object, to ensure a root key is not deleted while it is still in use.
During normal operation, the client periodically heartbeats to
the Nomad servers to indicate liveness. The RPC handler that
is used for this action has also been updated to conditionally
perform identity generation. Performing it here means no extra RPC
handlers are required and we inherit the jitter in identity
generation from the heartbeat mechanism.
The identity generation check methods are performed from the RPC
request arguments, so they are scoped to the required behaviour and
can handle the nuance of each RPC. Failure to generate an identity
is considered terminal to the RPC call. The client includes
behaviour to retry this error, which, unless the server's keyring
has been corrupted, is always caused by the encrypter not yet being
ready.
When a test starts an agent with the client enabled, we can wait
for the client to reach the ready state within the setup method.
This mimics what we already do with leadership and the root
keyring and should reduce flaky tests that assume the client is
ready as soon as the setup function returns, which is not
guaranteed.
The change exposed a couple of TLS reload tests which were not
using the test agent correctly. They were setting up a client even
though it would never be able to join the cluster due to TLS
configuration issues. These have been fixed.
The `server.num_schedulers` configuration value should be a value
between 0 and the number of CPUs on the machine. The Nomad agent
was not validating this configuration parameter, which meant you
could use a negative value or a value much larger than the number
of available CPUs. This change enforces validation of the
configuration value both on server startup and when the agent is
reloaded.
The Nomad API was only performing negative value validation when
updating the scheduler number via this method. This change extends
the validation to ensure the number is not greater than the number
of CPUs on the machine.
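A sketch of the shared bounds check (the function name is illustrative, not the actual implementation):

```go
package agent

import (
	"fmt"
	"runtime"
)

// validateNumSchedulers enforces 0 <= n <= NumCPU, illustrating the check
// applied at server startup, on agent reload, and in the API handler.
func validateNumSchedulers(n int) error {
	if maxSchedulers := runtime.NumCPU(); n < 0 || n > maxSchedulers {
		return fmt.Errorf("num_schedulers must be between 0 and %d, got %d",
			maxSchedulers, n)
	}
	return nil
}
```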
The legacy workflow for Vault whereby servers were configured
using a token to provide authentication to the Vault API has now
been removed. This change also removes the workflow where servers
were responsible for deriving Vault tokens for Nomad clients.
The deprecated Vault config options used by the Nomad agent have
all been removed, except for "token", which is still in use by the
Vault Transit keyring implementation.
Job specification authors can no longer use the "vault.policies"
parameter and should instead use "vault.role" when not using the
default workload identity.
---------
Co-authored-by: Tim Gross <tgross@hashicorp.com>
Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>
The Nomad client can now optionally emit telemetry data from the
prerun and prestart hooks. This allows operators to monitor and
alert on failures and on the time hooks take to complete.
The new datapoints are:
- nomad.client.alloc_hook.prerun.success (counter)
- nomad.client.alloc_hook.prerun.failed (counter)
- nomad.client.alloc_hook.prerun.elapsed (sample)
- nomad.client.task_hook.prestart.success (counter)
- nomad.client.task_hook.prestart.failed (counter)
- nomad.client.task_hook.prestart.elapsed (sample)
The hook execution time is useful to Nomad engineering and will
help optimize code where possible and understand the impact of job
specifications on hook performance.
Currently only the PreRun and PreStart hooks have telemetry
enabled, which limits the number of new metrics being produced.
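As a rough sketch of the emission pattern, using the go-metrics library Nomad already depends on (the `runPrerunHook` wrapper is illustrative, not the actual hook runner; the leading "nomad." segment comes from the agent's configured metrics service name):

```go
package client

import (
	"time"

	metrics "github.com/hashicorp/go-metrics"
)

// runPrerunHook shows where the counters and the elapsed-time sample would
// be emitted around a single hook invocation.
func runPrerunHook(hook func() error) error {
	start := time.Now()
	err := hook()
	// nomad.client.alloc_hook.prerun.elapsed (sample)
	metrics.MeasureSince([]string{"client", "alloc_hook", "prerun", "elapsed"}, start)
	if err != nil {
		// nomad.client.alloc_hook.prerun.failed (counter)
		metrics.IncrCounter([]string{"client", "alloc_hook", "prerun", "failed"}, 1)
		return err
	}
	// nomad.client.alloc_hook.prerun.success (counter)
	metrics.IncrCounter([]string{"client", "alloc_hook", "prerun", "success"}, 1)
	return nil
}
```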
The `nomad tls cert` command did not create certificates with the correct SANs for
them to work with non-default domain and region names. This changeset updates the
code to support non-default domains and regions in the certificates.
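For illustration, a certificate template whose SANs cover a non-default region and domain (the region and domain values below are examples; Nomad's default server SAN is `server.global.nomad`):

```go
package tlsutil

import (
	"crypto/x509"
	"math/big"
	"net"
)

// serverCertTemplate builds an x509 template whose DNS SANs cover the given
// region and domain, e.g. "server.eu1.cluster.internal" rather than the
// default "server.global.nomad". Example values only.
func serverCertTemplate(region, domain string) *x509.Certificate {
	return &x509.Certificate{
		SerialNumber: big.NewInt(1),
		DNSNames: []string{
			"server." + region + "." + domain,
			"localhost",
		},
		IPAddresses: []net.IP{net.ParseIP("127.0.0.1")},
	}
}
```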
* api: enable support for setting original source alongside job
This PR adds support for setting job source material along with
the registration of a job.
This includes a new HTTP endpoint and a new RPC endpoint for
making queries for the original source of a job. The
HTTP endpoint is /v1/job/<id>/submission?version=<version> and
the RPC method is Job.GetJobSubmission.
The job source (if submitted; doing so is always optional) is
stored in the job_submission memdb table, separately from the
actual job. This way we do not incur the overhead of reading the
large string field during normal job operations.
The server config now includes job_max_source_size for configuring
the maximum size the job source may be, before the server simply
drops the source material. This should help prevent Bad Things from
happening when huge jobs are submitted. If the value is set to 0,
all job source material will be dropped.
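A sketch of the size guard, with illustrative names (this is not the actual helper):

```go
package state

// maybeStoreSubmission mirrors the behaviour described above: oversized
// source, or any source when the limit is 0, is silently dropped while job
// registration itself still succeeds.
func maybeStoreSubmission(maxSourceSize int, source string) (stored string, ok bool) {
	if maxSourceSize == 0 || len(source) > maxSourceSize {
		return "", false // drop the source material
	}
	return source, true
}
```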
* api: avoid writing var content to disk for parsing
* api: move submission validation into RPC layer
* api: return an error if updating a job submission without namespace or job id
* api: be exact about the job index we associate a submission with (modify)
* api: reword api docs scheduling
* api: prune all but the last 6 job submissions
* api: protect against nil job submission in job validation
* api: set max job source size in test server
* api: fixups from pr
In #15417 we added a new `Authenticate` method to the server that returns an
`AuthenticatedIdentity` struct. This changeset implements this method for a
small number of RPC endpoints that together represent all the various ways in
which RPCs are sent, so that we can validate that we're happy with this
approach.
* Add config elements
* Wire in snapshot configuration to raft
* Add hot reload of raft config
* Add documentation for new raft settings
* Add changelog
The `namespace` field was not included in the equality check between old and new
Vault configurations, which meant that a Vault config change that only changed
the namespace would not be detected as a change and the clients would not be
reloaded.
Also, the comparison for boolean fields such as `enabled` and
`allow_unauthenticated` was on the pointer and not the value of that pointer,
which resulted in spurious reloads during real config reloads, something easily
missed in typical test scenarios.
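The bug in miniature (the struct is trimmed to the two fields discussed):

```go
package main

import "fmt"

type VaultConfig struct {
	Enabled   *bool
	Namespace string
}

func main() {
	t1, t2 := true, true
	a := &VaultConfig{Enabled: &t1, Namespace: "eng"}
	b := &VaultConfig{Enabled: &t2, Namespace: "eng"}

	// Buggy: compares the pointers, which differ even when the values match,
	// so every comparison reports a change and triggers a reload.
	fmt.Println(a.Enabled == b.Enabled) // false

	// Correct: compare the values behind the pointers (nil checks elided
	// here), and remember to include every field, Namespace included.
	fmt.Println(*a.Enabled == *b.Enabled && a.Namespace == b.Namespace) // true
}
```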
Includes a minor refactor of the order of fields for `Copy` and `Merge` to match
the struct fields in hopes it makes it harder to make this mistake in the
future, as well as additional test coverage.
Before this change, Client had 2 copies of the config object: config and configCopy. There was no guidance around which to use where (other than configCopy's comment to pass it to alloc runners); both were shared among goroutines and mutated in data-racy ways. At least at one point I think the idea was to have `config` be mutable and then grab a lock to overwrite `configCopy`'s pointer atomically. This would have allowed alloc runners to read their config copies in data-race-safe ways, but this isn't how the implementation actually worked.
This change takes the following approach to safely handling configs in the client:
1. `Client.config` is the only copy of the config and all access must go through the `Client.configLock` mutex
2. Since the mutex *only protects the config pointer itself and not fields inside the Config struct*, all config mutation must be done on a *copy* of the config, and then Client's config pointer is overwritten while the mutex is held. Alloc runners and other goroutines holding the old config pointer will not see config updates.
3. Deep copying is implemented on the Config struct to satisfy the previous approach. The TLS Keyloader is an exception because it has its own internal locking to support mutating in place. An unfortunate complication but one I couldn't find a way to untangle in a timely fashion.
4. To facilitate deep copying I made an *internally backward incompatible API change:* our `helper/funcs` used to turn containers (slices and maps) with 0 elements into nils. This probably saves a few memory allocations but makes it very easy to cause panics. Since my new config handling approach uses more copying, it became very difficult to ensure all code that used containers on configs could handle nils properly. Since this code has caused panics in the past, I fixed it: nil containers are copied as nil, but 0-element containers properly return a new 0-element container. No more "downgrading to nil!"
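Condensed, the pattern looks roughly like this (names simplified from the real client):

```go
package client

import "sync"

type Config struct {
	Servers []string
}

// copySlice preserves the nil vs. empty distinction described in point 4:
// nil copies as nil, a 0-element slice copies as a new 0-element slice.
func copySlice(s []string) []string {
	if s == nil {
		return nil
	}
	out := make([]string, len(s))
	copy(out, s)
	return out
}

func (c *Config) Copy() *Config {
	nc := *c
	nc.Servers = copySlice(c.Servers)
	return &nc
}

type Client struct {
	configLock sync.Mutex
	config     *Config // read and replaced only while configLock is held
}

// UpdateConfig mutates a deep copy and then swaps the pointer; goroutines
// holding the old pointer keep a consistent (if stale) view and never
// observe a partially-applied update.
func (c *Client) UpdateConfig(update func(*Config)) {
	c.configLock.Lock()
	defer c.configLock.Unlock()
	newCfg := c.config.Copy()
	update(newCfg)
	c.config = newCfg
}
```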
This test is a fairly trivial test of the agent RPC, but the test setup waits
for a short fixed window after the node starts to send the RPC. After looking at
detailed logs for recent test failures, it looks like the node registration for
the first node doesn't get a chance to happen before we make the RPC call. Use
`WaitForResultUntil` to give the test more time to run in slower test
environments, while allowing it to finish quickly if possible.
Plan rejections occur when the scheduler work and the leader plan
applier disagree on the feasibility of a plan. This may happen for valid
reasons: since Nomad does parallel scheduling, it is expected that
different workers will have a different state when computing placements.
As the final plan reaches the leader plan applier, it may no longer be
valid due to concurrent scheduling taking up the intended resources. In
these situations the plan applier will notify the worker that the plan
was rejected and that they should refresh their state before trying
again.
In some rare and unexpected circumstances it has been observed that
workers will repeatedly submit the same plan, even if they are always
rejected.
While the root cause is still unknown, this mitigation has been put in
place. The plan applier will now track the history of plan rejections
per client and include in the plan result a list of node IDs that should
be set as ineligible if the number of rejections in a given time window
crosses a certain threshold. The window size and threshold value can be
adjusted in the server configuration.
To avoid marking several nodes as ineligible at once, the operation is rate
limited to 5 nodes every 30 minutes, with an initial burst of 10 operations.
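One way to express that limit with golang.org/x/time/rate (whether Nomad uses this package here is an assumption, and the helper name is illustrative):

```go
package nomad

import (
	"time"

	"golang.org/x/time/rate"
)

// 5 tokens per 30 minutes (one every 6 minutes), with an initial burst of 10.
var ineligibilityLimiter = rate.NewLimiter(rate.Every(30*time.Minute/5), 10)

// maybeMarkIneligible applies the rate limit before emitting the update.
func maybeMarkIneligible(nodeID string, markIneligible func(string)) {
	if !ineligibilityLimiter.Allow() {
		return // over the limit: leave the node eligible for now
	}
	markIneligible(nodeID)
}
```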
* test: use `T.TempDir` to create temporary test directory
This commit replaces `ioutil.TempDir` with `t.TempDir` in tests. The
directory created by `t.TempDir` is automatically removed when the test
and all its subtests complete.
Prior to this commit, temporary directories created using `ioutil.TempDir`
needed to be removed manually by calling `os.RemoveAll`, which was omitted
in some tests. The error handling boilerplate, e.g.
    defer func() {
        if err := os.RemoveAll(dir); err != nil {
            t.Fatal(err)
        }
    }()
is also tedious, but `t.TempDir` handles this for us nicely.
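The replacement pattern is a one-liner; cleanup is registered automatically for the test and its subtests:

```go
package example

import (
	"os"
	"path/filepath"
	"testing"
)

func TestExample(t *testing.T) {
	dir := t.TempDir() // removed automatically when the test completes
	cfgPath := filepath.Join(dir, "config.hcl")
	if err := os.WriteFile(cfgPath, []byte(`datacenter = "dc1"`), 0o644); err != nil {
		t.Fatal(err)
	}
}
```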
Reference: https://pkg.go.dev/testing#T.TempDir
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
* test: fix TestLogmon_Start_restart on Windows
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
* test: fix failing TestConsul_Integration
t.TempDir fails to perform the cleanup properly because the folder is
still in use
testing.go:967: TempDir RemoveAll cleanup: unlinkat /tmp/TestConsul_Integration2837567823/002/191a6f1a-5371-cf7c-da38-220fe85d10e5/web/secrets: device or resource busy
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
- Making RPC Upgrade mode reloadable.
- Add suggestions from code review
- remove spurious comment
- switch to require(t,...) form for test.
- Add to changelog
FailoverHeartbeatTTL is the amount of time to wait after a server leader failure
before considering reallocating client tasks. This TTL should be fairly long as
the new server leader needs to rebuild the entire heartbeat map for the
cluster. In deployments with a small number of machines, the default TTL (5m)
may be unnecessarily long. Let's allow operators to configure this value in their
config files.
* remove event durability
temporarily removing go-memdb event durability until a new strategy is developed on how to best handle increased durability needs
* drop events table schema and state store methods
* fix neweventbuffer invocations
properly wire up durable event count
move newline responsibility
moves newline creation from NDJson to the http handler, json stream only encodes and sends now
ignore snapshot restore if broker is disabled
enable dev mode to access event stream without acl
use mapping instead of switch
use pointers for config sizes, remove unused ttl, simplify closed conn logic
This change updates tests to honor `BootstrapExpect` exclusively when
forming test clusters and removes test-only knobs, e.g.
`config.DevDisableBootstrap`.
Background:
Test cluster creation is fragile. Test servers don't follow the
BootstrapExpect route like production clusters. Instead they start as
single-node clusters and then get rejoined, which risks split brain
or other test flakiness.
The test framework exposes a few knobs (e.g.
`config.DevDisableBootstrap` and `config.Bootstrap`) that control
whether a server should bootstrap the cluster. These flags are
confusing and it's unclear when to use them: their usage in multi-node
clusters isn't properly documented. Furthermore, they have some bad
side-effects as they don't control the Raft library: if
`config.DevDisableBootstrap` is true, the test server may not
immediately attempt to bootstrap a cluster, but after an election
timeout (~50ms), Raft may force a leadership election, win it (with
only one vote), and cause a split brain.
The knobs are also confusing as Bootstrap is an overloaded term. In
BootstrapExpect, we refer to bootstrapping the cluster only after N
servers are connected. But in tests and the knobs above, it refers to
whether the server is a single node cluster and shouldn't wait for any
other server.
Changes:
This commit makes two changes:
First, it relies on `BootstrapExpect` instead of the `Bootstrap` and/or
`DevMode` flags. This change is relatively trivial.
Second, it introduces a `Bootstrapped` flag to track whether the
cluster is bootstrapped. This allows us to keep `BootstrapExpect`
immutable. Previously, the flag was a config value that got set to 0
after cluster bootstrap completed.
Introduce limits to prevent unauthorized users from exhausting all
ephemeral ports on agents:
* `{https,rpc}_handshake_timeout`
* `{http,rpc}_max_conns_per_client`
The handshake timeout closes connections that have not completed the TLS
handshake by the deadline (5s by default). For RPC connections this
timeout also separately applies to the first byte being read, so RPC
connections with TLS enabled have `rpc_handshake_timeout * 2` as their
effective deadline.
The connection limit per client prevents a single remote TCP peer from
exhausting all ephemeral ports. The default is 100, but can be lowered
to a minimum of 26. Since streaming RPC connections create a new TCP
connection (until MultiplexV2 is used), 20 connections are reserved for
Raft and non-streaming RPCs to prevent connection exhaustion due to
streaming RPCs.
All limits are configurable and may be disabled by setting them to `0`.
This also includes a fix that closes connections that attempt to create
TLS RPC connections recursively. While only users with valid mTLS
certificates could perform such an operation, it was added as a
safeguard to prevent programming errors before they could cause resource
exhaustion.
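A minimal sketch of per-client connection limiting at accept time (the real implementation differs; the type and method names here are illustrative):

```go
package agent

import (
	"net"
	"sync"
)

// perClientLimiter tracks open connections per remote IP. A max of 0 disables
// the limit, mirroring the "set to 0 to disable" behaviour described above.
type perClientLimiter struct {
	mu    sync.Mutex
	max   int
	conns map[string]int
}

// accept reports whether the connection is allowed; callers close it if not.
func (l *perClientLimiter) accept(conn net.Conn) bool {
	if l.max == 0 {
		return true
	}
	ip, _, err := net.SplitHostPort(conn.RemoteAddr().String())
	if err != nil {
		return false
	}
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.conns == nil {
		l.conns = make(map[string]int)
	}
	if l.conns[ip] >= l.max {
		return false // this peer is at its limit
	}
	l.conns[ip]++
	return true
}

// release must be called once for each accepted connection when it closes.
func (l *perClientLimiter) release(conn net.Conn) {
	ip, _, err := net.SplitHostPort(conn.RemoteAddr().String())
	if err != nil {
		return
	}
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.conns == nil {
		return
	}
	if l.conns[ip]--; l.conns[ip] <= 0 {
		delete(l.conns, ip)
	}
}
```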