nomad

mirror of https://github.com/kemko/nomad.git synced 2026-01-01 16:05:42 +03:00

Author	SHA1	Message	Date
Tim Gross	f00bff09f1	fix multiple overflow errors in exponential backoff (#18200) We use capped exponential backoff in several places in the code when handling failures. The code we've copy-and-pasted all over has a check to see if the backoff is greater than the limit, but this check happens after the bitshift and we always increment the number of attempts. This causes an overflow with a fairly small number of failures (ex. at one place I tested it occurs after only 24 iterations), resulting in a negative backoff which then never recovers. The backoff becomes a tight loop consuming resources and/or DoS'ing a Nomad RPC handler or an external API such as Vault. Note this doesn't occur in places where we cap the number of iterations so the loop breaks (usually to return an error), so long as the number of iterations is reasonable. Introduce a helper with a check on the cap before the bitshift to avoid overflow in all places this can occur. Fixes: #18199 Co-authored-by: stswidwinski <stan.swidwinski@gmail.com>	2023-08-15 14:38:18 -04:00
hashicorp-copywrite[bot]	2d35e32ec9	Update copyright file headers to BUSL-1.1	2023-08-10 17:27:15 -05:00
hashicorp-copywrite[bot]	f005448366	[COMPLIANCE] Add Copyright and License Headers	2023-04-10 15:36:59 +00:00
Jerome Gravel-Niquet	6789f7ad61	print the actual fingerprint error instead of an unrelated (and probably nil) error	2021-01-04 08:20:29 -05:00
Kris Hicks	85ed8ddd4f	Add gosimple linter (#9590)	2020-12-09 11:05:18 -08:00
Jerome Gravel-Niquet	66ddf62931	Don't ignore nil devices in plugin fingerprint Even if a plugin sends back an empty `[]device.DeviceGroup`, it's transformed to `nil` during the RPC. Our custom device plugin is returning empty `FingerprintResponse.Devices` very often. Our temporary fix is to send a dummy `DeviceGroup` if the slice is empty. This has the effect of never triggering the "first fingerprint" and therefore timing out after 50s. In turn, this made our node exceed its heartbeat grace period when restarting it, revoking all vault tokens for its allocations, causing a restart of all our allocations because the token couldn't be renewed. Removing the logic for `f.Devices == nil` does not appear to affect the functionality of the function.	2020-11-10 16:04:22 -05:00
Mahmood Ali	60dd7aecc9	nvidia: support disabling the nvidia plugin (#8353)	2020-07-21 10:11:16 -04:00
Michael Schurter	158c74887e	goimports until make check is happy	2019-01-23 06:27:14 -08:00
Michael Schurter	0d61ff0fb9	move pluginutils -> helper/pluginutils I wanted a different color bikeshed, so I get to paint it	2019-01-22 15:50:08 -08:00
Alex Dadgar	c19cd2e5cf	loader and singleton	2019-01-22 15:11:57 -08:00
Alex Dadgar	437f03d877	recover	2019-01-07 14:49:40 -08:00
Alex Dadgar	ed4f8eac6e	Add plugin API versioning to plugin loader and plugins	2018-12-18 16:48:00 -08:00
Alex Dadgar	ad4c26a1e3	review comments	2018-11-07 11:31:52 -08:00
Alex Dadgar	57f40c7e3e	Device manager Introduce a device manager that manages the lifecycle of device plugins on the client. It fingerprints, collects stats, and forwards Reserve requests to the correct plugin. The manager also handles device plugin failures and validates their output.	2018-11-07 10:43:15 -08:00

14 Commits