53 Commits

Author SHA1 Message Date
Tim Gross
b25f1b66ce resources: allow job authors to configure size of secrets tmpfs (#23696)
On supported platforms, the secrets directory is a 1MiB tmpfs. But some tasks
need larger space for downloading large secrets. This is especially the case for
tasks using `templates`, which need extra room to write a temporary file to the
secrets directory that gets renamed to the old file atomically.

This changeset allows increasing the size of the tmpfs in the `resources`
block. Because this is a memory resource, we need to include it in the memory we
allocate for scheduling purposes. The task is already prevented from using more
memory in the tmpfs than the `resources.memory` field allows, but can bypass
that limit by writing to the tmpfs via `template` or `artifact` blocks.

Therefore, we need to account for the size of the tmpfs in the allocation
resources. Simply adding it to the memory needed when we create the allocation
allows it to be accounted for in all downstream consumers, and then we'll
subtract that amount from the memory resources just before configuring the task
driver.

For backwards compatibility, the default value of 1MiB is "free" and ignored by
the scheduler. Otherwise we'd be increasing the allocated resources for every
existing alloc, which could cause problems across upgrades. If a user explicitly
sets `resources.secrets = 1` it will no longer be free.

Fixes: https://github.com/hashicorp/nomad/issues/2481
Ref: https://hashicorp.atlassian.net/browse/NET-10070
2024-08-05 16:06:58 -04:00
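The accounting described in the commit above can be sketched roughly as follows. This is a minimal illustration, not Nomad's actual code: the `Resources` struct, its field names, and the helper functions are all hypothetical, and a value of 0 is assumed to mean "not set by the job author".

```go
package main

import "fmt"

// defaultSecretsMB is the 1MiB default tmpfs size mentioned in the commit message.
const defaultSecretsMB = 1

// Resources is a hypothetical, simplified stand-in for a task's resources block.
type Resources struct {
	MemoryMB  int
	SecretsMB int // 0 is assumed to mean "not set by the job author"
}

// tmpfsSizeMB returns the size the secrets tmpfs would be mounted with.
func tmpfsSizeMB(r Resources) int {
	if r.SecretsMB == 0 {
		return defaultSecretsMB
	}
	return r.SecretsMB
}

// scheduledMemoryMB is the memory the scheduler accounts for. An unset
// secrets value adds nothing (the default is "free"); an explicit value,
// even 1, is added on top of the task memory.
func scheduledMemoryMB(r Resources) int {
	return r.MemoryMB + r.SecretsMB
}

// driverMemoryMB subtracts the secrets amount again just before the task
// driver is configured, so the task's own memory limit is unchanged.
func driverMemoryMB(scheduledMB int, r Resources) int {
	return scheduledMB - r.SecretsMB
}

func main() {
	cases := []Resources{
		{MemoryMB: 256},                // default tmpfs, free
		{MemoryMB: 256, SecretsMB: 1},  // explicit 1MiB, no longer free
		{MemoryMB: 256, SecretsMB: 64}, // larger tmpfs
	}
	for _, r := range cases {
		sched := scheduledMemoryMB(r)
		fmt.Printf("secrets=%d tmpfs=%dMiB scheduled=%dMiB driver=%dMiB\n",
			r.SecretsMB, tmpfsSizeMB(r), sched, driverMemoryMB(sched, r))
	}
}
```

Adding the amount once, when the allocation is created, means every downstream consumer sees the larger figure without needing to know about the tmpfs.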
Luiz Aoqui
9d4f7bcb68 mock_driver: fix fingerprint key (#20351)
The `mock_driver` is an internal task driver used mostly for testing and
simulating workloads. During the allocrunner v2 work (#4792) its name
changed from `mock_driver` to just `mock` and then back to
`mock_driver`, but the fingerprint key was kept as `driver.mock`.

This results in tasks configured with `driver = "mock"` being scheduled
(because Nomad thinks the client has a task driver called `mock`), but
failing to actually run (because the Nomad client can't find a driver
called `mock` in its catalog).

Fingerprinting the right name prevents the job from being scheduled in
the first place.

Also removes mentions of the mock driver from documentation since it's an
internal driver and not available in any production release.
2024-04-16 07:16:55 +01:00
Seth Hoenig
05937ab75b exec2: add client support for unveil filesystem isolation mode (#20115)
* exec2: add client support for unveil filesystem isolation mode

This PR adds support for a new filesystem isolation mode, "Unveil". The
mode introduces an "alloc_mounts" directory where tasks have a user-owned
directory structure made up of bind mounts into the real alloc directory
structure. This enables a task driver to use Landlock (and maybe the
real unveil on OpenBSD one day) to isolate a task to the task-owned
directory structure, providing sandboxing.

* actually create alloc-mounts-dir directory

* fix doc strings about alloc mount dir paths
2024-03-13 08:24:17 -05:00
hashicorp-copywrite[bot]
2d35e32ec9 Update copyright file headers to BUSL-1.1 2023-08-10 17:27:15 -05:00
Tim Gross
88323bab4a allocrunner: provide factory function so we can build mock ARs (#17161)
Tools like `nomad-nodesim` are unable to provide a minimal implementation of
an allocrunner so that we can test the client communication without having to
lug around the entire allocrunner/taskrunner code base. The allocrunner was
implemented with an interface specifically for this purpose, but there were
circular imports that made it challenging to use in practice.

Move the AllocRunner interface into an inner package and provide a factory
function type. Provide a minimal test that exercises the new function so that
consumers have some idea of what the minimum implementation required is.
2023-05-12 13:29:44 -04:00
Tim Gross
c107b5fd21 testing: improve fidelity of mock driver task restore (#16990)
While working on client status update improvements, I encountered problems
getting tests with the mock driver to correctly restore.

Unlike typical drivers the mock driver doesn't have an external source of truth
for whether the task is running (ex. making API calls to `dockerd` or looking
for a running PID), and so in order to make up that information, it re-parses
the original task config. But the taskrunner doesn't call the encoding step for
`RecoverTask`, only `StartTask`, so the task config the mock driver gets is
missing data.

Update the mock driver to stash the "external" state in the task state that
we'll get from the task runner, so that we don't have to try to recover from the
original `TaskConfig` anymore. This should bring the mock driver closer to the
behavior of the other drivers.
2023-04-27 11:54:10 -04:00
hashicorp-copywrite[bot]
f005448366 [COMPLIANCE] Add Copyright and License Headers 2023-04-10 15:36:59 +00:00
Lance Haig
3160c76209 deps: Update ioutil library references to os and io respectively for drivers package (#16331)
* Update ioutil library references to os and io respectively for drivers package

No user facing changes so I assume no change log is required

* Fix failing tests
2023-03-08 10:31:09 -06:00
James Rasell
80b4eeaba5 Merge branch 'main' into tlefebvre/fix-wrong-drivernetworkmanager-interface 2022-03-17 09:38:13 +01:00
Thomas Lefebvre
4c9f476d32 fix: update incorrect DriverNetworkManager interface implementation in plugins/drivers/client.go and drivers/mock/driver.go
And add assertions to catch drifts at compilation time.
2022-03-15 11:51:01 -07:00
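The compile-time assertions mentioned in the commit above usually take the form of a blank-identifier variable declaration. The sketch below shows the idiom with a hypothetical `NetworkManager` interface and `mockDriver` type; these are simplified stand-ins, not the real `DriverNetworkManager` signatures.

```go
package main

import "fmt"

// NetworkManager is a hypothetical, simplified stand-in for an interface
// that several driver types are expected to implement.
type NetworkManager interface {
	CreateNetwork(allocID string) (string, error)
	DestroyNetwork(allocID string) error
}

// mockDriver is a hypothetical implementation.
type mockDriver struct{}

func (d *mockDriver) CreateNetwork(allocID string) (string, error) {
	return "mock-" + allocID, nil
}

func (d *mockDriver) DestroyNetwork(allocID string) error {
	return nil
}

// Compile-time assertion: if mockDriver's method set ever drifts from
// NetworkManager, this line fails to compile instead of failing at runtime.
var _ NetworkManager = (*mockDriver)(nil)

func main() {
	var nm NetworkManager = &mockDriver{}
	name, err := nm.CreateNetwork("alloc1")
	fmt.Println(name, err)
}
```

The declaration allocates nothing at runtime; it only makes the compiler check the method set.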
Seth Hoenig
b242957990 ci: swap ci parallelization for unconstrained gomaxprocs 2022-03-15 12:58:52 -05:00
Kris Hicks
85ed8ddd4f Add gosimple linter (#9590) 2020-12-09 11:05:18 -08:00
Mahmood Ali
d6c75e301e cleanup driver eventor goroutines
This fixes a few cases where driver eventor goroutines are leaked during
normal operations, but especially so in tests.

This change makes a few modifications:

First, it switches drivers to use `Context`s to manage shutdown events.
Previously, it relied on callers invoking a `.Shutdown()` function that is
specific to internal drivers only and requires casting.  Using `Context`s
provides a consistent, idiomatic way to manage lifecycle for both internal
and external drivers.

Also, I discovered a few places where we don't clean up a temporary driver
instance in the plugin catalog code, where we dispense a driver to
inspect and validate the schema config without properly cleaning it up.
2020-05-26 11:04:04 -04:00
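The context-based shutdown described in the commit above follows a common Go pattern. The sketch below is an assumption-laden illustration: the `eventer` type, its fields, and `stopsOnCancel` are hypothetical and far simpler than the real driver code.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// eventer is a hypothetical, minimal event loop owned by a driver.
type eventer struct {
	events chan string
	done   chan struct{} // closed when the loop goroutine has exited
}

// newEventer starts the loop goroutine; its lifetime is tied to ctx, so
// callers need no driver-specific Shutdown method or type cast.
func newEventer(ctx context.Context) *eventer {
	e := &eventer{events: make(chan string), done: make(chan struct{})}
	go e.loop(ctx)
	return e
}

func (e *eventer) loop(ctx context.Context) {
	defer close(e.done)
	for {
		select {
		case <-ctx.Done():
			return // context cancelled: exit instead of leaking
		case ev := <-e.events:
			_ = ev // a real eventer would fan the event out to subscribers
		}
	}
}

// stopsOnCancel reports whether the loop goroutine exits after cancel.
func stopsOnCancel() bool {
	ctx, cancel := context.WithCancel(context.Background())
	e := newEventer(ctx)
	e.events <- "task started"
	cancel()
	select {
	case <-e.done:
		return true
	case <-time.After(time.Second):
		return false
	}
}

func main() {
	fmt.Println("goroutine stopped:", stopsOnCancel())
}
```

Because cancellation travels through the context, the same mechanism works whether the driver is compiled in or runs as an external plugin.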
Tim Gross
8860b72bc3 volumes: return better error messages for unsupported task drivers (#8030)
When an allocation runs for a task driver that can't support volume mounts,
the mounting will fail in a way that can be hard to understand. With host
volumes this usually means failing silently, whereas with CSI the operator
gets inscrutable internals exposed in the `nomad alloc status`.

This changeset adds a MountConfig field to the task driver Capabilities
response. We validate this when the `csi_hook` or `volume_hook` fires and
return a user-friendly error.

Note that we don't currently have a way to get driver capabilities up to the
server, except through attributes. Validating this when the user initially
submits the jobspec would be even better than what we're doing here (and could
be useful for all our other capabilities), but that's out of scope for this
changeset.

Also note that the MountConfig enum starts with "supports all" in order to
support community plugins in a backwards compatible way, rather than cutting
them off from volume mounting unexpectedly.
2020-05-21 09:18:02 -04:00
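The backwards-compatibility choice in the commit above relies on Go zero values: a plugin built before the field existed reports the zero value, so the zero value has to mean "supports all". A minimal sketch, with hypothetical type and constant names and a simplified `Capabilities` struct:

```go
package main

import "fmt"

// MountSupport is a hypothetical enum. The zero value means "supports all",
// so plugins that never set the field keep working with volume mounts.
type MountSupport int

const (
	MountSupportAll  MountSupport = iota // zero value, the compatible default
	MountSupportNone                     // driver explicitly opts out
)

// Capabilities is a simplified stand-in for a driver capabilities response.
type Capabilities struct {
	SendSignals bool
	MountConfig MountSupport
}

// validateVolumeMount is the kind of check a volume hook could run before
// mounting, returning a readable error instead of an opaque failure.
func validateVolumeMount(driver string, caps Capabilities) error {
	if caps.MountConfig == MountSupportNone {
		return fmt.Errorf("task driver %q does not support volume mounts", driver)
	}
	return nil
}

func main() {
	older := Capabilities{SendSignals: true} // field unset by an older plugin
	fmt.Println(validateVolumeMount("community", older))

	optedOut := Capabilities{MountConfig: MountSupportNone}
	fmt.Println(validateVolumeMount("nomount", optedOut))
}
```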
Nick Ethier
4a8a96fa1a ar: initial driver based network management 2019-07-31 01:03:17 -04:00
Michael Schurter
4b854cc557 drivers/mock: implement InspectTask 2019-05-14 10:53:27 -07:00
Mahmood Ali
cb4ad3fb45 drivers/mock: implement nomad exec interface 2019-05-09 16:49:08 -04:00
Mahmood Ali
e79ce1f9d0 drivers/mock: extract command related operations
Extract command parsing and execution mocking into a separate struct.  Also,
allow mocking of different fs_isolation for testing.
2019-04-30 14:02:16 -04:00
Mahmood Ali
714c41185c rename fifo methods for clarity 2019-04-01 16:52:58 -04:00
Michael Schurter
158c74887e goimports until make check is happy 2019-01-23 06:27:14 -08:00
Michael Schurter
0d61ff0fb9 move pluginutils -> helper/pluginutils
I wanted a different color bikeshed, so I get to paint it
2019-01-22 15:50:08 -08:00
Alex Dadgar
fe2fa21a7d gofmt 2019-01-22 15:43:34 -08:00
Alex Dadgar
c19cd2e5cf loader and singleton 2019-01-22 15:11:57 -08:00
Michael Schurter
418d360d19 test: port SignalFailure test from 0.8
Also fix signal error handling in mock_driver.
2019-01-22 08:08:08 -08:00
Nick Ethier
994c66f7d7 drivers: use consts for task handle version 2019-01-18 18:31:01 -05:00
Nick Ethier
07cdedec2f driver: add pre09 migration logic 2019-01-18 18:31:01 -05:00
Nick Ethier
fbf9a4c772 executor: implement streaming stats API
plugins/driver: update driver interface to support streaming stats

client/tr: use streaming stats api

TODO:
 * how to handle errors and closed channel during stats streaming
 * prevent tight loop if Stats(ctx) returns an error

drivers: update drivers TaskStats RPC to handle streaming results

executor: better error handling in stats rpc

docker: better control and error handling of stats rpc

driver: allow stats to return a recoverable error
2019-01-12 12:18:22 -05:00
Mahmood Ali
800a3522e3 drivers: re-export ResourceUsage structs
Re-export the ResourceUsage structs in the drivers package to avoid drivers
depending directly on the internal client/structs package.

I attempted moving the structs to drivers, but that caused some import
cycles that were a bit hard to disentangle.  Alternatively, I added an
alias here that's sufficient for our purposes of keeping external
drivers from depending on internal packages, while allowing us to restructure
packages in the future without breaking source compatibility.
2019-01-08 09:11:47 -05:00
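The alias approach in the commit above uses a Go type alias, which makes two names refer to the identical type. The single-file sketch below is only an approximation: in the real layout the two names would live in different packages, and `internalResourceUsage` and `summarize` are hypothetical.

```go
package main

import "fmt"

// internalResourceUsage stands in for a struct that lives in an internal
// package which external drivers should not import directly.
type internalResourceUsage struct {
	MemoryRSS  uint64
	CPUPercent float64
}

// ResourceUsage is a type alias (note the "="), not a new named type.
// Both names denote the same type, so no conversion is ever needed.
type ResourceUsage = internalResourceUsage

// summarize is written against the internal name.
func summarize(u *internalResourceUsage) string {
	return fmt.Sprintf("rss=%d cpu=%.1f%%", u.MemoryRSS, u.CPUPercent)
}

func main() {
	// A driver author only ever spells the re-exported name.
	usage := &ResourceUsage{MemoryRSS: 1024, CPUPercent: 12.5}
	fmt.Println(summarize(usage))
}
```

Since the alias is the same type, the underlying struct can later move to another package without breaking code that only uses the re-exported name.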
Mahmood Ali
c0162fab35 move cstructs.DeviceNetwork to drivers pkg 2019-01-08 09:11:47 -05:00
Mahmood Ali
694e3010c2 use drivers.FSIsolation 2019-01-08 09:11:47 -05:00
Danielle Tomlinson
476e44b4e4 drivers: Implement InternalPluginDriver interface
This implements the InternalPluginDriver interface in each driver, and
calls the cancellation fn for their respective eventers.

This fixes a per task goroutine leak during test suite execution.
2019-01-08 13:49:31 +01:00
Alex Dadgar
19e67a0916 Test recovery 2019-01-07 14:49:41 -08:00
Alex Dadgar
144866a87b Mock driver has recovery, stats 2019-01-07 14:49:40 -08:00
Preetha Appan
26594aa31e Standardize driver health description messages for all drivers 2019-01-06 22:06:38 -06:00
Alex Dadgar
ed4f8eac6e Add plugin API versioning to plugin loader and plugins 2018-12-18 16:48:00 -08:00
Preetha Appan
829bf74aa8 modify fingerprint interface to use typed attribute struct 2018-11-28 10:01:03 -06:00
Michael Schurter
43b359914b client: interpolate driver configurations
Also add missing SetDriverNetwork calls.
2018-11-15 16:25:57 -08:00
Mahmood Ali
851b275afc Merge pull request #4858 from hashicorp/b-fix-master-20181109
Fix some tests in master
2018-11-13 16:08:26 -05:00
Mahmood Ali
9d6a362b94 Use materialized duration fields for driver config 2018-11-13 10:21:40 -05:00
Mahmood Ali
416b5240f4 Handle time.Duration in mock
Mock driver config uses `time.Duration` fields but we initialize them
inconsistently, as time.Duration sometimes and as duration strings other
times.  Previously, `mapstructure` handled it and did the right thing.

This is no longer the case with MsgPack.  I could not find a good way to
bring back old behavior without too much complexity.  `MsgPack` extended
types weren't ideal here as we lose type information (e.g. int64 vs
string), and the input is a generic map and not a MsgPack serialization
of duration.

As such, I went with the simple solution of declaring the config field
as a duration string, and panicking if the test doesn't pass a valid
string.

I found this to cause the smallest change in tests, but we can
alternatively force all to be int64 instead.
2018-11-13 10:21:40 -05:00
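The "duration string, panic on invalid input" approach from the commit above could look roughly like this. The `Config` struct, its `RunFor` field, and `mustParseDuration` are hypothetical names chosen for illustration; only `time.ParseDuration` is a real standard-library call.

```go
package main

import (
	"fmt"
	"time"
)

// Config is a hypothetical mock-driver config whose duration field is
// declared as a string, so it decodes the same way from any generic map.
type Config struct {
	RunFor string
}

// mustParseDuration converts a config string to a time.Duration and panics
// on invalid input, which is acceptable for a test-only driver. An empty
// string is assumed to mean "not set" and yields zero.
func mustParseDuration(field, value string) time.Duration {
	if value == "" {
		return 0
	}
	d, err := time.ParseDuration(value)
	if err != nil {
		panic(fmt.Sprintf("invalid duration for %s: %q: %v", field, value, err))
	}
	return d
}

func main() {
	cfg := Config{RunFor: "500ms"}
	fmt.Println(mustParseDuration("run_for", cfg.RunFor))
}
```

Panicking surfaces a malformed test fixture immediately rather than letting the mock task silently run for zero time.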
Alex Dadgar
9d42f4d039 Plugin clients handle plugin dying
This PR plumbs the plugin's done ctx through the base and driver plugin
clients (device already had it). Further, it adds generic handling of
gRPC stream errors.
2018-11-12 17:09:27 -08:00
Michael Schurter
fdbe446ea6 client: first pass at implementing task restoring
Task restoring works but dead tasks may be restarted
2018-11-05 12:32:05 -08:00
Michael Schurter
e949416e12 drivers/mock: standardize names/code 2018-10-31 11:52:51 -07:00
Michael Schurter
55d79c6022 mock_driver: match other's fingerprint message 2018-10-30 17:38:23 -07:00
Michael Schurter
9fb39b35c6 drivers: remove stutter from exported driver names
Also fix a comment on the logger that got copy/pasted around.
2018-10-30 14:05:31 -07:00
Nick Ethier
da7563b8c3 Merge pull request #4795 from hashicorp/f-plugin-config
Pass client configuration to plugins through loader
2018-10-29 18:42:27 -07:00
Nick Ethier
95d381cff7 rename NomadConfig to ClientAgentConfig 2018-10-29 21:34:34 -04:00
Nick Ethier
3244a4cc57 plumb NomadConfig into plugins 2018-10-16 22:47:22 -04:00
Alex Dadgar
a10d3964d0 Do not use cty in drivers 2018-10-16 17:17:07 -07:00
Michael Schurter
9c4a1d4c28 drivers/mock: fix plugin name
Was mock_driver before plugins, so keep the name.
2018-10-16 16:56:56 -07:00