Commit Graph

74 Commits

Author SHA1 Message Date
Michael Smithhisler
1089b8893e secrets: refactor template providers to hold secrets in memory (#26506) 2025-09-05 16:08:23 -04:00
Michael Smithhisler
85a2875183 task: adds ability to interpret values from secrets hook (#26261) 2025-09-04 15:58:03 -04:00
Deniz Onur Duzgun
abd0efdd76 sec: remove non-hermetic sprig template functions (#25998)
* sec:add sprig template functions in denylists

* remove explicit set which is no longer needed

* go mod tidy

* add changelog

* better changelog and filtered denylist

* go mod tidy with 1.24.4

* edit changelog and remove htpasswd and derive

* fix tests

* Update client/allocrunner/taskrunner/template/template_test.go

Co-authored-by: Tim Gross <tgross@hashicorp.com>

* edit changelog

---------

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2025-06-09 13:00:47 -04:00
Michael Smithhisler
4c8257d0c7 client: add once mode to template block (#25922) 2025-05-28 11:45:11 -04:00
James Rasell
c0eccda4f7 template: Set any Consul token generated by workload identity. (#25309) 2025-03-07 14:32:02 +00:00
Michael Smithhisler
5c4d0e923d consul: Remove legacy token based authentication workflow (#25217) 2025-03-05 15:38:11 -05:00
Tim Gross
7b89c0ee28 template: fix client's default retry configuration (#25113)
In #20165 we fixed a bug where a partially configured `client.template` retry
block would set any unset fields to nil instead of their default values. But
this patch introduced a regression in the default values, so we were now
defaulting to unlimited retries if the retry block was unset. Restore the
correct behavior and add better test coverage at both the config parsing and
template configuration code.

Ref: https://github.com/hashicorp/nomad/pull/20165
Ref: https://github.com/hashicorp/nomad/issues/23305#issuecomment-2643731565
2025-02-14 09:25:41 -05:00
Michael Smithhisler
0714353324 fix: handle template re-renders on client restart (#24399)
When multiple templates with api functions are included in a task, it's
possible for consul-template to re-render templates as it creates
watchers, overwriting render event data. This change uses event fields
that do not get overwritten, and only executes the change mode for
templates that were actually written to disk.

---------

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2024-11-08 12:49:38 -05:00
Tim Gross
cc9227b858 template: fix panic in change_mode=script on client restart (#24057)
When we introduced change_mode=script to templates, we passed the driver handle
down into the template manager so we could call its `Exec` method directly. But
the lifecycle of the driver handle is managed by the taskrunner and isn't
available when the template manager is first created. This has led to a series
of patches trying to fixup the behavior (#15915, #15192, #23663, #23917). Part
of the challenge in getting this right is using an interface to avoid the
circular import of the driver handle.

But the taskrunner already has a way to deal with this problem using a "lazy
handle". The other template change modes already use this indirectly through the
`Lifecycle` interface. Change the driver handle `Exec` call in the template
manager to a new `Lifecycle.Exec` call that reuses the existing behavior. This
eliminates the need for the template manager to know anything at all about the
handle state.

Fixes: https://github.com/hashicorp/nomad/issues/24051
2024-09-25 08:59:01 -04:00
Tim Gross
07aca67108 template: lock task handle before trying script check (#23917)
In #23663 we fixed the template hook so that `change_mode="script"` didn't lose
track of the task handle during restores. But this revealed a second bug which
is that access to the handle is not locked while in use, which can allow it to
be removed concurrently.

Fixes: https://github.com/hashicorp/nomad/issues/23875
2024-09-12 08:41:06 -04:00
Tim Gross
b25f1b66ce resources: allow job authors to configure size of secrets tmpfs (#23696)
On supported platforms, the secrets directory is a 1MiB tmpfs. But some tasks
need larger space for downloading large secrets. This is especially the case for
tasks using `templates`, which need extra room to write a temporary file to the
secrets directory that gets renamed to the old file atomically.

This changeset allows increasing the size of the tmpfs in the `resources`
block. Because this is a memory resource, we need to include it in the memory we
allocate for scheduling purposes. The task is already prevented from using more
memory in the tmpfs than the `resources.memory` field allows, but can bypass
that limit by writing to the tmpfs via `template` or `artifact` blocks.

Therefore, we need to account for the size of the tmpfs in the allocation
resources. Simply adding it to the memory needed when we create the allocation
allows it to be accounted for in all downstream consumers, and then we'll
subtract that amount from the memory resources just before configuring the task
driver.

For backwards compatibility, the default value of 1MiB is "free" and ignored by
the scheduler. Otherwise we'd be increasing the allocated resources for every
existing alloc, which could cause problems across upgrades. If a user explicitly
sets `resources.secrets = 1` it will no longer be free.

Fixes: https://github.com/hashicorp/nomad/issues/2481
Ref: https://hashicorp.atlassian.net/browse/NET-10070
2024-08-05 16:06:58 -04:00
Tim Gross
c280891703 template: allow change_mode script to run after client restart (#23663)
For templates with `change_mode = "script"`, we set a driver handle in the
poststart method, so the template runner can execute the script inside the
task. But when the client is restarted and the template contents change during
that window, we trigger a change_mode in the prestart method. In that case, the
hook will not have the handle and so returns an errror trying to run the change
mode.

We restore the driver handle before we call any prestart hooks, so we can pass
that handle in the constructor whenever it's available. In the normal task start
case the handle will be empty but also won't be called.

The error messages are also misleading, as there's no capabilities check
happening here. Update the error messages to match.

Fixes: https://github.com/hashicorp/nomad/issues/15851
Ref: https://hashicorp.atlassian.net/browse/NET-9338
2024-07-24 08:29:39 -04:00
Piotr Kazmierczak
356ea87e00 template: disable sandboxed rendering on Windows (#23432)
Following #23443, we no longer need to sandbox template rendering on Windows.
2024-06-28 17:16:27 +02:00
Tim Gross
13617eee4b template: improve internal documentation around shutdown (#20134)
While investigating a report around possible consul-template shutdown issues,
which didn't bear fruit, I found that some of the logic around template runner
shutdown is unintuitive.

* Add some doc strings to the places where someone might think we should be
  obviously stopping the runner or returning early.
* Mark context argument for `Poststart`, `Stop`, and `Update` hooks as unused.

No functional code changes.
2024-03-14 15:33:32 -04:00
Seth Hoenig
05937ab75b exec2: add client support for unveil filesystem isolation mode (#20115)
* exec2: add client support for unveil filesystem isolation mode

This PR adds support for a new filesystem isolation mode, "Unveil". The
mode introduces a "alloc_mounts" directory where tasks have user-owned
directory structure which are bind mounts into the real alloc directory
structure. This enables a task driver to use landlock (and maybe the
real unveil on openbsd one day) to isolate a task to the task owned
directory structure, providing sandboxing.

* actually create alloc-mounts-dir directory

* fix doc strings about alloc mount dir paths
2024-03-13 08:24:17 -05:00
Tim Gross
df86503349 template: sandbox template rendering
The Nomad client renders templates in the same privileged process used for most
other client operations. During internal testing, we discovered that a malicious
task can create a symlink that can cause template rendering to read and write to
arbitrary files outside the allocation sandbox. Because the Nomad agent can be
restarted without restarting tasks, we can't simply check that the path is safe
at the time we write without encountering a time-of-check/time-of-use race.

To protect Nomad client hosts from this attack, we'll now read and write
templates in a subprocess:

* On Linux/Unix, this subprocess is sandboxed via chroot to the allocation
  directory. This requires that Nomad is running as a privileged process. A
  non-root Nomad agent will warn that it cannot sandbox the template renderer.

* On Windows, this process is sandboxed via a Windows AppContainer which has
  been granted access to only to the allocation directory. This does not require
  special privileges on Windows. (Creating symlinks in the first place can be
  prevented by running workloads as non-Administrator or
  non-ContainerAdministrator users.)

Both sandboxes cause encountered symlinks to be evaluated in the context of the
sandbox, which will result in a "file not found" or "access denied" error,
depending on the platform. This change will also require an update to
Consul-Template to allow callers to inject a custom `ReaderFunc` and
`RenderFunc`.

This design is intended as a workaround to allow us to fix this bug without
creating backwards compatibility issues for running tasks. A future version of
Nomad may introduce a read-only mount specifically for templates and artifacts
so that tasks cannot write into the same location that the Nomad agent is.

Fixes: https://github.com/hashicorp/nomad/issues/19888
Fixes: CVE-2024-1329
2024-02-08 10:40:24 -05:00
Tim Gross
334c383eb6 template: run template tests on Windows where possible (#19856)
We don't run the whole suite of unit tests on all platforms to keep CI times
reasonable, so the only things we've been running on Windows are
platform-specific.

I'm working on some platform-specific `template` related work and having these
tests run on Windows will reduce the risk of regressions. Our Windows CI box
doesn't have Consul or Vault, so I've skipped those tests for the time being,
and can follow up with that later. There's also a test with assertions looking
for specific paths, and the results are different on Windows. I've skipped those
for the moment as well and will follow up under a separate PR.

Also swap `testify` for `shoenig/test`
2024-02-02 09:22:03 -05:00
Tim Gross
50f0ce5412 config: remove old Vault/Consul config blocks from client (#18994)
Remove the now-unused original configuration blocks for Consul and Vault from
the client. When the client needs to refer to a Consul or Vault block it will
always be for a specific cluster for the task/service. Add a helper for
accessing the default clusters (for the client's own use).

This is two of three changesets for this work. The remainder will implement the
same changes in the `command/agent` package.

As part of this work I discovered and fixed two bugs:

* The gRPC proxy socket that we create for Envoy is only ever created using the
  default Consul cluster's configuration. This will prevent Connect from being
  used with the non-default cluster.
* The Consul configuration we use for templates always comes from the default
  Consul cluster's configuration, but will use the correct Consul token for the
  non-default cluster. This will prevent templates from being used with the
  non-default cluster.

Ref: https://github.com/hashicorp/nomad/issues/18947
Ref: https://github.com/hashicorp/nomad/pull/18991
Fixes: https://github.com/hashicorp/nomad/issues/18984
Fixes: https://github.com/hashicorp/nomad/issues/18983
2023-11-07 09:15:37 -05:00
Luiz Aoqui
8b9a5fde4e vault: add multi-cluster support on templates (#18790)
In Nomad Enterprise, a task may connect to a non-default Vault cluster,
requiring `consul-template` to be configured with a specific client
`vault` block.
2023-10-18 20:45:01 -04:00
Luiz Aoqui
349c032369 vault: update task runner vault hook to support workload identity (#18534) 2023-10-16 19:37:57 -04:00
Piotr Kazmierczak
299f3bf74b client: use WI-issued consul tokens in the template_hook (#18752)
ref https://github.com/hashicorp/team-nomad/issues/404
2023-10-16 09:39:20 +02:00
hashicorp-copywrite[bot]
2d35e32ec9 Update copyright file headers to BUSL-1.1 2023-08-10 17:27:15 -05:00
hashicorp-copywrite[bot]
f005448366 [COMPLIANCE] Add Copyright and License Headers 2023-04-10 15:36:59 +00:00
Seth Hoenig
77ea4e3239 users: eliminate LookupGroupId and its one use case (#16093)
This PR deletes the user.LookupGroupId function as it was only being used
in a single test case, and its value was not important to the test.
2023-02-08 14:57:09 -06:00
Charlie Voiselle
52a254ba22 template: error on missing key (#15141)
* Support error_on_missing_value for templates
* Update docs for template stanza
2022-11-04 13:23:01 -04:00
Luiz Aoqui
88b61cb5b4 template: apply splay value on change_mode script (#14749)
Previously, the splay timeout was only applied if a template re-render
caused a restart or a signal action. The `change_mode = "script"` was
running after the `if restart || len(signals) != 0` check, so it was
invoked at all times.

This change refactors the logic so it's easier to notice that new
`change_mode` options should start only after `splay` is applied.
2022-09-30 12:04:22 -04:00
Seth Hoenig
e4e5bc5cef client: protect user lookups with global lock (#14742)
* client: protect user lookups with global lock

This PR updates Nomad client to always do user lookups while holding
a global process lock. This is to prevent concurrency unsafe implementations
of NSS, but still enabling NSS lookups of users (i.e. cannot not use osusergo).

* cl: add cl
2022-09-29 09:30:13 -05:00
Michael Schurter
0eb711925e test: skip chown test if nonroot (#14738)
CI always runs this as root, so it worked there and always scared me
when I ran it locally.
2022-09-28 14:45:38 -07:00
Charlie Voiselle
015e4617b2 Vars: Update CT dependency to support variables. (#14399)
* Update Consul Template dep to support Nomad vars

* Remove `Peering` config for Consul Testservers
Upgrading to the 1.14 Consul SDK introduces and additional default
configuration—`Peering`—that is not compatible with versions of Consul
before v1.13.0. because Nomad tests against Consul v1.11.1, this
configuration has to be nil'ed out before passing it to the Consul
binary.
2022-08-30 15:26:01 -04:00
Tim Gross
13bc6d6d8a testing: setting env var incompatible with parallel tests (#14405)
Neither the `os.Setenv` nor `t.Setenv` helper are safe to use in parallel tests
because environment variables are process-global. The stdlib panics if you try
to do this. Remove the `ci.Parallel()` call from all tests where we're setting
environment variables.
2022-08-30 14:49:03 -04:00
Piotr Kazmierczak
34e4b080f6 template: custom change_mode scripts (#13972)
This PR adds the functionality of allowing custom scripts to be executed on template change. Resolves #2707
2022-08-24 17:43:01 +02:00
Luiz Aoqui
934bafb922 template: use pointer values for gid and uid (#14203)
When a Nomad agent starts and loads jobs that already existed in the
cluster, the default template uid and gid was being set to 0, since this
is the zero value for int. This caused these jobs to fail in
environments where it was not possible to use 0, such as in Windows
clients.

In order to differentiate between an explicit 0 and a template where
these properties were not set we need to use a pointer.
2022-08-22 16:25:49 -04:00
Piotr Kazmierczak
c4be2c6078 cleanup: replace TypeToPtr helper methods with pointer.Of (#14151)
Bumping compile time requirement to go 1.18 allows us to simplify our pointer helper methods.
2022-08-17 18:26:34 +02:00
Derek Strickland
696deb9600 Add Nomad RetryConfig to agent template config (#13907)
* add Nomad RetryConfig to agent template config
2022-08-03 16:56:30 -04:00
Piotr Kazmierczak
2e0b875b14 client: enable specifying user/group permissions in the template stanza (#13755)
* Adds Uid/Gid parameters to template.

* Updated diff_test

* fixed order

* update jobspec and api

* removed obsolete code

* helper functions for jobspec parse test

* updated documentation

* adjusted API jobs test.

* propagate uid/gid setting to job_endpoint

* adjusted job_endpoint tests

* making uid/gid into pointers

* refactor

* updated documentation

* updated documentation

* Update client/allocrunner/taskrunner/template/template_test.go

Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>

* Update website/content/api-docs/json-jobs.mdx

Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>

* propagating documentation change from Luiz

* formatting

* changelog entry

* changed changelog entry

Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
2022-08-02 22:15:38 +02:00
Tim Gross
d3e9b9ac7e workload identity (#13223)
In order to support implicit ACL policies for tasks to get their own
secrets, each task would need to have its own ACL token. This would
add extra raft overhead as well as new garbage collection jobs for
cleaning up task-specific ACL tokens. Instead, Nomad will create a
workload Identity Claim for each task.

An Identity Claim is a JSON Web Token (JWT) signed by the server’s
private key and attached to an Allocation at the time a plan is
applied. The encoded JWT can be submitted as the X-Nomad-Token header
to replace ACL token secret IDs for the RPCs that support identity
claims.

Whenever a key is is added to a server’s keyring, it will use the key
as the seed for a Ed25519 public-private private keypair. That keypair
will be used for signing the JWT and for verifying the JWT.

This implementation is a ruthlessly minimal approach to support the
secure variables feature. When a JWT is verified, the allocation ID
will be checked against the Nomad state store, and non-existent or
terminal allocation IDs will cause the validation to be rejected. This
is sufficient to support the secure variables feature at launch
without requiring implementation of a background process to renew
soon-to-expire tokens.
2022-07-11 13:34:05 -04:00
Eng Zer Jun
fca4ee8e05 test: use T.TempDir to create temporary test directory (#12853)
* test: use `T.TempDir` to create temporary test directory

This commit replaces `ioutil.TempDir` with `t.TempDir` in tests. The
directory created by `t.TempDir` is automatically removed when the test
and all its subtests complete.

Prior to this commit, temporary directory created using `ioutil.TempDir`
needs to be removed manually by calling `os.RemoveAll`, which is omitted
in some tests. The error handling boilerplate e.g.
	defer func() {
		if err := os.RemoveAll(dir); err != nil {
			t.Fatal(err)
		}
	}
is also tedious, but `t.TempDir` handles this for us nicely.

Reference: https://pkg.go.dev/testing#T.TempDir
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

* test: fix TestLogmon_Start_restart on Windows

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

* test: fix failing TestConsul_Integration

t.TempDir fails to perform the cleanup properly because the folder is
still in use

testing.go:967: TempDir RemoveAll cleanup: unlinkat /tmp/TestConsul_Integration2837567823/002/191a6f1a-5371-cf7c-da38-220fe85d10e5/web/secrets: device or resource busy

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2022-05-12 11:42:40 -04:00
James Rasell
9dc0b88cb5 client: add Nomad template service functionality to runner. (#12458)
This change modifies the template task runner to utilise the
new consul-template which includes Nomad service lookup template
funcs.

In order to provide security and auth to consul-template, we use
a custom HTTP dialer which is passed to consul-template when
setting up the runner. This method follows Vault implementation.

Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
2022-04-06 19:17:05 +02:00
Michael Schurter
3ca38ee4ed template: fix comments and docs
Review notes from @lgfa29

Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
2022-03-29 09:25:23 -07:00
Michael Schurter
f87ec7e64e template: disallow writeToFile by default
Resolves #12095 by WONTFIXing it.

This approach disables `writeToFile` as it allows arbitrary host
filesystem writes and is only a small quality of life improvement over
multiple `template` stanzas.

This approach has the significant downside of leaving people who have
altered their `template.function_denylist` *still vulnerable!* I added
an upgrade note, but we should have implemented the denylist as a
`map[string]bool` so that new funcs could be denied without overriding
custom configurations.

This PR also includes a bug fix that broke enabling all consul-template
funcs. We repeatedly failed to differentiate between a nil (unset)
denylist and an empty (allow all) one.
2022-03-28 17:05:42 -07:00
Seth Hoenig
b242957990 ci: swap ci parallelization for unconstrained gomaxprocs 2022-03-15 12:58:52 -05:00
Derek Strickland
43edd0e709 Expose Consul template configuration parameters (#11606)
This PR exposes the following existing`consul-template` configuration options to Nomad jobspec authors in the `{job.group.task.template}` stanza.

- `wait`

It also exposes the following`consul-template` configuration to Nomad operators in the `{client.template}` stanza.

- `max_stale`
- `block_query_wait`
- `consul_retry`
- `vault_retry` 
- `wait` 

Finally, it adds the following new Nomad-specific configuration to the `{client.template}` stanza that allows Operators to set bounds on what `jobspec` authors configure.

- `wait_bounds`

Co-authored-by: Tim Gross <tgross@hashicorp.com>
Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
2022-01-10 10:19:07 -05:00
Michael Schurter
37f053ff89 client: never embed alloc_dir in chroot
Fixes #2522

Skip embedding client.alloc_dir when building chroot. If a user
configures a Nomad client agent so that the chroot_env will embed the
client.alloc_dir, Nomad will happily infinitely recurse while building
the chroot until something horrible happens. The best case scenario is
the filesystem's path length limit is hit. The worst case scenario is
disk space is exhausted.

A bad agent configuration will look something like this:

```hcl
data_dir = "/tmp/nomad-badagent"

client {
  enabled = true

  chroot_env {
    # Note that the source matches the data_dir
    "/tmp/nomad-badagent" = "/ohno"
    # ...
  }
}
```

Note that `/ohno/client` (the state_dir) will still be created but not
`/ohno/alloc` (the alloc_dir).
While I cannot think of a good reason why someone would want to embed
Nomad's client (and possibly server) directories in chroots, there
should be no cause for harm. chroots are only built when Nomad runs as
root, and Nomad disables running exec jobs as root by default. Therefore
even if client state is copied into chroots, it will be inaccessible to
tasks.

Skipping the `data_dir` and `{client,server}.state_dir` is possible, but
this PR attempts to implement the minimum viable solution to reduce risk
of unintended side effects or bugs.

When running tests as root in a vm without the fix, the following error
occurs:

```
=== RUN   TestAllocDir_SkipAllocDir
    alloc_dir_test.go:520:
                Error Trace:    alloc_dir_test.go:520
                Error:          Received unexpected error:
                                Couldn't create destination file /tmp/TestAllocDir_SkipAllocDir1457747331/001/nomad/test/testtask/nomad/test/testtask/.../nomad/test/testtask/secrets/.nomad-mount: open /tmp/TestAllocDir_SkipAllocDir1457747331/001/nomad/test/.../testtask/secrets/.nomad-mount: file name too long
                Test:           TestAllocDir_SkipAllocDir
--- FAIL: TestAllocDir_SkipAllocDir (22.76s)
```

Also removed unused Copy methods on AllocDir and TaskDir structs.

Thanks to @eveld for not letting me forget about this!
2021-10-18 09:22:01 -07:00
Seth Hoenig
a97254fa20 consul: plubming for specifying consul namespace in job/group
This PR adds the common OSS changes for adding support for Consul Namespaces,
which is going to be a Nomad Enterprise feature. There is no new functionality
provided by this changeset and hopefully no new bugs.
2021-04-05 10:03:19 -06:00
Kris Hicks
2cd7136bc7 Fix some errcheck errors (#9811)
* Throw away result of multierror.Append

When given a *multierror.Error, it is mutated, therefore the return
value is not needed.

* Simplify MergeMultierrorWarnings, use StringBuilder

* Hash.Write() never returns an error

* Remove error that was always nil

* Remove error from Resources.Add signature

When this was originally written it could return an error, but that was
refactored away, and callers of it as of today never handle the error.

* Throw away results of io.Copy during Bridge

* Handle errors when computing node class in test
2021-01-14 12:46:35 -08:00
Chris Baker
6419966e88 enabled broken test that is no longer broken 2021-01-04 22:25:35 +00:00
Chris Baker
11f0fed935 update template and artifact interpolation to use client-relative paths
resolves #9839
resolves #6929
resolves #6910

e2e: template env interpolation path testing
2021-01-04 22:25:34 +00:00
Tim Gross
004f1c972f template: trigger change_mode for dynamic secrets on restore (#9636)
When a task is restored after a client restart, the template runner will
create a new lease for any dynamic secret (ex. Consul or PKI secrets
engines). But because this lease is being created in the prestart hook, we
don't trigger the `change_mode`.

This changeset uses the the existence of the task handle to detect a
previously running task that's been restored, so that we can trigger the
template `change_mode` if the template is changed, as it will be only with
dynamic secrets.
2020-12-16 13:36:19 -05:00
Michael Schurter
5b83ca0b5d client: skip broken test and fix assertion 2020-11-18 10:01:02 -08:00
Michael Schurter
cd7226d398 client: fix interpolation in template source
While Nomad v0.12.8 fixed `NOMAD_{ALLOC,TASK,SECRETS}_DIR` use in
`template.destination`, interpolating these variables in
`template.source` caused a path escape error.

**Why not apply the destination fix to source?**

The destination fix forces destination to always be relative to the task
directory. This makes sense for the destination as a destination outside
the task directory would be unreachable by the task. There's no reason
to ever render a template outside the task directory. (Using `..` does
allow destinations to escape the task directory if
`template.disable_file_sandbox = true`. That's just awkward and unsafe
enough I hope no one uses it.)

There is a reason to source a template outside a task
directory. At least if there weren't then I can't think of why we
implemented `template.disable_file_sandbox`. So v0.12.8 left the
behavior of `template.source` the more straightforward "Interpolate and
validate."

However, since outside of `raw_exec` every other driver uses absolute
paths for `NOMAD_*_DIR` interpolation, this means those variables are
unusable unless `disable_file_sandbox` is set.

**The Fix**

The variables are now interpolated as relative paths *only for the
purpose of rendering templates.* This is an unfortunate special case,
but reflects the fact that the templates view of the filesystem is
completely different (unconstrainted) vs the task's view (chrooted).
Arguably the values of these variables *should be context-specific.*
I think it's more reasonable to think of the "hack" as templating
running uncontainerized than that giving templates different paths is a
hack.

**TODO**

- [ ] E2E tests
- [ ] Job validation may still be broken and prevent my fix from
      working?

**raw_exec**

`raw_exec` is actually broken _a different way_ as exercised by tests in
this commit. I think we should probably remove these tests and fix that
in a followup PR/release, but I wanted to leave them in for the initial
review and discussion. Since non-containerized source paths are broken
anyway, perhaps there's another solution to this entire problem I'm
overlooking?
2020-11-17 22:03:04 -08:00