Commit Graph

335 Commits

Author SHA1 Message Date
Seth Hoenig
1ff1132288 services: Set Nomad's User-Agent by default on HTTP checks for nomad services (#16248) 2023-02-23 08:10:42 -06:00
Charlie Voiselle
d0f9008007 Add warnings to var put for non-alphanumeric keys. (#15933)
* Warn when Items key isn't directly accessible

Go template requires that map keys are alphanumeric for direct access
using the dotted reference syntax. This warns users when they create
keys that run afoul of this requirement.

- cli: use regex to detect invalid indentifiers in var keys
- test: fix slash in escape test case
- api: share warning formatting function between API and CLI
- ui: warn if var key has characters other than _, letter, or number

---------
Co-authored-by: Charlie Voiselle <464492+angrycub@users.noreply.github.com>
Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
2023-02-13 16:14:59 -05:00
Seth Hoenig
87e7ea3609 users: create cache for user lookups (#16100)
* users: create cache for user lookups

This PR introduces a global cache for OS user lookups. This should
relieve pressure on the OS domain/directory lookups, which would be
queried more now that Task API exists.

Hits are cached for 1 hour, and misses are cached for 1 minute. These
values are fairly arbitrary - we can tweak them if there is any reason to.

Closes #16010

* users: delete expired negative entry from cache
2023-02-09 08:37:50 -06:00
Seth Hoenig
77ea4e3239 users: eliminate LookupGroupId and its one use case (#16093)
This PR deletes the user.LookupGroupId function as it was only being used
in a single test case, and its value was not important to the test.
2023-02-08 14:57:09 -06:00
Michael Schurter
9bab96ebd3 Task API via Unix Domain Socket (#15864)
This change introduces the Task API: a portable way for tasks to access Nomad's HTTP API. This particular implementation uses a Unix Domain Socket and, unlike the agent's HTTP API, always requires authentication even if ACLs are disabled.

This PR contains the core feature and tests but followup work is required for the following TODO items:

- Docs - might do in a followup since dynamic node metadata / task api / workload id all need to interlink
- Unit tests for auth middleware
- Caching for auth middleware
- Rate limiting on negative lookups for auth middleware

---------

Co-authored-by: Seth Hoenig <shoenig@duck.com>
2023-02-06 11:31:22 -08:00
Charlie Voiselle
fe4ff5be2a Add option to expose workload token to task (#15755)
Add `identity` jobspec block to expose workload identity tokens to tasks.

---------

Co-authored-by: Anders <mail@anars.dk>
Co-authored-by: Tim Gross <tgross@hashicorp.com>
Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
2023-02-02 10:59:14 -08:00
Seth Hoenig
f05aa6d5ec vault: configure user agent on Nomad vault clients (#15745)
* vault: configure user agent on Nomad vault clients

This PR attempts to set the User-Agent header on each Vault API client
created by Nomad. Still need to figure a way to set User-Agent on the
Vault client created internally by consul-template.

* vault: fixup find-and-replace gone awry
2023-01-10 10:39:45 -06:00
Seth Hoenig
dab4d7ed7a ci: swap freeport for portal in packages (#15661) 2023-01-03 11:25:20 -06:00
Piotr Kazmierczak
eacecb8285 acl: modify update endpoints behavior (#15580)
API and RPC endpoints for ACLAuthMethods and ACLBindingRules should allow users
to send incomplete objects in order to, e.g., update single fields. This PR
provides "merging" functionality for these endpoints.
2022-12-20 11:22:19 +01:00
James Rasell
7dbbf6bc58 acl: add binding rule object state schema and functionality. (#15511)
This change adds a new table that will store ACL binding rule
objects. The two indexes allow fast lookups by their ID, or by
which auth method they are linked to. Snapshot persist and
restore functionality ensures this table can be saved and
restored from snapshots.

In order to write and delete the object to state, new Raft messages
have been added.

All RPC request and response structs, along with object functions
such as diff and canonicalize have been included within this work
as it is nicely separated from the other areas of work.
2022-12-14 08:48:18 +01:00
Seth Hoenig
6732761d91 pointer: add Merge helper function for merging pointers (#15499)
This PR adds Merge() helper function for choosing which value of two pointers
to use during a larger merge operation.

If 'next' is not nil, use that value, otherwise use the 'previous' value.
2022-12-08 11:09:22 -06:00
Seth Hoenig
cfc67c3422 client: sandbox go-getter subprocess with landlock (#15328)
* client: sandbox go-getter subprocess with landlock

This PR re-implements the getter package for artifact downloads as a subprocess.

Key changes include

On all platforms, run getter as a child process of the Nomad agent.
On Linux platforms running as root, run the child process as the nobody user.
On supporting Linux kernels, uses landlock for filesystem isolation (via go-landlock).
On all platforms, restrict environment variables of the child process to a static set.
notably TMP/TEMP now points within the allocation's task directory
kernel.landlock attribute is fingerprinted (version number or unavailable)
These changes make Nomad client more resilient against a faulty go-getter implementation that may panic, and more secure against bad actors attempting to use artifact downloads as a privilege escalation vector.

Adds new e2e/artifact suite for ensuring artifact downloading works.

TODO: Windows git test (need to modify the image, etc... followup PR)

* landlock: fixup items from cr

* cr: fixup tests and go.mod file
2022-12-07 16:02:25 -06:00
Tim Gross
47c2d4ab34 Pre forwarding authentication (#15417)
Upcoming work to instrument the rate of RPC requests by consumer (and eventually
rate limit) require that we authenticate a RPC request before forwarding. Add a
new top-level `Authenticate` method to the server and have it return an
`AuthenticatedIdentity` struct. RPC handlers will use the relevant fields of
this identity for performing authorization.

This changeset includes:
* The main implementation of `Authenticate`
* Provide a new RPC `ACL.WhoAmI` for debugging authentication. This endpoint
  returns the same `AuthenticatedIdentity` that will be used by RPC handlers. At
  some point we might want to give this an equivalent HTTP endpoint but I didn't
  want to add that to our public API until some of the other Workload Identity
  work is solidified, especially if we don't need it yet.
* A full coverage test of the `Authenticate` method. This sets up two server
  nodes with mTLS and ACLs, some tokens, and some allocations with workload
  identities.
* Wire up an example of using `Authenticate` in the `Namespace.Upsert` RPC and
  see how authorization happens after forwarding.
* A new semgrep rule for `Authenticate`, which we'll need to update once we're
  ready to wire up more RPC endpoints with authorization steps.
2022-12-06 14:44:03 -05:00
Lance Haig
8667dc2607 Add command "nomad tls" (#14296) 2022-11-22 14:12:07 -05:00
Seth Hoenig
5f3f52156e client: avoid unconsumed channel in timer construction (#15215)
* client: avoid unconsumed channel in timer construction

This PR fixes a bug introduced in #11983 where a Timer initialized with 0
duration causes an immediate tick, even if Reset is called before reading the
channel. The fix is to avoid doing that, instead creating a Timer with a non-zero
initial wait time, and then immediately calling Stop.

* pr: remove redundant stop
2022-11-11 09:31:34 -06:00
Piotr Kazmierczak
02253e6f19 acl: sso auth method schema and store functions (#15191)
This PR implements ACLAuthMethod type, acl_auth_methods table schema and crud state store methods. It also updates nomadSnapshot.Persist and nomadSnapshot.Restore methods in order for them to work with the new table, and adds two new Raft messages: ACLAuthMethodsUpsertRequestType and ACLAuthMethodsDeleteRequestType

This PR is part of the SSO work captured under ☂️ ticket #13120.
2022-11-10 19:42:41 +01:00
Seth Hoenig
9e9ddbdd3b helpers: lockfree lookup of nobody user on unix systems (#14866)
* helpers: lockfree lookup of nobody user on linux and darwin

This PR continues the nobody user lookup saga, by making the nobody
user lookup lock-free on linux and darwin.

By doing the lookup in an init block this originally broke on Windows,
where we must avoid doing the lookup at all. We can get around that
breakage by only doing the lookup on linux/darwin where the nobody
user is going to exist.

Also return the nobody user by value so that a copy is created that
cannot be modified by callers of Nobody().

* helper: move nobody code into unix file
2022-10-11 08:38:05 -05:00
Seth Hoenig
fed329883e cleanup: rename Equals to Equal for consistency (#14759) 2022-10-10 09:28:46 -05:00
Tim Gross
d3a55915f5 client: defer nobody user lookup so Windows doesn't panic (#14790)
In #14742 we introduced a cached lookup of the `nobody` user, which is only ever
called on Unixish machines. But the initial caching was being done in an `init`
block, which meant it was being run on Windows as well. This prevents the Nomad
agent from starting on Windows.

An alternative fix here would be to have a separate `init` block for Windows and
Unix, but this potentially masks incorrect behavior if we accidentally added a
call to the `Nobody()` method on Windows later. This way we're forced to handle
the error in the caller.
2022-10-04 11:52:12 -04:00
Seth Hoenig
e4e5bc5cef client: protect user lookups with global lock (#14742)
* client: protect user lookups with global lock

This PR updates Nomad client to always do user lookups while holding
a global process lock. This is to prevent concurrency unsafe implementations
of NSS, but still enabling NSS lookups of users (i.e. cannot not use osusergo).

* cl: add cl
2022-09-29 09:30:13 -05:00
Seth Hoenig
211ac8ec23 deps: update set and test (#14680)
This PR updates go-set and shoenig/test, which introduced some breaking
API changes.
2022-09-26 08:28:03 -05:00
Seth Hoenig
ff1a30fe8d cleanup more helper updates (#14638)
* cleanup: refactor MapStringStringSliceValueSet to be cleaner

* cleanup: replace SliceStringToSet with actual set

* cleanup: replace SliceStringSubset with real set

* cleanup: replace SliceStringContains with slices.Contains

* cleanup: remove unused function SliceStringHasPrefix

* cleanup: fixup StringHasPrefixInSlice doc string

* cleanup: refactor SliceSetDisjoint to use real set

* cleanup: replace CompareSliceSetString with SliceSetEq

* cleanup: replace CompareMapStringString with maps.Equal

* cleanup: replace CopyMapStringString with CopyMap

* cleanup: replace CopyMapStringInterface with CopyMap

* cleanup: fixup more CopyMapStringString and CopyMapStringInt

* cleanup: replace CopySliceString with slices.Clone

* cleanup: remove unused CopySliceInt

* cleanup: refactor CopyMapStringSliceString to be generic as CopyMapOfSlice

* cleanup: replace CopyMap with maps.Clone

* cleanup: run go mod tidy
2022-09-21 14:53:25 -05:00
Michael Schurter
2a9a361a35 2 small data race fixes in logmon and check tests (#14538)
* logmon: fix data race around oldestLogFileIdx

* checks: fix 2 data races in tests

* logmon: move & rename lock to logically group
2022-09-13 12:54:06 -07:00
Seth Hoenig
75b30c2210 helper: guard against negative inputs into random stagger
This PR modifies RandomStagger to protect against negative input
values. If the given interval is negative, the value returned will
be somewhere in the stratosphere. Instead, treat negative inputs
like zero, returning zero.
2022-09-08 09:17:48 -05:00
Tim Gross
b7fea76f7f keyring: wrap root key in key encryption key (#14388)
Update the on-disk format for the root key so that it's wrapped with a unique
per-key/per-server key encryption key. This is a bit of security theatre for the
current implementation, but it uses `go-kms-wrapping` as the interface for
wrapping the key. This provides a shim for future support of external KMS such
as cloud provider APIs or Vault transit encryption.

* Removes the JSON serialization extension we had on the `RootKey` struct; this
  struct is now only used for key replication and not for disk serialization, so
  we don't need this helper.

* Creates a helper for generating cryptographically random slices of bytes that
  properly accounts for short reads from the source.

* No observable functional changes outside of the on-disk format, so there are
  no test updates.
2022-08-30 10:59:25 -04:00
Seth Hoenig
9206952d1c Merge pull request #14290 from hashicorp/cleanup-more-helper-cleanup
cleanup: tidy up helper package some more
2022-08-30 08:19:48 -05:00
James Rasell
bf46203930 Merge branch 'main' into f-gh-13120-sso-umbrella-merged-main 2022-08-30 08:59:13 +01:00
Seth Hoenig
61fd9436ab Merge pull request #14143 from hashicorp/cleanup-slice-sets-3
cleanup: more cleanup of slices that are really sets
2022-08-29 13:52:59 -05:00
Seth Hoenig
5faa4e08e8 cleanup: cleanup more slice-set comparisons 2022-08-29 12:04:21 -05:00
Tim Gross
d1faead371 rename SecureVariables to Variables throughout 2022-08-26 16:06:24 -04:00
Seth Hoenig
7788f09567 cleanup: create pointer.Compare helper function
This PR creates a pointer.Compare helper for comparing equality of
two pointers. Strictly only works with primitive types we know are
safe to derefence and compare using '=='.
2022-08-26 08:55:59 -05:00
James Rasell
7b3bd1017d Merge branch 'main' into f-gh-13120-sso-umbrella-merged-main 2022-08-25 12:14:29 +01:00
Seth Hoenig
1b1a68e42f cleanup: move fs helpers into escapingfs 2022-08-24 14:45:34 -05:00
Seth Hoenig
7f5dfe4478 cleanup: remove more copies of min/max from helper 2022-08-24 09:56:15 -05:00
Michael Schurter
01648e615a client: fix data races in config handling (#14139)
Before this change, Client had 2 copies of the config object: config and configCopy. There was no guidance around which to use where (other than configCopy's comment to pass it to alloc runners), both are shared among goroutines and mutated in data racy ways. At least at one point I think the idea was to have `config` be mutable and then grab a lock to overwrite `configCopy`'s pointer atomically. This would have allowed alloc runners to read their config copies in data race safe ways, but this isn't how the current implementation worked.

This change takes the following approach to safely handling configs in the client:

1. `Client.config` is the only copy of the config and all access must go through the `Client.configLock` mutex
2. Since the mutex *only protects the config pointer itself and not fields inside the Config struct:* all config mutation must be done on a *copy* of the config, and then Client's config pointer is overwritten while the mutex is acquired. Alloc runners and other goroutines with the old config pointer will not see config updates.
3. Deep copying is implemented on the Config struct to satisfy the previous approach. The TLS Keyloader is an exception because it has its own internal locking to support mutating in place. An unfortunate complication but one I couldn't find a way to untangle in a timely fashion.
4. To facilitate deep copying I made an *internally backward incompatible API change:* our `helper/funcs` used to turn containers (slices and maps) with 0 elements into nils. This probably saves a few memory allocations but makes it very easy to cause panics. Since my new config handling approach uses more copying, it became very difficult to ensure all code that used containers on configs could handle nils properly. Since this code has caused panics in the past, I fixed it: nil containers are copied as nil, but 0-element containers properly return a new 0-element container. No more "downgrading to nil!"
2022-08-18 16:32:04 -07:00
Michael Schurter
ccae2bd021 rpc: fix race in conn last used tracking (#14173) 2022-08-17 14:57:53 -07:00
Piotr Kazmierczak
c4be2c6078 cleanup: replace TypeToPtr helper methods with pointer.Of (#14151)
Bumping compile time requirement to go 1.18 allows us to simplify our pointer helper methods.
2022-08-17 18:26:34 +02:00
James Rasell
99531595bb cli: add ability to create and view tokens with ACL role links. 2022-08-17 14:49:52 +01:00
Seth Hoenig
9586e3c00a cleanup: helper funcs for comparing slices of references 2022-08-16 13:47:47 -05:00
Seth Hoenig
0c62f445c3 build: run gofmt on all go source files
Go 1.19 will forecefully format all your doc strings. To get this
out of the way, here is one big commit with all the changes gofmt
wants to make.
2022-08-16 11:14:11 -05:00
Tim Gross
3af6937cf3 move secure variable conflict resolution to state store (#13922)
Move conflict resolution implementation into the state store with a new Apply RPC. 
This also makes the RPC for secure variables much more similar to Consul's KV, 
which will help us support soft deletes in a post-1.4.0 version of Nomad.

Reimplement quotas in the state store functions.

Co-authored-by: Charlie Voiselle <464492+angrycub@users.noreply.github.com>
2022-08-15 11:19:53 -04:00
James Rasell
d6a9c142ca core: add ACL role state schema and functionality. (#13955)
This commit includes the new state schema for ACL roles along with
state interaction functions for CRUD actions.

The change also includes snapshot persist and restore
functionality and the addition of FSM messages for Raft updates
which will come via RPC endpoints.
2022-08-09 09:33:41 +02:00
James Rasell
892ab8a07a Merge branch 'main' into f-gh-13120-sso-umbrella 2022-08-02 08:30:03 +01:00
Seth Hoenig
3b7f82e040 nsd: add support for specifying check.method in nomad service checks
Unblock 'check.method' in service validation. Add tests around making
sure this value gets plumbed through.
2022-08-01 16:13:48 -05:00
Seth Hoenig
4508af8160 Merge pull request #13715 from hashicorp/dev-nsd-checks
client: add support for checks in nomad services
2022-07-21 10:22:57 -05:00
Tim Gross
2fb328249c tests: add a space between node name and timestamp (#13750) 2022-07-13 16:23:03 -04:00
James Rasell
8981c5ae93 core: add ACL token expiry state, struct, and RPC handling. (#13718)
The ACL token state schema has been updated to utilise two new
indexes which track expiration of tokens that are configured with
an expiration TTL or time. A new state function allows listing
ACL expired tokens which will be used by internal garbage
collection.

The ACL endpoint has been modified so that all validation happens
within a single function call. This is easier to understand and
see at a glance. The ACL token validation now also includes logic
for expiry TTL and times. The ACL endpoint upsert tests have been
condensed into a single, table driven test.

There is a new token canonicalize which provides a single place
for token canonicalization, rather than logic spread in the RPC
handler.
2022-07-13 15:40:34 +02:00
Seth Hoenig
b2861f2a9b client: add support for checks in nomad services
This PR adds support for specifying checks in services registered to
the built-in nomad service provider.

Currently only HTTP and TCP checks are supported, though more types
could be added later.
2022-07-12 17:09:50 -05:00
Charlie Voiselle
15d6dde25c Provide mock secure variables implementation (#12980)
* Add SecureVariable mock
* Add SecureVariableStub
* Add SecureVariable Copy and Stub funcs
2022-07-11 13:34:03 -04:00
Tim Gross
9b1bea1bc1 secure variables: initial state store (#12932)
Implement the core SecureVariable and RootKey structs in memdb,
provide the minimal skeleton for FSM, and a dummy storage and keyring
RPC endpoint.
2022-07-11 13:34:01 -04:00