Commit Graph

2470 Commits

Author SHA1 Message Date
Drew Bailey
39a6c6374a Fix panic when monitoring a local client node
Fixes a panic when accessing a.agent.Server() when agent is a client
instead. This pr removes a redundant ACL check since ACLs are validated
at the RPC layer. It also nil checks the agent server and uses Client()
when appropriate.
2020-02-03 13:20:04 -05:00
Seth Hoenig
d24d470775 comments: cleanup some leftover debug comments and such 2020-01-31 19:04:35 -06:00
Seth Hoenig
ead935d12c agent: re-enable the server in dev mode 2020-01-31 19:04:19 -06:00
Seth Hoenig
9f48d83378 nomad: handle SI token revocations concurrently
Be able to revoke SI token accessors concurrently, and also
ratelimit the requests being made to Consul for the various
ACL API uses.
2020-01-31 19:04:14 -06:00
Seth Hoenig
d85cccc8d0 nomad: fixup token policy validation 2020-01-31 19:04:08 -06:00
Seth Hoenig
674ccaa122 nomad: proxy requests for Service Identity tokens between Clients and Consul
Nomad jobs may be configured with a TaskGroup which contains a Service
definition that is Consul Connect enabled. These service definitions end
up establishing a Consul Connect Proxy Task (e.g. envoy, by default). In
the case where Consul ACLs are enabled, a Service Identity token is required
for these tasks to run & connect, etc. This changeset enables the Nomad Server
to recieve RPC requests for the derivation of SI tokens on behalf of instances
of Consul Connect using Tasks. Those tokens are then relayed back to the
requesting Client, which then injects the tokens in the secrets directory of
the Task.
2020-01-31 19:03:53 -06:00
Seth Hoenig
0040c75e8e command, docs: create and document consul token configuration for connect acls (gh-6716)
This change provides an initial pass at setting up the configuration necessary to
enable use of Connect with Consul ACLs. Operators will be able to pass in a Consul
Token through `-consul-token` or `$CONSUL_TOKEN` in the `job run` and `job revert`
commands (similar to Vault tokens).

These values are not actually used yet in this changeset.
2020-01-31 19:02:53 -06:00
Michael Schurter
e3e1f5cb53 core: add limits to unauthorized connections
Introduce limits to prevent unauthorized users from exhausting all
ephemeral ports on agents:

 * `{https,rpc}_handshake_timeout`
 * `{http,rpc}_max_conns_per_client`

The handshake timeout closes connections that have not completed the TLS
handshake by the deadline (5s by default). For RPC connections this
timeout also separately applies to first byte being read so RPC
connections with TLS enabled have `rpc_handshake_time * 2` as their
deadline.

The connection limit per client prevents a single remote TCP peer from
exhausting all ephemeral ports. The default is 100, but can be lowered
to a minimum of 26. Since streaming RPC connections create a new TCP
connection (until MultiplexV2 is used), 20 connections are reserved for
Raft and non-streaming RPCs to prevent connection exhaustion due to
streaming RPCs.

All limits are configurable and may be disabled by setting them to `0`.

This also includes a fix that closes connections that attempt to create
TLS RPC connections recursively. While only users with valid mTLS
certificates could perform such an operation, it was added as a
safeguard to prevent programming errors before they could cause resource
exhaustion.
2020-01-30 10:38:25 -08:00
Mahmood Ali
b789b507d1 Merge pull request #6922 from hashicorp/b-alloc-canoncalize
Handle Upgrades and Alloc.TaskResources modification
2020-01-28 15:12:41 -05:00
Mahmood Ali
eb0acc3301 Merge pull request #6935 from hashicorp/b-default-preemption-flag
scheduler: allow configuring default preemption for system scheduler
2020-01-28 15:11:06 -05:00
Mahmood Ali
31025d6cac Support customizing full scheduler config 2020-01-28 14:51:42 -05:00
Nick Ethier
d6dd9b61ef consul: fix var name from rebase 2020-01-27 14:00:19 -05:00
Nick Ethier
330d24cb80 consul: fix var name from rebase 2020-01-27 12:55:52 -05:00
Nick Ethier
64f4e9e691 consul: add support for canary meta 2020-01-27 09:53:30 -05:00
Danielle
2121d57281 cli: add system command and subcmds to interact with system API. (#6924)
cli: add system command and subcmds to interact with system API.
2020-01-13 16:16:08 +01:00
Mahmood Ali
744c9a485d scheduler: allow configuring default preemption for system scheduler
Some operators want a greater control over when preemption is enabled,
especially during an upgrade to limit potential side-effects.
2020-01-13 08:30:49 -05:00
James Rasell
0ea8fdfa81 cli: add system command and subcmds to interact with system API.
The system command includes gc and reconcile-summaries subcommands
which covers all currently available system API calls. The help
information is largely pulled from the current Nomad website API
documentation.
2020-01-13 11:34:46 +01:00
Drew Bailey
a58b8a5e9c refactor api profile methods
comment why we ignore errors parsing params
2020-01-09 15:15:12 -05:00
Drew Bailey
ad86438fc0 adds qc param, address pr feedback 2020-01-09 15:15:11 -05:00
Drew Bailey
28265086fc condense table test 2020-01-09 15:15:10 -05:00
Drew Bailey
549045fcbb Rename profile package to pprof
Address pr feedback, rename profile package to pprof to more accurately
describe its purpose. Adds gc param for heap lookup profiles.
2020-01-09 15:15:10 -05:00
Drew Bailey
1776458956 address pr feedback 2020-01-09 15:15:09 -05:00
Drew Bailey
a3f73b3e06 leave acl checking to rpc endpoints
fix test expectation

test wrapNonJSON
2020-01-09 15:15:08 -05:00
Drew Bailey
db382d3195 provide helpful error, cleanup logic 2020-01-09 15:15:08 -05:00
Drew Bailey
11563dca1c prevent doubly wrapping with rpc error 2020-01-09 15:15:07 -05:00
Drew Bailey
d77b5add6c RPC server EnableDebug option
Passes in agent enable_debug config to nomad server and client configs.
This allows for rpc endpoints to have more granular control if they
should be enabled or not in combination with ACLs.

enable debug on client test
2020-01-09 15:15:07 -05:00
Drew Bailey
328075591f region forwarding; prevent recursive forwards for impossible requests
prevent region forwarding loop, backfill tests

fix failing test
2020-01-09 15:15:06 -05:00
Drew Bailey
390e22e421 move shared structs out of client and into nomad 2020-01-09 15:15:05 -05:00
Drew Bailey
57dc0c6a46 test pprof headers and profile methods
tidy up, add comments

clean up seconds param assignment
2020-01-09 15:15:04 -05:00
Drew Bailey
c28e5ad036 warn when enabled debug is on when registering
m -> a receiver name

return codederrors, fix query
2020-01-09 15:15:04 -05:00
Drew Bailey
d077cfe763 acl and debug test table
rename implementation method
2020-01-09 15:15:03 -05:00
Drew Bailey
fb1b4cdc26 Server request forwarding for Agent.Profile
Return rpc errors for profile requests, set up remote forwarding to
target leader or server id for profile requests.

server forwarding, endpoint tests
2020-01-09 15:15:03 -05:00
Drew Bailey
3575f1777f test for known pprof endpoints 2020-01-09 15:15:02 -05:00
Drew Bailey
240c0ee0ec agent pprof endpoints
wip, agent endpoint and client endpoint for pprof profiles

agent endpoint test
2020-01-09 15:15:02 -05:00
Mahmood Ali
db49137116 CLI: protect against AllocatedResources being nil 2020-01-08 17:22:05 -05:00
Charlie Voiselle
c16c69a858 Typo fix
Synopsis needs to start with uppercase to match other commands
2020-01-08 10:44:00 -05:00
James Rasell
8e6a7bce69 cli: include namespace in output when querying job stauts. (#6912) 2020-01-08 08:24:03 -05:00
Michael Schurter
fc1a1f5901 Merge pull request #6898 from hashicorp/hicks/fix-typo
Fix typo, Ethier -> Either
2020-01-02 14:52:18 -08:00
Kris Hicks
e9a9ef7d56 Fix typo, Ethier -> Either 2020-01-02 14:42:27 -08:00
Charlie Voiselle
0ff7c64e20 cli: Allow user to specify dest filename for nomad init (#6520)
* Allow user to specify dest filename for nomad init
* Create changelog entry for GH-6520
2019-12-19 14:59:12 -05:00
Drew Bailey
f57574287c Merge pull request #6746 from hashicorp/f-shutdown-delay-tg
Group shutdown_delay
2019-12-18 16:01:30 -05:00
Lang Martin
309b4ff17a test: quota: relax multierror message matching to Contains 2019-12-17 13:20:14 -05:00
Lang Martin
7ea127e64e test: build quota_apply_test, remove the tests that require ent 2019-12-17 13:20:14 -05:00
Drew Bailey
8c1f9b4128 docs for shutdown delay
update docs, address pr comments

ensure pointer is not nil

use pointer for diff tests, set vs unset
2019-12-16 11:38:35 -05:00
Drew Bailey
672b76056b shutdown delay for task groups
copy struct values

ensure groupserviceHook implements RunnerPreKillhook

run deregister first

test that shutdown times are delayed

move magic number into variable
2019-12-16 11:38:16 -05:00
Mahmood Ali
af2e2bc7ed cli: sequence cli.Ui operations
Fixes a bug where if a command flag parsing errors, the resulting error
and help usage messages get interleaved in unexpected and non-user
friendly way.

The reason is that we have flag parsing library effectively writes to
ui.Error in a goroutine.  This is problematic: first, we lose the sequencing between help
usage and error message; second, cli.Ui methods are not concurrent safe.

Here, we introduce a custom error writer that buffers result and calls
ui.Error() in the write method and in the same goroutine.

For context, we need to wrap ui.Error because it's line-oriented, while
flags library expects a io.Writer which is bytes oriented.
2019-12-16 10:08:17 -05:00
Danielle
34a5a3a6a6 Merge pull request #6828 from hashicorp/b/nomad-monitor-panic
command: error when no node is found for `monitor`
2019-12-10 14:29:32 +01:00
Danielle Lancashire
c91f8da7f0 command: error when no node is found for monitor
Currently `nomad monitor -node-id` will panic when a node-id does not
match any nodes, as there is no empty result bounds checking. Here we
return an error to the user when no nodes are found.
2019-12-10 13:10:47 +01:00
Seth Hoenig
94c60b4cfa tests: swap lib/freeport for tweaked helper/freeport
Copy the updated version of freeport (sdk/freeport), and tweak it for use
in Nomad tests. This means staying below port 10000 to avoid conflicts with
the lib/freeport that is still transitively used by the old version of
consul that we vendor. Also provide implementations to find ephemeral ports
of macOS and Windows environments.

Ports acquired through freeport are supposed to be returned to freeport,
which this change now also introduces. Many tests are modified to include
calls to a cleanup function for Server objects.

This should help quite a bit with some flakey tests, but not all of them.
Our port problems will not go away completely until we upgrade our vendor
version of consul. With Go modules, we'll probably do a 'replace' to swap
out other copies of freeport with the one now in 'nomad/helper/freeport'.
2019-12-09 08:37:32 -06:00
Michael Schurter
aaffe61863 Merge branch 'master' into release-0102 2019-12-04 14:13:34 -08:00