Commit Graph

1633 Commits

Author SHA1 Message Date
Chris Baker
b92046dba8 Merge branch 'master' of github.com:hashicorp/nomad into release-0.12.0 2020-07-08 21:16:31 +00:00
Nick Ethier
3367f6d94a docs: add CNI and host_network docs (#8391)
Co-authored-by: Seth Hoenig <shoenig@hashicorp.com>
2020-07-08 15:45:04 -04:00
Nomad Release bot
0060a5306a Generate files for 0.12.0-rc1 release 2020-07-07 03:17:05 +00:00
Nick Ethier
e94690decb ar: support opting into binding host ports to default network IP (#8321)
* ar: support opting into binding host ports to default network IP

* fix config plumbing

* plumb node address into network resource

* struct: only handle network resource upgrade path once
2020-07-06 18:51:46 -04:00
Tim Gross
144f8b88ff fix region flag vs job region handling in plan/submit (#8347) 2020-07-06 15:46:09 -04:00
Chris Baker
7f8176a188 changes to make sure that Max is present and valid, to improve error messages
* made api.Scaling.Max a pointer, so we can detect (and complain) when it is neglected
* added checks to HCL parsing that it is present
* when Scaling.Max is absent/invalid, don't return extraneous error messages during validation
* tweak to multiregion handling to ensure that the count is valid on the interpolated regional jobs

resolves #8355
2020-07-04 19:05:50 +00:00
Mahmood Ali
49a2c65d6a tests: make testagent shutdown idempotent
Avoid double freeing ports if an agent.Shutdown() is called multiple
times.
2020-07-03 09:16:01 -04:00
Lang Martin
bde973e366 api: nomad debug new /agent/host (#8325)
* command/agent/host: collect host data, multi platform

* nomad/structs/structs: new HostDataRequest/Response

* client/agent_endpoint: add RPC endpoint

* command/agent/agent_endpoint: add Host

* api/agent: add the Host endpoint

* nomad/client_agent_endpoint: add Agent Host with forwarding

* nomad/client_agent_endpoint: use findClientConn

This changes forwardMonitorClient and forwardProfileClient to use
findClientConn, which was cribbed from the common parts of those
funcs.

* command/debug: call agent hosts

* command/agent/host: eliminate calling external programs
2020-07-02 09:51:25 -04:00
Tim Gross
95799663b8 csi: add -force flag to volume deregister (#8295)
The `nomad volume deregister` command currently returns an error if the volume
has any claims, but in cases where the claims can't be dropped because of
plugin errors, providing a `-force` flag gives the operator an escape hatch.

If the volume has no allocations or if they are all terminal, this flag
deletes the volume from the state store, immediately and implicitly dropping
all claims without further CSI RPCs. Note that this will not also
unmount/detach the volume, which we'll make the responsibility of a separate
`nomad volume detach` command.
2020-07-01 12:17:51 -04:00
Tim Gross
f18849623b update compiled static assets 2020-06-24 16:37:13 -04:00
Tim Gross
cede9b5fb4 multiregion validation fixes (#8265)
Multi-region jobs need to bypass validating counts otherwise we get spurious
warnings in Job.Plan.
2020-06-24 12:18:51 -04:00
Seth Hoenig
3a1129c6b7 connect/native: fixup command/agent/consul/connect test cases 2020-06-24 09:05:56 -05:00
Seth Hoenig
520d35e085 consul/connect: split connect native flag and task in service 2020-06-23 10:22:22 -05:00
Seth Hoenig
7e8d5c2392 consul/connect: add support for running connect native tasks
This PR adds the capability of running Connect Native Tasks on Nomad,
particularly when TLS and ACLs are enabled on Consul.

The `connect` stanza now includes a `native` parameter, which can be
set to the name of task that backs the Connect Native Consul service.

There is a new Client configuration parameter for the `consul` stanza
called `share_ssl`. Like `allow_unauthenticated` the default value is
true, but recommended to be disabled in production environments. When
enabled, the Nomad Client's Consul TLS information is shared with
Connect Native tasks through the normal Consul environment variables.
This does NOT include auth or token information.

If Consul ACLs are enabled, Service Identity Tokens are automatically
and injected into the Connect Native task through the CONSUL_HTTP_TOKEN
environment variable.

Any of the automatically set environment variables can be overridden by
the Connect Native task using the `env` stanza.

Fixes #6083
2020-06-22 14:07:44 -05:00
Michael Schurter
6886edd1fc Merge pull request #8208 from hashicorp/f-multi-network
multi-interface network support
2020-06-19 15:46:48 -07:00
Mahmood Ali
0821c0a7e3 Merge pull request #8082 from hashicorp/f-raft-multipler
Implement raft multipler flag
2020-06-19 10:04:59 -04:00
Nick Ethier
ad8ced3873 multi-interface network support 2020-06-19 09:42:10 -04:00
Mahmood Ali
9c2c03724b Merge pull request #8192 from hashicorp/f-status-allnamespaces-2
CLI Allow querying all namespaces for jobs and allocations - Try 2
2020-06-18 20:16:52 -04:00
Nick Ethier
e9ff8a8daa Task DNS Options (#7661)
Co-Authored-By: Tim Gross <tgross@hashicorp.com>
Co-Authored-By: Seth Hoenig <shoenig@hashicorp.com>
2020-06-18 11:01:31 -07:00
Mahmood Ali
2f9fc04e05 use '*' to indicate all namespaces
This reverts the introduction of AllNamespaces parameter that was merged
earlier but never got released.
2020-06-17 16:27:43 -04:00
Tim Gross
45c2e875f8 multiregion: change AutoRevert to OnFailure 2020-06-17 11:05:45 -04:00
Tim Gross
02209b1371 Multiregion job registration
Integration points for multiregion jobs to be registered in the enterprise
version of Nomad:
* hook in `Job.Register` for enterprise to send job to peer regions
* remove monitoring from `nomad job run` and `nomad job stop` for multiregion jobs
2020-06-17 11:04:58 -04:00
Tim Gross
e620ff7f0f use constants from http package 2020-06-17 11:04:02 -04:00
Tim Gross
c0974fe9af multiregion CLI: nomad deployment unblock 2020-06-17 11:03:44 -04:00
Drew Bailey
ce8f230cab Multiregion deploy status and job status CLI 2020-06-17 11:03:34 -04:00
Tim Gross
f64f5a645c Multiregion structs
Initial struct definitions, jobspec parsing, validation, and conversion
between Nomad structs and API structs for multi-region deployments.
2020-06-17 11:00:14 -04:00
Chris Baker
8590ae3d60 wip: added PreserveCounts to struct.JobRegisterRequest, development test for Job.Register 2020-06-16 18:45:17 +00:00
Mahmood Ali
519447d1c0 tests: prefix agent logs to identify agent sources 2020-06-07 16:38:11 -04:00
Mahmood Ali
97fb054c9d basic snapshot restore 2020-06-07 15:46:23 -04:00
Mahmood Ali
3b04afee2e Merge pull request #8047 from hashicorp/f-snapshot-save
API for atomic snapshot backups
2020-06-01 07:55:16 -04:00
Mahmood Ali
65937ffd21 Merge pull request #8001 from hashicorp/f-jobs-list-across-nses
endpoint to expose all jobs across all namespaces
2020-05-31 21:28:03 -04:00
Mahmood Ali
a9cf263888 implement raft multiplier 2020-05-31 12:24:27 -04:00
Drew Bailey
967facc55e removes pro tags (#8014) 2020-05-28 15:40:17 -04:00
Drew Bailey
7fc495e30e Oss license support for ent builds (#8054)
* changes necessary to support oss licesning shims

revert nomad fmt changes

update test to work with enterprise changes

update tests to work with new ent enforcements

make check

update cas test to use scheduler algorithm

back out preemption changes

add comments

* remove unused method
2020-05-27 13:46:52 -04:00
Mahmood Ali
f4fcc1c02c Endpoint for snapshotting server state 2020-05-21 20:04:38 -04:00
James Rasell
87d51e6898 api: return custom error if API attempts to decode empty body. 2020-05-19 15:46:31 +02:00
Mahmood Ali
9813a55d44 endpoint to expose all jobs across all namespaces
Allow a `/v1/jobs?all_namespaces=true` to list all jobs across all
namespaces.  The returned list is to contain a `Namespace` field
indicating the job namespace.

If ACL is enabled, the request token needs to be a management token or
have `namespace:list-jobs` capability on all existing namespaces.
2020-05-18 13:50:46 -04:00
Nomad Release bot
807cfebe90 Generate files for 0.11.2 release 2020-05-14 20:49:42 +00:00
Mahmood Ali
1d43126e00 always check default_scheduler_config config
Also, avoid early return on validation to avoid masking some validation
bugs in dev setup.
2020-05-14 14:16:12 -04:00
Lang Martin
cd6d34425f server: stop after client disconnect (#7939)
* jobspec, api: add stop_after_client_disconnect

* nomad/state/state_store: error message typo

* structs: alloc methods to support stop_after_client_disconnect

1. a global AllocStates to track status changes with timestamps. We
   need this to track the time at which the alloc became lost
   originally.

2. ShouldClientStop() and WaitClientStop() to actually do the math

* scheduler/reconcile_util: delayByStopAfterClientDisconnect

* scheduler/reconcile: use delayByStopAfterClientDisconnect

* scheduler/util: updateNonTerminalAllocsToLost comments

This was setup to only update allocs to lost if the DesiredStatus had
already been set by the scheduler. It seems like the intention was to
update the status from any non-terminal state, and not all lost allocs
have been marked stop or evict by now

* scheduler/testing: AssertEvalStatus just use require

* scheduler/generic_sched: don't create a blocked eval if delayed

* scheduler/generic_sched_test: several scheduling cases
2020-05-13 16:39:04 -04:00
Tim Gross
a28f18ea1d csi: support Secrets parameter in CSI RPCs (#7923)
CSI plugins can require credentials for some publishing and
unpublishing workflow RPCs. Secrets are configured at the time of
volume registration, stored in the volume struct, and then passed
around as an opaque map by Nomad to the plugins.
2020-05-11 17:12:51 -04:00
Mahmood Ali
3ee7379b10 Merge pull request #7912 from hashicorp/f-scheduler-algorithm-followup
Scheduler Algorithm Defaults handling and docs
2020-05-11 09:30:58 -04:00
Tim Gross
8192aa602e Periodic GC for volume claims (#7881)
This changeset implements a periodic garbage collection of CSI volumes
with missing allocations. This can happen in a scenario where a node
update fails partially and the allocation updates are written to raft
but the evaluations to GC the volumes are dropped. This feature will
cover this edge case and ensure that upgrades from 0.11.0 and 0.11.1
get any stray claims cleaned up.
2020-05-11 08:20:50 -04:00
Mahmood Ali
ad72ee93c9 handle upgrade path and defaults
Ensure that `""` Scheduler Algorithm gets explicitly set to binpack on
upgrades or on API handling when user misses the value.

The scheduler already treats `""` value as binpack.  This PR merely
ensures that the operator API returns the effective value.
2020-05-09 12:34:08 -04:00
Tim Gross
9990650b52 periodic GC for CSI plugins (#7878)
This changeset implements a periodic garbage collection of unused CSI
plugins. Plugins are self-cleaning when the last allocation for a
plugin is stopped, but this feature will cover any missing edge cases
and ensure that upgrades from 0.11.0 and 0.11.1 get any stray plugins
cleaned up.
2020-05-06 16:49:12 -04:00
Mahmood Ali
5078e0cfed tests and some clean up 2020-05-01 13:13:30 -04:00
Charlie Voiselle
6571ccefbc Add SchedulerAlgorithm to SchedulerConfig 2020-05-01 13:13:29 -04:00
Drew Bailey
105345ab60 allow test to check if server is enterprise 2020-04-30 14:46:21 -04:00
Drew Bailey
d15927bf9e hcl fmt from editor
license cli formatting, license endpoints ent only

test oss error

type assertions
2020-04-30 14:46:18 -04:00
Mahmood Ali
41bec868a8 http: adjust log level for request failure
Failed requests due to API client errors are to be marked as DEBUG.

The Error log level should be reserved to signal problems with the
cluster and are actionable for nomad system operators.  Logs due to
misbehaving API clients don't represent a system level problem and seem
spurius to nomad maintainers at best.  These log messages can also be
attack vectors for deniel of service attacks by filling servers disk
space with spurious log messages.
2020-04-22 16:19:59 -04:00