Commit Graph

17489 Commits

Author SHA1 Message Date
Chris Baker
16472c026a wip: remove PolicyOverride from scaling request 2020-03-24 13:57:11 +00:00
Chris Baker
94381c0da8 wip: added tests for client methods around group scaling 2020-03-24 13:57:11 +00:00
Chris Baker
4406668b53 wip: add GET endpoint for job group scaling target 2020-03-24 13:57:10 +00:00
Luiz Aoqui
ef7cb0e098 wip: add job scale endpoint in client 2020-03-24 13:57:10 +00:00
Chris Baker
1c9bac9087 wip: added job.scale rpc endpoint, needs explicit test (tested via http now) 2020-03-24 13:57:09 +00:00
Chris Baker
8102849683 wip: working on job group scaling endpoint 2020-03-24 13:55:20 +00:00
Chris Baker
a715eb7820 wip: added policy get endpoint, added UUID to policy 2020-03-24 13:55:20 +00:00
Chris Baker
7ba9b94018 wip: test for scaling policy parsing 2020-03-24 13:55:19 +00:00
Chris Baker
e6d75cab9f wip: was incorrectly parsing ScalingPolicy 2020-03-24 13:55:19 +00:00
Chris Baker
0762386a50 wip: upsert/delete scaling policies on job upsert/delete 2020-03-24 13:55:18 +00:00
Chris Baker
ee1b091e35 WIP: adding ScalingPolicy to api/structs and state store 2020-03-24 13:55:18 +00:00
Mahmood Ali
08e5a9087f Merge pull request #7414 from hashicorp/b-network-mode-change
Detect network mode change
2020-03-24 09:46:40 -04:00
Lang Martin
259311e7b0 csi: volume/plugin list should return an empty array, not nil (#7443)
* nomad/csi_endpoint: return an empty list, not nil

* nomad/csi_endpoint_test: volume list returns non-nil
2020-03-23 21:21:40 -04:00
Lang Martin
ce9dbe619f csi: the scheduler allows a job with a volume write claim to be updated (#7438)
* nomad/structs/csi: split CanWrite into health, in use

* scheduler/scheduler: expose AllocByID in the state interface

* nomad/state/state_store_test

* scheduler/stack: SetJobID on the matcher

* scheduler/feasible: when a volume writer is in use, check if it's us

* scheduler/feasible: remove SetJob

* nomad/state/state_store: denormalize allocs before Claim

* nomad/structs/csi: return errors on claim, with context

* nomad/csi_endpoint_test: new alloc doesn't look like an update

* nomad/state/state_store_test: change test reference to CanWrite
2020-03-23 21:21:04 -04:00
Drew Bailey
9e7a2b95cf Merge pull request #7444 from hashicorp/rename-auditor
make auditor interface more explicit
2020-03-23 20:23:28 -04:00
Drew Bailey
8e4d72f914 rename struct field to auditor 2020-03-23 20:09:01 -04:00
Mahmood Ali
57e7c85b0a Merge pull request #7238 from hashicorp/vendor-hcl-20190228
Update github.com/hashicorp/hcl
2020-03-23 20:00:33 -04:00
Mahmood Ali
e82b3b7146 Merge pull request #7384 from hashicorp/vendoring-tweaks-20200318
Vendor package cleanup
2020-03-23 19:48:12 -04:00
Drew Bailey
dafbf36458 make auditor interface more explicit 2020-03-23 19:32:58 -04:00
Mahmood Ali
cd12632c65 Merge pull request #7417 from hashicorp/f-allocstatus-lifecycle
cli: show lifecycle info in alloc status
2020-03-23 16:33:17 -04:00
Mahmood Ali
7bdebc09fb vendor golang.org/x/crypto/ed25519/internal/edwards25519 2020-03-23 16:29:10 -04:00
Mahmood Ali
c056d777e3 vendor: remove appengine unused package 2020-03-23 16:28:43 -04:00
Mahmood Ali
755f9a64c3 explicitly set github.com/pkg/errors version 2020-03-23 16:28:42 -04:00
Mahmood Ali
120c24461e remove unused packages 2020-03-23 16:28:11 -04:00
Mahmood Ali
784e06c6ee update github.com/pkg/errors 2020-03-23 16:27:47 -04:00
Mahmood Ali
f795ce8940 bad package 2020-03-23 16:27:24 -04:00
Mahmood Ali
ff52961918 cli: show lifecycle info in alloc status
Display task lifecycle info in `nomad alloc status <alloc_id>` output.
I chose to embed it in the Task header and only add it for tasks with
lifecycle info.

Also, I chose to order the tasks in the following order:

1. prestart non-sidecar tasks
2. prestart sidecar tasks
3. main tasks

The tasks are sorted lexicographically within each tier.

Sample output:

```
$ nomad alloc status 6ec0eb52
ID                  = 6ec0eb52-e6c8-665c-169c-113d6081309b
Eval ID             = fb0caa98
Name                = lifecycle.cache[0]
[...]

Task "init" (prestart) is "dead"
Task Resources
CPU        Memory       Disk     Addresses
0/500 MHz  0 B/256 MiB  300 MiB
[...]

Task "some-sidecar" (prestart sidecar) is "running"
Task Resources
CPU        Memory          Disk     Addresses
0/500 MHz  68 KiB/256 MiB  300 MiB
[...]

Task "redis" is "running"
Task Resources
CPU         Memory           Disk     Addresses
10/500 MHz  984 KiB/256 MiB  300 MiB
[...]
```
2020-03-23 15:57:24 -04:00
Mahmood Ali
4bcb6ad5cf Merge pull request #7437 from hashicorp/ci-build-darwin
build darwin binaries in CI
2020-03-23 15:53:06 -04:00
Mahmood Ali
9069fec857 ci: fix darwin artifact path 2020-03-23 15:42:44 -04:00
Drew Bailey
a813b4696c Merge pull request #7436 from hashicorp/b-fix-compilation
fix compilation with  correct func
2020-03-23 14:37:08 -04:00
Drew Bailey
b0fc071026 fix compilation with correct func 2020-03-23 14:32:11 -04:00
Tim Gross
d23eaed85b Merge pull request #7012 from hashicorp/f-csi-volumes
Container Storage Interface Support
2020-03-23 14:19:46 -04:00
Drew Bailey
1f32c3c7ee Merge pull request #7419 from hashicorp/f-event-pkg
Audit config, seams for enterprise audit features
2020-03-23 14:01:52 -04:00
Lang Martin
1bef8b8879 csi: add mount_options to volumes and volume requests (#7398)
Add mount_options to both the volume definition on registration and to the volume block in the group where the volume is requested. If both are specified, the options provided in the request replace the options defined in the volume. They get passed to the NodePublishVolume, which causes the node plugin to actually mount the volume on the host.

Individual tasks just mount bind into the host mounted volume (unchanged behavior). An operator can mount the same volume with different options by specifying it twice in the group context.

closes #7007

* nomad/structs/volumes: add MountOptions to volume request

* jobspec/test-fixtures/basic.hcl: add mount_options to volume block

* jobspec/parse_test: add expected MountOptions

* api/tasks: add mount_options

* jobspec/parse_group: use hcl decode not mapstructure, mount_options

* client/allocrunner/csi_hook: pass MountOptions through

client/allocrunner/csi_hook: add a VolumeMountOptions

client/allocrunner/csi_hook: drop Options

client/allocrunner/csi_hook: use the structs options

* client/pluginmanager/csimanager/interface: UsageOptions.MountOptions

* client/pluginmanager/csimanager/volume: pass MountOptions in capabilities

* plugins/csi/plugin: remove todo 7007 comment

* nomad/structs/csi: MountOptions

* api/csi: add options to the api for parsing, match structs

* plugins/csi/plugin: move VolumeMountOptions to structs

* api/csi: use specific type for mount_options

* client/allocrunner/csi_hook: merge MountOptions here

* rename CSIOptions to CSIMountOptions

* client/allocrunner/csi_hook

* client/pluginmanager/csimanager/volume

* nomad/structs/csi

* plugins/csi/fake/client: add PrevVolumeCapability

* plugins/csi/plugin

* client/pluginmanager/csimanager/volume_test: remove debugging

* client/pluginmanager/csimanager/volume: fix odd merging logic

* api: rename CSIOptions -> CSIMountOptions

* nomad/csi_endpoint: remove a 7007 comment

* command/alloc_status: show mount options in the volume list

* nomad/structs/csi: include MountOptions in the volume stub

* api/csi: add MountOptions to stub

* command/volume_status_csi: clean up csiVolMountOption, add it

* command/alloc_status: csiVolMountOption lives in volume_csi_status

* command/node_status: display mount flags

* nomad/structs/volumes: npe

* plugins/csi/plugin: npe in ToCSIRepresentation

* jobspec/parse_test: expand volume parse test cases

* command/agent/job_endpoint: ApiTgToStructsTG needs MountOptions

* command/volume_status_csi: copy paste error

* jobspec/test-fixtures/basic: hclfmt

* command/volume_status_csi: clean up csiVolMountOption
2020-03-23 13:59:25 -04:00
Tim Gross
2cebb3e66f csi: improve error messages from scheduler (#7426) 2020-03-23 13:59:25 -04:00
Tim Gross
a280cf06eb csi: stub fingerprint on instance manager shutdown (#7388)
Run the plugin fingerprint one last time with a closed client during
instance manager shutdown. This will return quickly and will give us a
correctly-populated `PluginInfo` marked as unhealthy so the Nomad
client can update the server about plugin health.
2020-03-23 13:59:25 -04:00
Tim Gross
0f9983f230 csi: dynamically update plugin registration (#7386)
Allow for faster updates to plugin status when allocations become
terminal by listening for register/deregister events from the dynamic
plugin registry (which in turn are triggered by the plugin supervisor
hook).

The deregistration function closures that we pass up to the CSI plugin
manager don't properly close over the name and type of the
registration, causing monolith-type plugins to deregister only one of
their two plugins on alloc shutdown. Rebind plugin supervisor 
deregistration targets to fix that.

Includes log message and comment improvements
2020-03-23 13:59:25 -04:00
Lang Martin
2dc95485d2 csi: ACLs for plugin endpoints (#7380)
* acl/policy: add PolicyList for global ACLs

* acl/acl: plugin policy

* acl/acl: maxPrivilege is required to allow "list"

* nomad/csi_endpoint: enforce plugin access with PolicyPlugin

* nomad/csi_endpoint: check job ACL swapped params

* nomad/csi_endpoint_test: test alloc filtering

* acl/policy: add namespace csi-register-plugin

* nomad/job_endpoint: check csi-register-plugin ACL on registration

* nomad/job_endpoint_test: add plugin job cases
2020-03-23 13:59:25 -04:00
Lang Martin
4573cbeef9 csi: implement volume ACLs (#7339)
* acl/policy: add the volume ACL policies

* nomad/csi_endpoint: enforce ACLs for volume access

* nomad/search_endpoint_oss: volume acls

* acl/acl: add plugin read as a global policy

* acl/policy: add PluginPolicy global cap type

* nomad/csi_endpoint: check the global plugin ACL policy

* nomad/mock/acl: PluginPolicy

* nomad/csi_endpoint: fix list rebase

* nomad/core_sched_test: new test since #7358

* nomad/csi_endpoint_test: use correct permissions for list

* nomad/csi_endpoint: allowCSIMount keeps ACL checks together

* nomad/job_endpoint: check mount permission for jobs

* nomad/job_endpoint_test: need plugin read, too
2020-03-23 13:59:25 -04:00
Lang Martin
9c9a0c5eb5 csi: volume ids are only unique per namespace (#7358)
* nomad/state/schema: use the namespace compound index

* scheduler/scheduler: CSIVolumeByID interface signature namespace

* scheduler/stack: SetJob on CSIVolumeChecker to capture namespace

* scheduler/feasible: pass the captured namespace to CSIVolumeByID

* nomad/state/state_store: use namespace in csi_volume index

* nomad/fsm: pass namespace to CSIVolumeDeregister & Claim

* nomad/core_sched: pass the namespace in volumeClaimReap

* nomad/node_endpoint_test: namespaces in Claim testing

* nomad/csi_endpoint: pass RequestNamespace to state.*

* nomad/csi_endpoint_test: appropriately failed test

* command/alloc_status_test: appropriately failed test

* node_endpoint_test: avoid notTheNamespace for the job

* scheduler/feasible_test: call SetJob to capture the namespace

* nomad/csi_endpoint: ACL check the req namespace, query by namespace

* nomad/state/state_store: remove deregister namespace check

* nomad/state/state_store: remove unused CSIVolumes

* scheduler/feasible: CSIVolumeChecker SetJob -> SetNamespace

* nomad/csi_endpoint: ACL check

* nomad/state/state_store_test: remove call to state.CSIVolumes

* nomad/core_sched_test: job namespace match so claim gc works
2020-03-23 13:59:25 -04:00
Tim Gross
8b38fc2183 volumes: add task environment interpolation to volume_mount (#7364) 2020-03-23 13:59:25 -04:00
Tim Gross
72309e3e88 csi: implement controller detach RPCs (#7356)
This changeset implements the remaining controller detach RPCs: server-to-client and client-to-controller. The tests also uncovered a bug in our RPC for claims which is fixed here; the volume claim RPC is used for both claiming and releasing a claim on a volume. We should only submit a controller publish RPC when the claim is new and not when it's being released.
2020-03-23 13:59:25 -04:00
Tim Gross
ccbc219609 csi: e2e tests for EBS and EFS plugins (#7343)
This changeset provides two basic e2e tests for CSI plugins targeting
common AWS use cases.

The EBS test launches the EBS plugin (controller + nodes) and registers
an EBS volume as a Nomad CSI volume. We deploy a job that writes to
the volume, stop that job, and reuse the volume for another job which
should be able to read the data written by the first job.

The EFS test launches the EFS plugin (nodes-only) and registers an EFS
volume as a Nomad CSI volume. We deploy a job that writes to the
volume, stop that job, and reuse the volume for another job which
should be able to read the data written by the first job.

The writer jobs mount the CSI volume at a location within the alloc
dir.
2020-03-23 13:59:18 -04:00
Tim Gross
0d6b1756a5 csi: make claims on volumes idempotent for the same alloc (#7328)
Nomad clients will push node updates during client restart which can
cause an extra claim for a volume by the same alloc. If an alloc
already claims a volume, we can allow it to be treated as a valid
claim and continue.
2020-03-23 13:58:30 -04:00
Tim Gross
42323c41d9 csi: add dynamicplugins registry to client state store (#7330)
In order to correctly fingerprint dynamic plugins on client restarts,
we need to persist a handle to the plugin (that is, connection info)
to the client state store.

The dynamic registry will sync automatically to the client state
whenever it receives a register/deregister call.
2020-03-23 13:58:30 -04:00
Lang Martin
13dea1e5bb csi: use ExternalID, when set, to identify volumes for outside RPC calls (#7326)
* nomad/structs/csi: new RemoteID() uses the ExternalID if set

* nomad/csi_endpoint: pass RemoteID to volume request types

* client/pluginmanager/csimanager/volume: pass RemoteID to NodePublishVolume
2020-03-23 13:58:30 -04:00
Tim Gross
17e3e882b8 csi: docstring and log message fixups (#7327)
Fix some docstring typos and fix noisy log message during client restarts.
A log for the common case where the plugin socket isn't ready yet
isn't actionable by the operator so having it at info is just noise.
2020-03-23 13:58:30 -04:00
Lang Martin
ce8625cf9c csi: change the API paths to match CLI command layout (#7325)
* command/agent/csi_endpoint: support type filter in volumes & plugins

* command/agent/http: use /v1/volume/csi & /v1/plugin/csi

* api/csi: use /v1/volume/csi & /v1/plugin/csi

* api/nodes: use /v1/volume/csi & /v1/plugin/csi

* api/nodes: not /volumes/csi, just /volumes

* command/agent/csi_endpoint: fix ot parameter parsing
2020-03-23 13:58:30 -04:00
Lang Martin
13e37865b7 csi: volumes listed in nomad node status (#7318)
* api/allocations: GetTaskGroup finds the taskgroup struct

* command/node_status: display CSI volume names

* nomad/state/state_store: new CSIVolumesByNodeID

* nomad/state/iterator: new SliceIterator type implements memdb.ResultIterator

* nomad/csi_endpoint: deal with a slice of volumes

* nomad/state/state_store: CSIVolumesByNodeID return a SliceIterator

* nomad/structs/csi: CSIVolumeListRequest takes a NodeID

* nomad/csi_endpoint: use the return iterator

* command/agent/csi_endpoint: parse query params for CSIVolumes.List

* api/nodes: new CSIVolumes to list volumes by node

* command/node_status: use the new list endpoint to print volumes

* nomad/state/state_store: error messages consider the operator

* command/node_status: include the Provider
2020-03-23 13:58:30 -04:00
Lang Martin
7083cdd8cb csi: csi-hostpath plugin unimplemented error on controller publish (#7299)
* client/allocrunner/csi_hook: tag errors

* nomad/client_csi_endpoint: tag errors

* nomad/client_rpc: remove an unnecessary error tag

* nomad/state/state_store: ControllerRequired fix intent

We use ControllerRequired to indicate that a volume should use the
publish/unpublish workflow, rather than that it has a controller. We
need to check both RequiresControllerPlugin and SupportsAttachDetach
from the fingerprint to check that.

* nomad/csi_endpoint: tag errors

* nomad/csi_endpoint_test: longer error messages, mock fingerprints
2020-03-23 13:58:30 -04:00