18187 Commits

Author SHA1 Message Date
Tim Gross
eb2f77a011 csi: use a blocking initial connection with timeout
The plugin supervisor lazily connects to plugins, but this means we
only get "Unavailable" back from the gRPC call in cases where the
plugin can never be reached (for example, if the Nomad client has the
wrong permissions for the socket).

This changeset improves the operator experience by switching to a
blocking `DialWithContext`. It eagerly connects so that we can
validate the connection is real and get a "failed to open" error in
cases where Nomad can't establish the initial connection.
2020-05-14 15:59:19 -04:00
Tim Gross
c514a5527a csi: refactor internal client field name to ExternalID (#7958)
The CSI plugin RPCs require the use of the storage provider's volume
ID, rather than the user-defined volume ID. Although changing the RPCs
to use the field name `ExternalID` risks breaking backwards
compatibility, we can use the `ExternalID` name internally for the
client and only use `VolumeID` at the RPC boundaries.
2020-05-14 11:56:07 -04:00
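The naming split described above can be illustrated as follows. This is a sketch under stated assumptions: `clientVolume`, `nodePublishRequest`, and `toRPCRequest` are hypothetical names invented for the example and are not Nomad's actual types. The point is only that the internal type carries `ExternalID` while the wire-facing message keeps the `VolumeID` field name.

```go
package main

import "fmt"

// clientVolume is a hypothetical internal client type. It keeps the
// user-defined ID and the storage provider's ID as separate, clearly
// named fields.
type clientVolume struct {
	ID         string // user-defined volume ID
	ExternalID string // storage provider's volume ID
}

// nodePublishRequest is a hypothetical stand-in for an RPC message. The
// field stays named VolumeID so the wire format does not change.
type nodePublishRequest struct {
	VolumeID string
}

// toRPCRequest translates at the RPC boundary: the provider's ID is what
// the plugin expects in VolumeID.
func toRPCRequest(v clientVolume) nodePublishRequest {
	return nodePublishRequest{VolumeID: v.ExternalID}
}

func main() {
	v := clientVolume{ID: "my-db-volume", ExternalID: "vol-0abc123"}
	fmt.Println(toRPCRequest(v).VolumeID)
}
```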
Tim Gross
89972866d3 e2e: upgrade CNI to 0.8.6 (#7956) 2020-05-14 09:29:11 -04:00
Chris Baker
54987b3e58 Merge pull request #7952 from hashicorp/d/ui-changelog-0.11.2
Changelog additions for bugs and improvements to the UI
2020-05-13 18:54:10 -05:00
Chris Baker
1f4a9dfa7d Merge pull request #7915 from hashicorp/b-scaling-api-missing-count
the api.ScalingEvent struct was missing the .Count field
2020-05-13 18:52:38 -05:00
Michael Lange
caff2b096d Changelog additions for bugs and improvements to the UI 2020-05-13 15:40:10 -07:00
Chris Baker
50e060809e added changelog entry 2020-05-13 20:46:06 +00:00
Chris Baker
e6b14ed35c the api.ScalingEvent struct was missing the .Count field 2020-05-13 20:44:53 +00:00
Chris Baker
5bd876acb0 Merge pull request #7950 from hashicorp/docs-dst
docs: clarify periodic dst behavior
2020-05-13 15:44:41 -05:00
Chris Baker
770a7a60ea Merge pull request #7948 from hashicorp/changelog_stop_after_client_disconnect
changelog entry for `stop_after_client_disconnect`
2020-05-13 15:43:17 -05:00
Tim Gross
55cf0a6e43 changelog entry for stop_after_client_disconnect 2020-05-13 16:41:59 -04:00
Seth Hoenig
da944fb30c changelog entry for aws cpu perf (#7949)
* changelog entry for `stop_after_client_disconnect`
* changelog entry for aws cpu perf

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2020-05-13 16:39:34 -04:00
Tim Gross
2209ef3342 docs for stop_after_client_disconnect stanza (#7938) 2020-05-13 16:39:24 -04:00
Lang Martin
cd6d34425f server: stop after client disconnect (#7939)
* jobspec, api: add stop_after_client_disconnect

* nomad/state/state_store: error message typo

* structs: alloc methods to support stop_after_client_disconnect

1. a global AllocStates to track status changes with timestamps. We
   need this to track the time at which the alloc became lost
   originally.

2. ShouldClientStop() and WaitClientStop() to actually do the math

* scheduler/reconcile_util: delayByStopAfterClientDisconnect

* scheduler/reconcile: use delayByStopAfterClientDisconnect

* scheduler/util: updateNonTerminalAllocsToLost comments

This was set up to only update allocs to lost if the DesiredStatus had
already been set by the scheduler. It seems like the intention was to
update the status from any non-terminal state, and not all lost allocs
have been marked stop or evict by now.

* scheduler/testing: AssertEvalStatus just use require

* scheduler/generic_sched: don't create a blocked eval if delayed

* scheduler/generic_sched_test: several scheduling cases
2020-05-13 16:39:04 -04:00
Michael Schurter
1b3e969355 docs: clarify periodic dst behavior 2020-05-13 13:24:35 -07:00
Chris Baker
e3f9bebad1 Merge pull request #7945 from hashicorp/docs-cronexpr-dst-fix
Document daylight saving handling
2020-05-13 14:13:29 -05:00
Chris Baker
ab8057903a changelog: reordered alphabetically 2020-05-13 19:12:21 +00:00
Michael Lange
1e7f1871bd Merge pull request #7942 from hashicorp/b-ui/csi-alloc-relationships
UI: CSI Bug, Imperatively load controller/node plugin allocations
2020-05-13 10:20:09 -07:00
Michael Lange
5456147e94 Merge pull request #7911 from hashicorp/f-ui/csi-availability-gauge
UI: CSI Availability Gauges
2020-05-13 10:18:17 -07:00
Mahmood Ali
a8e2da894c update changelog
[ci skip]
2020-05-13 12:54:10 -04:00
Mahmood Ali
71037b454b Merge pull request #7947 from hashicorp/b-docker-image-cleanup
docker: Fix docker image gc tracking
2020-05-13 12:50:59 -04:00
Mahmood Ali
72c08e0591 docker: Fix docker image gc tracking
This fixes a bug where docker images may not be GCed.  The cause of the
bug is that we track the task using `task.ID+task.Name` on task start
but remove on plain `task.ID`.

This harmonizes the two paths by using `task.ID`, as it's unique enough
and it's also used in the `loadImage` path (the path taken when loading an
image from a local tarball instead of Docker Hub).
2020-05-13 12:33:17 -04:00
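The class of bug described above, adding a reference under one key and removing it under another, can be reproduced with a small reference tracker. This is a hypothetical sketch: `imageRefs` and its methods are invented for the example and do not mirror the Docker driver's actual coordinator.

```go
package main

import "fmt"

// imageRefs is a hypothetical reference tracker: image ID -> set of caller IDs.
// An image is only eligible for GC once its set of callers is empty.
type imageRefs struct {
	refs map[string]map[string]struct{}
}

func newImageRefs() *imageRefs {
	return &imageRefs{refs: map[string]map[string]struct{}{}}
}

// add records that callerID is using image.
func (r *imageRefs) add(image, callerID string) {
	if r.refs[image] == nil {
		r.refs[image] = map[string]struct{}{}
	}
	r.refs[image][callerID] = struct{}{}
}

// remove drops callerID's reference and returns how many references remain.
func (r *imageRefs) remove(image, callerID string) int {
	delete(r.refs[image], callerID)
	return len(r.refs[image])
}

func main() {
	taskID, taskName := "task-1", "web"

	// Mismatched keys: the reference added on start is never removed.
	buggy := newImageRefs()
	buggy.add("img", taskID+taskName)
	fmt.Println("remaining with mismatched keys:", buggy.remove("img", taskID))

	// Consistent keys: the reference is released and the image can be GCed.
	fixed := newImageRefs()
	fixed.add("img", taskID)
	fmt.Println("remaining with consistent keys:", fixed.remove("img", taskID))
}
```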
Michael Lange
0a258b1a9f Test coverage for the gauge chart 2020-05-13 08:36:05 -07:00
Michael Lange
b3475add53 Adjust gauge chart stories 2020-05-13 08:36:05 -07:00
Michael Lange
83cd585682 Add gauge charts to the plugin detail page to measure availability 2020-05-13 08:36:05 -07:00
Michael Lange
df3c24f968 Bottom aligned columns variant 2020-05-13 08:36:05 -07:00
Michael Lange
4e7354117a Add gauge chart stories 2020-05-13 08:36:05 -07:00
Michael Lange
fe26e904bb Style the gauge chart component 2020-05-13 08:36:05 -07:00
Michael Lange
72a928c5ec Treat null and undefined equally 2020-05-13 08:36:04 -07:00
Michael Lange
7e93f9033d Refactor metrics styles to allow for standalone metrics 2020-05-13 08:36:04 -07:00
Michael Lange
dfc45f4dcd Gauge chart component 2020-05-13 08:36:04 -07:00
Drew Bailey
e72effc4ce Merge pull request #7946 from hashicorp/ci/pin-golangci-lint
pin golangci-lint dep to 1.24.0
2020-05-13 10:45:26 -04:00
Drew Bailey
f96960cab7 pin golangci-lint dep to 1.24.0 2020-05-13 10:43:39 -04:00
Mahmood Ali
3cb555144c Merge pull request #7944 from hashicorp/b-health-checks-after-task-health
Allocs are healthy if service checks get healthy before task health
2020-05-13 09:34:03 -04:00
Mahmood Ali
31a8a861ea document daylight saving change 2020-05-13 08:21:19 -04:00
Mahmood Ali
22b65f22d6 allochealth: Fix when check health precedes task health
Fix a bug where if the alloc check becomes healthy before the task health, the
alloc may never be considered healthy.
2020-05-13 07:44:39 -04:00
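One way to make health evaluation independent of event order is to re-evaluate overall health whenever either signal changes. The sketch below is an illustration of that idea only; `tracker` and its methods are hypothetical names, and the actual fix in the alloc health tracker may be structured differently.

```go
package main

import "fmt"

// tracker is a hypothetical, simplified health tracker that records the
// latest task and check health separately.
type tracker struct {
	tasksHealthy  bool
	checksHealthy bool
}

// healthy reports overall health: both signals must be healthy.
func (t *tracker) healthy() bool {
	return t.tasksHealthy && t.checksHealthy
}

// setChecksHealthy records check health and re-evaluates overall health.
func (t *tracker) setChecksHealthy(h bool) bool {
	t.checksHealthy = h
	return t.healthy()
}

// setTasksHealthy records task health and re-evaluates overall health, so a
// check result that arrived earlier is not lost.
func (t *tracker) setTasksHealthy(h bool) bool {
	t.tasksHealthy = h
	return t.healthy()
}

func main() {
	t := &tracker{}
	fmt.Println(t.setChecksHealthy(true)) // checks arrive first
	fmt.Println(t.setTasksHealthy(true))  // tasks become healthy afterwards
}
```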
Mahmood Ali
d4e4563d50 tests: tests for health check sequencing
Add a failing test to show that if an alloc check is marked healthy before the
alloc tasks start up, the alloc may be forever considered unhealthy.
2020-05-13 07:43:00 -04:00
Michael Lange
30715b8b37 Test coverage for the plugin-allocation-row 2020-05-12 21:30:33 -07:00
Michael Lange
db8e43949d Don't double load freshly loaded allocations 2020-05-12 21:30:33 -07:00
Michael Lange
32b4e5e8ab Properly manage the lifecycle of allocations for storage nodes and controllers 2020-05-12 21:30:33 -07:00
Michael Lange
cb6b9dc1f2 Key allocation rows to prevent unnecessary re-renders 2020-05-12 21:30:32 -07:00
Mahmood Ali
cf47153b52 Merge pull request #7894 from hashicorp/b-cronexpr-dst-fix
Fix Daylight saving transition handling
2020-05-12 16:36:11 -04:00
Mahmood Ali
1e7ebf5f55 vendor: use tagged cronexpr, v1.1.0
Also, update to the version with modification notice
2020-05-12 16:20:00 -04:00
Jeff Escalante
5bc75fdbde fix formatting error on preemption docs page 2020-05-12 14:08:55 -04:00
Drew Bailey
c58774f26c Merge pull request #7936 from josegonzalez/patch-1
docs: add note that only system job preemption is available in OSS
2020-05-12 13:29:47 -04:00
Jose Diaz-Gonzalez
675d54a3c2 Update website/pages/docs/internals/scheduling/preemption.mdx
Co-authored-by: Drew Bailey <2614075+drewbailey@users.noreply.github.com>
2020-05-12 13:21:15 -04:00
Jose Diaz-Gonzalez
c0146fa8ca docs: add note that only system job preemption is available in OSS 2020-05-12 13:02:13 -04:00
Mahmood Ali
99bf86a48a update changelog (#7934) 2020-05-12 12:22:22 -04:00
Mahmood Ali
dd06346435 Merge pull request #7932 from hashicorp/f-docker-custom-runtimes
Docker runtimes
2020-05-12 11:59:36 -04:00
Mahmood Ali
44c93e3598 update tests 2020-05-12 11:39:09 -04:00