18006 Commits

Author SHA1 Message Date
Charlie Voiselle
6571ccefbc Add SchedulerAlgorithm to SchedulerConfig 2020-05-01 13:13:29 -04:00
Drew Bailey
1bb731bdc5 Merge pull request #7778 from hashicorp/license-cli
License cli
2020-05-01 08:51:40 -04:00
Michael Schurter
cbcd3eb06a Merge pull request #7730 from hashicorp/b-reserved-scoring
core: fix node reservation scoring
2020-04-30 14:48:36 -07:00
Michael Schurter
e3cba0c5be Merge branch 'master' into b-reserved-scoring 2020-04-30 14:48:14 -07:00
Michael Schurter
26d34f088e Update website/pages/docs/upgrade/upgrade-specific.mdx
Co-authored-by: Alex Dadgar <alex@hashicorp.com>
2020-04-30 14:47:12 -07:00
Tim Gross
f592dd9021 csi: check returned volume capability validation (#7831)
This changeset corrects handling of the `ValidateVolumeCapabilities`
response:

* The CSI spec for the `ValidateVolumeCapabilities` requires that
  plugins only set the `Confirmed` field if they've validated all
  capabilities. The Nomad client improperly assumes that the lack of a
  `Confirmed` field should be treated as a failure. This breaks the
  Azure and Linode block storage plugins, which don't set this
  optional field.

* The CSI spec also requires that the orchestrator check the validation
  responses to guard against older versions of a plugin reporting
  "valid" for newer fields it doesn't understand.
2020-04-30 17:12:32 -04:00
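
The two rules described in commit f592dd9021 can be sketched as follows. This is a minimal illustration, not Nomad's actual code: the `capability` and `validationResponse` types and the `checkValidation` function are hypothetical stand-ins for the real CSI bindings, and treating a non-empty message on an unconfirmed response as a failure is an assumption made for this sketch.

```go
package main

import "fmt"

// capability is a hypothetical, simplified stand-in for a CSI volume capability.
type capability struct {
	AccessMode string
	FSType     string
}

// validationResponse is a hypothetical stand-in for the plugin's reply.
// Confirmed is empty when the plugin did not set the optional field.
type validationResponse struct {
	Confirmed []capability
	Message   string
}

// checkValidation returns nil when the response is acceptable.
func checkValidation(requested []capability, resp validationResponse) error {
	if len(resp.Confirmed) == 0 {
		// A missing Confirmed field alone is not a failure.
		// Assumption: a message without confirmation signals a problem.
		if resp.Message != "" {
			return fmt.Errorf("validation failed: %s", resp.Message)
		}
		return nil
	}
	// Confirmed is set: every requested capability must appear in it, which
	// guards against a plugin reporting "valid" for fields it ignored.
	for _, want := range requested {
		found := false
		for _, got := range resp.Confirmed {
			if got == want {
				found = true
				break
			}
		}
		if !found {
			return fmt.Errorf("capability %+v was not confirmed", want)
		}
	}
	return nil
}

func main() {
	req := []capability{{AccessMode: "single-node-writer", FSType: "ext4"}}
	fmt.Println(checkValidation(req, validationResponse{}))
	fmt.Println(checkValidation(req, validationResponse{
		Confirmed: []capability{{AccessMode: "multi-node-reader", FSType: "ext4"}},
	}))
}
```

The first call models a plugin that leaves the optional field unset and is accepted; the second models a plugin that confirms a different capability than the one requested and is rejected.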
Tim Gross
5731be4b79 csi: restore long timeout for controller plugins (#7840)
During MVP development, we reduced the timeout for controller plugins
to avoid long hangs in GC workers. But now that this work has been
moved to the volume watcher, we can restore the original timeout which
is better suited for the characteristic timescales of some cloud
provider APIs and better matches the behavior of k8s.
2020-04-30 17:12:05 -04:00
Tim Gross
610e0a6762 csi: ensure Read/WriteAllocs aren't released early (#7841)
We should only remove the `ReadAllocs`/`WriteAllocs` values for a
volume after the claim has entered the "ready to free"
state. If they are removed earlier, the volume will still eventually be
released as expected, but querying the volume API will show the volume as
released before the controller unpublish has finished, which can cause a
race with starting new jobs.

Test updates are to cover cases where we're dropping claims but not
running through the whole reaping process.
2020-04-30 17:11:31 -04:00
Drew Bailey
5372a52223 properly format license output 2020-04-30 14:46:26 -04:00
Drew Bailey
105345ab60 allow test to check if server is enterprise 2020-04-30 14:46:21 -04:00
Drew Bailey
3876c1a68d add license reset command to commands
help text formatting

remove reset

no signed option
2020-04-30 14:46:20 -04:00
Drew Bailey
7561bf97ff test all commands oss err 2020-04-30 14:46:19 -04:00
Drew Bailey
d15927bf9e hcl fmt from editor
license cli formatting, license endpoints ent only

test oss error

type assertions
2020-04-30 14:46:18 -04:00
Drew Bailey
8b222d79d5 license cli commands
cli changes, formatting
2020-04-30 14:46:17 -04:00
Jasmine Dahilig
c10ac6394f UI: Add representations for task lifecycles (#7659)
This adds details about task lifecycles to allocations, task groups,
and tasks. It includes a live-updating timeline-like chart on allocations.
2020-04-30 08:15:19 -05:00
Tim Gross
775de0d1c2 csi: move volume claim release into volumewatcher (#7794)
This changeset adds a subsystem to run on the leader, similar to the
deployment watcher or node drainer. The `Watcher` performs a blocking
query on updates to the `CSIVolumes` table and triggers reaping of
volume claims.

This will avoid tying up scheduling workers by immediately sending
volume claim workloads into their own loop, rather than blocking the
scheduling workers in the core GC job doing things like talking to CSI
controllers.

The volume watcher is enabled on leader step-up and disabled on leader
step-down.

The volume claim GC mechanism now makes an empty claim RPC for the
volume to trigger an index bump. That in turn unblocks the blocking
query in the volume watcher so it can assess which claims can be
released for a volume.
2020-04-30 09:13:00 -04:00
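
The blocking-query and index-bump interaction described in commit 775de0d1c2 can be illustrated with a small sketch. This is not the volume watcher's implementation: `volumeTable`, `bump`, and `blockUntilAfter` are hypothetical names, and a `sync.Cond` stands in for the state store's watch mechanism.

```go
package main

import (
	"fmt"
	"sync"
)

// volumeTable is a hypothetical stand-in for a state store table
// that tracks the index of its last modification.
type volumeTable struct {
	mu    sync.Mutex
	cond  *sync.Cond
	index uint64
}

func newVolumeTable() *volumeTable {
	t := &volumeTable{}
	t.cond = sync.NewCond(&t.mu)
	return t
}

// bump records a write to the table and wakes any blocked readers.
// In this sketch it plays the role of the empty claim RPC.
func (t *volumeTable) bump() {
	t.mu.Lock()
	t.index++
	t.mu.Unlock()
	t.cond.Broadcast()
}

// blockUntilAfter blocks until the table index exceeds minIndex,
// then returns the new index.
func (t *volumeTable) blockUntilAfter(minIndex uint64) uint64 {
	t.mu.Lock()
	defer t.mu.Unlock()
	for t.index <= minIndex {
		t.cond.Wait()
	}
	return t.index
}

func main() {
	table := newVolumeTable()
	go table.bump() // the GC side triggers an index bump
	idx := table.blockUntilAfter(0)
	fmt.Println("watcher unblocked at index", idx)
}
```

The watcher side does no polling: it sleeps until a write moves the index past the last value it saw, which is what lets the GC job hand off work by making a write instead of doing the work itself.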
Michael Lange
25a74ece57 Merge pull request #7820 from hashicorp/b-ui/ui-log-races
UI: Log streaming bug fix medley
2020-04-29 18:06:47 -07:00
Michael Lange
9557475233 Make the no connection error on the logs page dismissable 2020-04-29 17:36:17 -07:00
Michael Lange
e5a5fc7744 Fix race condition where stdout and stderr requests can cause a no connection error
This would happen because a no connection error happens after the second request fails, but
that's because it's assumed the second request is to a server node. However, if a user clicks
stderr fast enough, the first and second requests are both to the client node. This changes
the logic to check if the request is to the server before deeming log streaming a total failure.
2020-04-29 17:36:17 -07:00
Michael Lange
e186554651 Clicking stdout/stderr when already on that tab is now a noop 2020-04-29 17:36:16 -07:00
Michael Lange
fab6fcbd88 Abort log fetch request when failing over from client to server
Typically a failover means that the client can't be reached. However, if
the client does eventually return after the timeout period, the log will
stream indefinitely. This fixes that using an API that wasn't broadly
available at the time this was first written.
2020-04-29 17:34:49 -07:00
Michael Lange
6af31fed7a Always pass credential in fetch requests, but also treat options reasonably
Now options can be provided without also having to remember to pass
credentials. This is convenient for abort controller signals.
2020-04-29 17:34:49 -07:00
Seth Hoenig
a12eb8fdae Merge pull request #7828 from hashicorp/b-ec2-speeds
env_aws: use best-effort lookup table for CPU performance in EC2
2020-04-29 11:25:54 -06:00
Seth Hoenig
a869394a03 env_aws: combine 3 log lines into 1 2020-04-29 10:47:36 -06:00
Seth Hoenig
0d5d1781d3 env_aws: downgrade log line
Co-Authored-By: Mahmood Ali <mahmood@hashicorp.com>
2020-04-29 10:34:26 -06:00
Seth Hoenig
f47c57fa2d env_aws: fixup log line
Co-Authored-By: Mahmood Ali <mahmood@hashicorp.com>
2020-04-29 10:33:53 -06:00
Tim Gross
36e3d13b2b csi: read-repair CSI volume claims (#7824)
The `CSIVolumeClaim` fields were added after 0.11.1, so claims made
before that may be missing the value. Repair this when we read the
volume out of the state store.

The `NodeID` field was added after 0.11.0, so we need to ensure it's
been populated during upgrades from 0.11.0.
2020-04-29 11:57:19 -04:00
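
The read-repair idea in commit 36e3d13b2b can be sketched as below. All types and the `repairReadClaims` function are hypothetical simplifications. The sketch assumes that a volume tracks allocations and claims in maps keyed by allocation ID, and that a missing claim or `NodeID` can be rebuilt from the allocation.

```go
package main

import "fmt"

// claim, alloc, and volume are hypothetical, simplified types.
type claim struct {
	AllocationID string
	NodeID       string
	Mode         string
}

type alloc struct {
	ID     string
	NodeID string
}

type volume struct {
	ReadAllocs map[string]*alloc
	ReadClaims map[string]*claim
}

// repairReadClaims fills in claims that older versions never wrote.
// It is meant to run each time a volume is read from the store.
func repairReadClaims(v *volume) {
	if v.ReadClaims == nil {
		v.ReadClaims = make(map[string]*claim)
	}
	for id, a := range v.ReadAllocs {
		c, ok := v.ReadClaims[id]
		if !ok || c == nil {
			// Claim predates the claim fields: rebuild it.
			c = &claim{AllocationID: id, Mode: "read"}
			v.ReadClaims[id] = c
		}
		if c.NodeID == "" && a != nil {
			// Claim predates the NodeID field: copy it from the alloc.
			c.NodeID = a.NodeID
		}
	}
}

func main() {
	v := &volume{ReadAllocs: map[string]*alloc{
		"alloc-1": {ID: "alloc-1", NodeID: "node-a"},
	}}
	repairReadClaims(v)
	fmt.Printf("%+v\n", *v.ReadClaims["alloc-1"])
}
```

Repairing on read means no separate migration step is needed: stale records are corrected the first time they are loaded after an upgrade.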
Buck Doyle
d913f05503 UI: Fix exec popup link for job id ≠ name (#7815)
This closes #7814. It makes URL-generation more central and changes
the exec URL to include job id instead of name.
2020-04-29 07:54:04 -05:00
Mahmood Ali
5b8b86f3f8 Merge pull request #7829 from ccn/vendor-go-dockerclient-v1.6.5
Vendor: update fsouza/go-dockerclient to v1.6.5
2020-04-29 08:48:40 -04:00
ccn
efd04510d6 Remove unused internal subpackages 2020-04-29 20:21:44 +08:00
ccn
faab1cd76e Vendor: update fsouza/go-dockerclient to v1.6.5 2020-04-29 18:54:55 +08:00
Seth Hoenig
9230fa9eff env_aws: use best-effort lookup table for CPU performance in EC2
Fixes #7681

The current behavior of the CPU fingerprinter in AWS is that it
reads the **current** speed from `/proc/cpuinfo` (`CPU MHz` field).

This is because the max CPU frequency is not available by reading
anything on the EC2 instance itself. Normally on Linux one would
look at e.g. `sys/devices/system/cpu/cpuN/cpufreq/cpuinfo_max_freq`
or perhaps parse the values from the `CPU max MHz` field in
`/proc/cpuinfo`, but those values are not available.

Furthermore, no metadata about the CPU is made available in the
EC2 metadata service.
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-categories.html

Since `go-psutil` cannot determine the max CPU speed it defaults to
the current CPU speed, which could be basically any number between
0 and the true max. This is particularly bad on large, powerful
reserved instances which often idle at ~800 MHz while Nomad does
its fingerprinting (typically IO bound), which Nomad then uses as
the max, which results in severe loss of available resources.

Since the CPU specification is unavailable programmatically (at least
not without sudo) use a best-effort lookup table. This table was
generated by going through every instance type in AWS documentation
and copy-pasting the numbers.
https://aws.amazon.com/ec2/instance-types/

This approach obviously is not ideal as future instance types will
need to be added as they are introduced to AWS. However, using the
table should only be an improvement over the status quo since right
now Nomad miscalculates available CPU resources on all instance types.
2020-04-28 19:01:33 -06:00
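
The best-effort lookup described in commit 9230fa9eff can be sketched as a table with a fallback. The instance type names and clock speeds below are placeholders invented for illustration, not values from AWS documentation or from Nomad's table, and `cpuMHz` is a hypothetical function name.

```go
package main

import "fmt"

// maxMHzByInstanceType maps an instance type to its maximum clock speed.
// The entries here are placeholders, not real EC2 data.
var maxMHzByInstanceType = map[string]float64{
	"example.small": 2500,
	"example.large": 3100,
}

// cpuMHz prefers the table value and falls back to the speed detected
// on the host, which may be well below the true maximum when the CPU is idle.
func cpuMHz(instanceType string, detectedMHz float64) (mhz float64, fromTable bool) {
	if v, ok := maxMHzByInstanceType[instanceType]; ok {
		return v, true
	}
	return detectedMHz, false
}

func main() {
	// A known type idling at 800 MHz is fingerprinted at its table value.
	fmt.Println(cpuMHz("example.large", 800))
	// An unknown type keeps the previous behavior.
	fmt.Println(cpuMHz("example.unknown", 800))
}
```

Because unknown instance types fall through to the detected speed, a missing table entry leaves the earlier behavior unchanged instead of making it worse.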
Mahmood Ali
2a81c12465 Merge pull request #7827 from hashicorp/deps-go-msgpack-v1.1.5
Harmonize go-msgpack/codec/codecgen
2020-04-28 18:13:09 -04:00
Mahmood Ali
1fd22623cd Harmonize go-msgpack/codec/codecgen
Use v1.1.5 of go-msgpack/codec/codecgen, so go-msgpack codecgen matches
the library version.

We branched off earlier to pick up f51b518921, but apparently that's not
needed as we could customize the package via `-c` argument.
2020-04-28 17:12:31 -04:00
Tim Gross
407e02c723 e2e: add helper to Makefile for local file deployments (#7822) 2020-04-28 16:15:58 -04:00
Lang Martin
1bcb8f5afb command: deployment status without a prefix lists deployments (#7821) 2020-04-28 15:11:32 -04:00
Mahmood Ali
67c1b93c87 Merge pull request #7818 from greut/codegen
structs: give codecgen import
2020-04-28 12:16:41 -04:00
Buck Doyle
edff4cc78c UI: update exec styles to match conventions (#7811) 2020-04-28 08:33:07 -05:00
Chris Baker
f8a690ebab Merge pull request #7816 from hashicorp/b-7789-job-scaling-status-issues
fix issues in Job.ScaleStatus
2020-04-28 06:33:42 -05:00
Yoan Blanc
f778a5be55 structs: give codecgen import
Signed-off-by: Yoan Blanc <yoan@dosimple.ch>
2020-04-28 08:23:20 +02:00
Nick Ethier
31ddf77fdd nomad: build dynamic port for exposed checks if not specified (#7800) 2020-04-28 00:07:41 -04:00
Chris Baker
40e1db38e9 updated changelog 2020-04-27 21:46:56 +00:00
Chris Baker
d623b4bf96 modified Job.ScaleStatus to ignore deployments and look directly at the
allocations, ignoring canaries
2020-04-27 21:45:39 +00:00
Charlie Voiselle
738f3cb0ac Adding API homepage to sidebar. 2020-04-27 13:41:11 -04:00
Charlie Voiselle
943787a340 Merge pull request #7801 from hashicorp/d-fix-docker-credhelper-example
[docs] Update credential helper example in docker.mdx
2020-04-27 11:44:54 -04:00
Mahmood Ali
51b6ba8e70 Merge pull request #7809 from greut/typos
api: fix some documentation typos
2020-04-27 08:50:25 -04:00
Mahmood Ali
732ec7118a Merge pull request #7805 from hashicorp/vendor-go-metrics-v0.3.3
Vendor: update armon/go-metrics to v0.3.3
2020-04-27 08:49:50 -04:00
Yoan Blanc
26c1aebc69 api: fix some documentation typos
Signed-off-by: Yoan Blanc <yoan@dosimple.ch>
2020-04-27 10:25:29 +02:00
Mahmood Ali
69a9d3c507 Vendor: update armon/go-metrics to v0.3.3
To pick up a lock contention fix in prometheus sink:
https://github.com/armon/go-metrics/pull/107 .
2020-04-26 08:54:50 -04:00
Charlie Voiselle
74d91fa2b4 Update docker.mdx 2020-04-24 23:20:02 -04:00