17972 Commits

Author SHA1 Message Date
Seth Hoenig
9230fa9eff env_aws: use best-effort lookup table for CPU performance in EC2
Fixes #7681

The current behavior of the CPU fingerprinter in AWS is that it
reads the **current** speed from `/proc/cpuinfo` (`CPU MHz` field).

This is because the max CPU frequency is not available by reading
anything on the EC2 instance itself. Normally on Linux one would
look at e.g. `/sys/devices/system/cpu/cpuN/cpufreq/cpuinfo_max_freq`
or perhaps parse the values from the `CPU max MHz` field in
`/proc/cpuinfo`, but those values are not available.

Furthermore, no metadata about the CPU is made available in the
EC2 metadata service.
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-categories.html

Since `go-psutil` cannot determine the max CPU speed it defaults to
the current CPU speed, which could be basically any number between
0 and the true max. This is particularly bad on large, powerful
reserved instances, which often idle at ~800 MHz while Nomad does
its fingerprinting (typically IO bound). Nomad then treats that idle
speed as the max, which results in a severe loss of available resources.

Since the CPU specification is unavailable programmatically (at least
not without sudo), use a best-effort lookup table. This table was
generated by going through every instance type in AWS documentation
and copy-pasting the numbers.
https://aws.amazon.com/ec2/instance-types/

This approach obviously is not ideal as future instance types will
need to be added as they are introduced to AWS. However, using the
table should only be an improvement over the status quo since right
now Nomad miscalculates available CPU resources on all instance types.
2020-04-28 19:01:33 -06:00
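
As a rough illustration of the lookup-table approach described in 9230fa9eff above, the sketch below prefers a table entry and falls back to the currently observed speed when the instance type is unknown. The names `ec2MaxMHz` and `maxCPUMHz`, the instance type keys, and the MHz values are all hypothetical placeholders, not the table or API from the actual change.

```go
package main

// ec2MaxMHz is a hypothetical best-effort table mapping an EC2 instance
// type to its advertised maximum CPU frequency in MHz. The keys and values
// are placeholders for illustration, not real AWS specifications.
var ec2MaxMHz = map[string]float64{
	"example.large":  2500,
	"example.xlarge": 3000,
}

// maxCPUMHz returns the table value for instanceType when one exists.
// Otherwise it falls back to the currently observed speed, which is the
// pre-existing behavior the commit message describes.
func maxCPUMHz(instanceType string, currentMHz float64) (mhz float64, fromTable bool) {
	if v, ok := ec2MaxMHz[instanceType]; ok {
		return v, true
	}
	return currentMHz, false
}
```

Returning `fromTable` lets a caller log when an instance type is missing from the table, which is the maintenance cost the commit message acknowledges.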
Mahmood Ali
2a81c12465 Merge pull request #7827 from hashicorp/deps-go-msgpack-v1.1.5
Harmonize go-msgpack/codec/codecgen
2020-04-28 18:13:09 -04:00
Mahmood Ali
1fd22623cd Harmonize go-msgpack/codec/codecgen
Use v1.1.5 of go-msgpack/codec/codecgen, so go-msgpack codecgen matches
the library version.

We branched off earlier to pick up f51b518921, but apparently that's not
needed as we could customize the package via `-c` argument.
2020-04-28 17:12:31 -04:00
Tim Gross
407e02c723 e2e: add helper to Makefile for local file deployments (#7822) 2020-04-28 16:15:58 -04:00
Lang Martin
1bcb8f5afb command: deployment status without a prefix lists deployments (#7821) 2020-04-28 15:11:32 -04:00
Mahmood Ali
67c1b93c87 Merge pull request #7818 from greut/codegen
structs: give codecgen import
2020-04-28 12:16:41 -04:00
Buck Doyle
edff4cc78c UI: update exec styles to match conventions (#7811) 2020-04-28 08:33:07 -05:00
Chris Baker
f8a690ebab Merge pull request #7816 from hashicorp/b-7789-job-scaling-status-issues
fix issues in Job.ScaleStatus
2020-04-28 06:33:42 -05:00
Yoan Blanc
f778a5be55 structs: give codecgen import
Signed-off-by: Yoan Blanc <yoan@dosimple.ch>
2020-04-28 08:23:20 +02:00
Nick Ethier
31ddf77fdd nomad: build dynamic port for exposed checks if not specified (#7800) 2020-04-28 00:07:41 -04:00
Chris Baker
40e1db38e9 updated changelog 2020-04-27 21:46:56 +00:00
Chris Baker
d623b4bf96 modified Job.ScaleStatus to ignore deployments and look directly at the
allocations, ignoring canaries
2020-04-27 21:45:39 +00:00
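
A minimal sketch of the idea in d623b4bf96 above: count allocations per task group directly and skip canaries. The `alloc` type, its fields, and the choice to count only running allocations are assumptions made for illustration. They are not Nomad's actual structs or the exact rule the change applies.

```go
package main

// alloc is a hypothetical, simplified stand-in for an allocation record.
type alloc struct {
	TaskGroup string
	Running   bool
	Canary    bool
}

// runningByGroup counts running, non-canary allocations per task group.
// It looks only at the allocations and never consults deployment state.
func runningByGroup(allocs []alloc) map[string]int {
	counts := make(map[string]int)
	for _, a := range allocs {
		if a.Canary || !a.Running {
			continue // canaries and non-running allocs are ignored
		}
		counts[a.TaskGroup]++
	}
	return counts
}
```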
Charlie Voiselle
738f3cb0ac Adding API homepage to sidebar. 2020-04-27 13:41:11 -04:00
Charlie Voiselle
943787a340 Merge pull request #7801 from hashicorp/d-fix-docker-credhelper-example
[docs] Update credential helper example in docker.mdx
2020-04-27 11:44:54 -04:00
Mahmood Ali
51b6ba8e70 Merge pull request #7809 from greut/typos
api: fix some documentation typos
2020-04-27 08:50:25 -04:00
Mahmood Ali
732ec7118a Merge pull request #7805 from hashicorp/vendor-go-metrics-v0.3.3
Vendor: update armon/go-metrics to v0.3.3
2020-04-27 08:49:50 -04:00
Yoan Blanc
26c1aebc69 api: fix some documentation typos
Signed-off-by: Yoan Blanc <yoan@dosimple.ch>
2020-04-27 10:25:29 +02:00
Mahmood Ali
69a9d3c507 Vendor: update armon/go-metrics to v0.3.3
To pick up a lock contention fix in prometheus sink:
https://github.com/armon/go-metrics/pull/107 .
2020-04-26 08:54:50 -04:00
Charlie Voiselle
74d91fa2b4 Update docker.mdx 2020-04-24 23:20:02 -04:00
Charlie Voiselle
838b16d43d Merge pull request #7792 from angrycub/f-disable_dangling_container_gc
Disable dangling container GC for demo
2020-04-24 23:12:16 -04:00
Seth Hoenig
e958e4172d Merge pull request #7784 from hashicorp/demo-grpc-checks
demo: create a demo service for grpc healthchecks
2020-04-24 11:35:58 -06:00
Seth Hoenig
47dceed339 demo: create a demo service for grpc healthchecks
Examples for HTTP based task-group service healthchecks are
covered by the `countdash` demo, but gRPC checks currently
have no runnable examples.

This PR adds a trivial gRPC enabled application that provides
a Service implementing the standard gRPC healthcheck interface.
2020-04-24 10:59:50 -06:00
Tim Gross
8af65c5707 ci: add a linting check for HCL files (#7791)
Running `make dev` runs `hclfmt`, but this isn't checked as part of
CI. That makes it possible to merge un-formatted HCL and Nomad
jobspecs that later will make for dirty git staging areas when
developers pull master.

This changeset adds HCL linting to the `make check` target.
2020-04-23 14:32:44 -04:00
Charlie Voiselle
1de614c3cf Disable dangling container GC for demo 2020-04-23 11:51:03 -04:00
Tim Gross
22d4b88b69 csi: checkpoint volume claim garbage collection (#7782)
Adds a `CSIVolumeClaim` type to be tracked as current and past claims
on a volume. Allows for a client RPC failure during node or controller
detachment without having to keep the allocation around after the
first garbage collection eval.

This changeset lays groundwork for moving the actual detachment RPCs
into a volume watching loop outside the GC eval.
2020-04-23 11:06:23 -04:00
Tim Gross
34187582dc website: fix path for spellchecking and correct errors (#7790) 2020-04-23 10:38:08 -04:00
Chris Baker
3a5cfdac58 Merge pull request #7788 from hashicorp/b-7716-scaling-policy-parsing
parsing should error if scaling block includes multiple policy blocks
2020-04-23 08:57:31 -05:00
Chris Baker
1e52f77b4a changelog entries for 7772 and 7788 2020-04-23 12:45:52 +00:00
Chris Baker
392ec3e697 return parsing error if scaling policy includes more than one policy block
also, check that parsing a minimal scaling block doesn't throw any errors
2020-04-23 12:37:45 +00:00
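
The check described in 392ec3e697 above could be sketched as follows. This assumes a hypothetical `scalingBlock` type that has already collected its nested policy blocks. It is not the jobspec parser's real representation.

```go
package main

import "fmt"

// scalingBlock is a hypothetical parsed form of a scaling block, holding
// each nested policy block as a generic map.
type scalingBlock struct {
	Policies []map[string]interface{}
}

// validateScalingBlock rejects a scaling block with more than one policy
// block. A minimal block with no policy block at all is accepted.
func validateScalingBlock(b scalingBlock) error {
	if n := len(b.Policies); n > 1 {
		return fmt.Errorf("scaling block may contain at most one policy block, found %d", n)
	}
	return nil
}
```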
Michael Lange
1b446bd3ef Merge pull request #7689 from hashicorp/ui/plumb-proxy-config-to-proxy
UI Plumb proxy config to proxy
2020-04-22 19:31:27 -07:00
Mahmood Ali
5973285cf0 Merge pull request #7785 from hashicorp/b-http-fail-log-level
http: adjust log level for request failure
2020-04-22 17:03:11 -04:00
Mahmood Ali
41bec868a8 http: adjust log level for request failure
Failed requests due to API client errors are to be marked as DEBUG.

The Error log level should be reserved for problems with the
cluster that are actionable for nomad system operators.  Logs due to
misbehaving API clients don't represent a system level problem and seem
spurious to nomad maintainers at best.  These log messages can also be
attack vectors for denial of service attacks by filling servers' disk
space with spurious log messages.
2020-04-22 16:19:59 -04:00
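
One way to picture the policy in 41bec868a8 above is a small function that picks a log level from the HTTP status code. This sketch assumes that "API client errors" correspond to 4xx responses. The function name and the string levels are hypothetical and do not reflect the actual handler code.

```go
package main

import "net/http"

// failureLogLevel returns the level at which a failed request would be
// logged under the assumed policy: server-side failures (5xx) at "error",
// client mistakes (4xx) at "debug". It returns "" for non-failure codes.
func failureLogLevel(statusCode int) string {
	switch {
	case statusCode >= http.StatusInternalServerError:
		return "error"
	case statusCode >= http.StatusBadRequest:
		return "debug"
	default:
		return ""
	}
}
```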
Mahmood Ali
cb0e58d7c2 Merge pull request #7780 from hashicorp/pre-0.11.2-dev-cycle
Pre 0.11.2 dev cycle
2020-04-22 15:28:34 -04:00
Mahmood Ali
4c151e0b4d prep for 0.11.2 dev cycle 2020-04-22 12:51:49 -04:00
Mahmood Ali
0e37d62fa4 prepare for 0.11.1 and reorder changelog 2020-04-22 12:50:29 -04:00
Mahmood Ali
699bc260ad Merge pull request #7779 from hashicorp/docs-website-0.11.1
update website for nomad 0.11.1
2020-04-22 12:15:28 -04:00
Mahmood Ali
be7908d4a1 update release to 0.11.1 2020-04-22 12:13:58 -04:00
Buck Doyle
77c7ed3de4 UI: Update ember-fetch to 6.7.2 (#7713)
This gets rid of this warning in the console:
Browserslist: caniuse-lite is outdated. Please run next command `yarn upgrade`
2020-04-22 09:10:55 -05:00
Chris Baker
27af1f2cd2 Merge pull request #7772 from hashicorp/b-7768-remove-policies-for-stopped-jobs
delete/create autoscaling policies as job is stopped/started
2020-04-22 08:15:55 -05:00
Buck Doyle
9630516b4f Remove superseded note
This closes #7465.
2020-04-21 19:52:45 -07:00
Michael Lange
1e4a3cf9a1 Disable the proxy when Mirage is enabled
This is to prevent max socket connection errors that can stop
the live reload server from responding.
2020-04-21 19:52:44 -07:00
Michael Lange
78f8dac531 Use existing ember proxy config within our custom proxy 2020-04-21 19:52:43 -07:00
Michael Lange
8808b2b475 Merge pull request #7685 from hashicorp/ui/upgrade-lint-staged
UI: Upgrade lint-staged and husky
2020-04-21 17:42:12 -07:00
Chris Baker
6f5610c9c4 modify state store so that autoscaling policies are deleted from their
table as job is stopped (and recreated when job is started)
2020-04-21 23:01:26 +00:00
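
The behavior described in 6f5610c9c4 above can be sketched with a toy in-memory store. The `policyStore` type and `upsertJob` method are hypothetical and much simpler than a real state store table. They only illustrate removing policies when a job is stopped and recreating them when it starts again.

```go
package main

// policyStore is a hypothetical in-memory table of autoscaling policy IDs
// keyed by job ID.
type policyStore struct {
	policies map[string][]string
}

func newPolicyStore() *policyStore {
	return &policyStore{policies: make(map[string][]string)}
}

// upsertJob records a job write. A stopped job has its policies deleted
// from the table. A running job has them (re)created from policyIDs.
func (s *policyStore) upsertJob(jobID string, stopped bool, policyIDs []string) {
	if stopped {
		delete(s.policies, jobID)
		return
	}
	// Copy the slice so later changes by the caller do not leak in.
	s.policies[jobID] = append([]string(nil), policyIDs...)
}
```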
Tim Gross
25944297bc changelog entries for 0.11.1 bugfixes (#7763) 2020-04-21 10:04:13 -04:00
Mahmood Ali
15b7474850 Merge pull request #7762 from hashicorp/b-in-place-update-deviceids
Preserve device IDs in in-place alloc updates
2020-04-21 09:31:10 -04:00
Mahmood Ali
dacb634489 add changelog
[ci skip]
2020-04-21 09:27:40 -04:00
Mahmood Ali
04000c9ba9 Ensure that alloc updates preserve device offers
When an alloc is updated in-place, ensure that the allocated devices are
preserved and carried over to the new alloc.
2020-04-21 08:57:15 -04:00
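
A minimal sketch of the fix described in 04000c9ba9 above, assuming a hypothetical `allocation` type with a flat list of device IDs. Nomad's real allocation structs are more involved, so treat this only as an illustration of carrying devices over during an in-place update.

```go
package main

// allocation is a hypothetical, simplified allocation record.
type allocation struct {
	ID        string
	DeviceIDs []string
}

// updateInPlace returns the updated allocation with the device IDs of the
// existing allocation carried over, so an in-place update does not drop
// devices that were already assigned.
func updateInPlace(existing, updated allocation) allocation {
	// Copy rather than alias, so the two records stay independent.
	updated.DeviceIDs = append([]string(nil), existing.DeviceIDs...)
	return updated
}
```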
Mahmood Ali
53dac68400 test for allocated devices on job in-place update
When an alloc is updated in-place, test that the allocated devices are
preserved in new alloc struct.
2020-04-21 08:56:05 -04:00
Buck Doyle
dd2e387074 Docs: correct search API (#7756)
This closes #7718. It corrects some inaccuracies and adds
an explanation of the truncations block.
2020-04-21 07:33:24 -05:00