Commit Graph

13206 Commits

Author SHA1 Message Date
Mahmood Ali
4f18086d35 Display device stats in nomad alloc status 2018-11-16 10:26:32 -05:00
Mahmood Ali
6208d53275 Prepare to reuse device resources printing 2018-11-16 10:26:32 -05:00
Mahmood Ali
58cbafe913 Populate alloc stats API with device stats
This change makes a few compromises:

* Looks up the devices associated with tasks at lookup time.  Given
that `nomad alloc status` is generally called rarely (compared to stats
telemetry and general job reporting), this seems fine.  However, the
lookup overhead is bounded by `tasks x total-host-devices`,
which can be significant.

* `client.Client` performs the task devices->statistics lookup.  It
passes self to alloc/task runners so they can look up the device statistics
allocated to them.
  * Currently alloc/task runners are responsible for constructing the
entire RPC response for stats
  * The alternatives for making task runners device statistics aware
don't seem appealing (e.g. having task runners contain reference to hostStats)

* For the alloc-level aggregated resource usage, I did a naive merge of task device statistics.
  * Personally, I question the value of such aggregation, compared to the
costs of struct duplication and bloating the response - but opted to be
consistent in the API.
  * With naive concatenation, device instances from a single device group that are used by separate tasks in the alloc are reported as two separate device group statistics.
2018-11-16 10:26:32 -05:00
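The last compromise in 58cbafe913 can be illustrated with a minimal sketch. The types below (`DeviceGroupStats`, `TaskUsage`) and the function `concatDeviceStats` are hypothetical simplifications, not the structs used in Nomad; the device names are borrowed from the mock device sample output elsewhere in this log. The sketch only shows why plain concatenation leaves one device group reported twice when two tasks use instances from it.

```go
package main

import "fmt"

// DeviceGroupStats is a hypothetical, simplified stand-in for the
// statistics of one device group (vendor/type/name).
type DeviceGroupStats struct {
	Vendor, Type, Name string
	// InstanceStats maps an instance ID to a usage figure (assumed: bytes).
	InstanceStats map[string]uint64
}

// TaskUsage is a hypothetical per-task usage record.
type TaskUsage struct {
	Task    string
	Devices []DeviceGroupStats
}

// concatDeviceStats performs the naive merge: it appends every task's
// device groups without combining groups that share the same identity.
func concatDeviceStats(tasks []TaskUsage) []DeviceGroupStats {
	var out []DeviceGroupStats
	for _, t := range tasks {
		out = append(out, t.Devices...)
	}
	return out
}

func main() {
	tasks := []TaskUsage{
		{Task: "web", Devices: []DeviceGroupStats{
			{Vendor: "nomad", Type: "file", Name: "mock",
				InstanceStats: map[string]uint64{"README.md": 511}},
		}},
		{Task: "worker", Devices: []DeviceGroupStats{
			{Vendor: "nomad", Type: "file", Name: "mock",
				InstanceStats: map[string]uint64{"e2e.go": 239}},
		}},
	}

	merged := concatDeviceStats(tasks)
	// Both tasks use the same group, yet it appears once per task.
	fmt.Println(len(merged)) // prints 2
}
```

Merging by group identity would avoid the duplicate entry, at the cost of extra bookkeeping; the commit message states that the naive approach was chosen.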
Preetha
fff2d6f4d8 Merge pull request #4882 from sportebois/f-update-docs-env-stanza-coercion
Change misleading boolean example in env stanza coercion section (#4820)
2018-11-16 08:00:56 -06:00
Michael Schurter
ca4e237d62 gofmt -s -w command/helper_stats_test.go
Fixes the static checks build
2018-11-15 14:14:05 -08:00
Danielle Tomlinson
60c6cb8586 Merge pull request #4875 from hashicorp/f-constraints
scheduler: Make != constraints more flexible
2018-11-15 11:04:21 -08:00
Mahmood Ali
6542d6de75 Merge pull request #4879 from hashicorp/f-node-devices-cli
Show device stats in nomad CLI
2018-11-15 14:02:22 -05:00
Danielle Tomlinson
5aac3047e0 Add changelog entry for != operator 2018-11-15 11:00:32 -08:00
Danielle Tomlinson
cf8bc3224d docs: Add is_set/is_not_set 2018-11-15 11:00:32 -08:00
Danielle Tomlinson
3d0a45f6e5 scheduler: Add is_set/is_not_set constraints
This adds constraints for asserting that a given attribute or value
exists, or does not exist. This acts as a companion to the = and !=
operators, e.g.:

```hcl
constraint {
        attribute = "${attrs.type}"
        operator  = "!="
        value     = "database"
}

constraint {
        attribute = "${attrs.type}"
        operator  = "is_set"
}
```
2018-11-15 11:00:32 -08:00
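How the two constraints in 3d0a45f6e5 combine can be sketched as follows. This is an illustrative checker, not Nomad's scheduler code: the `Constraint` type and the `check` and `feasible` functions are hypothetical, attribute interpolation such as `${attrs.type}` is reduced to a plain map key, and the sketch assumes that `!=` passes when the attribute is missing (the behavior described in 0925bfe618).

```go
package main

import "fmt"

// Constraint mirrors the three fields used in the HCL example
// (hypothetical type, for illustration only).
type Constraint struct {
	Attribute, Operator, Value string
}

// check evaluates a single constraint against a node's attributes.
func check(attrs map[string]string, c Constraint) bool {
	got, ok := attrs[c.Attribute]
	switch c.Operator {
	case "is_set":
		return ok
	case "is_not_set":
		return !ok
	case "=":
		return ok && got == c.Value
	case "!=":
		// Assumption: a missing attribute counts as "not equal".
		return !ok || got != c.Value
	}
	return false // unknown operators never match in this sketch
}

// feasible reports whether every constraint holds for the node.
func feasible(attrs map[string]string, cs []Constraint) bool {
	for _, c := range cs {
		if !check(attrs, c) {
			return false
		}
	}
	return true
}

func main() {
	cs := []Constraint{
		{Attribute: "type", Operator: "!=", Value: "database"},
		{Attribute: "type", Operator: "is_set"},
	}

	fmt.Println(feasible(map[string]string{"type": "web"}, cs))      // true
	fmt.Println(feasible(map[string]string{"type": "database"}, cs)) // false
	// The attribute is missing: != passes, but is_set rejects the node.
	fmt.Println(feasible(map[string]string{}, cs)) // false
}
```

Pairing `!=` with `is_set` therefore restricts placement to nodes that define the attribute and give it a different value.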
Sébastien Portebois
a71dea2ac9 Change misleading boolean example in env stanza coercion section 2018-11-15 13:12:58 -05:00
Mahmood Ali
5e2cdb48d4 Display StatsObject nested objects as well 2018-11-15 08:09:54 -05:00
Mahmood Ali
8274555d1c Use disk display format for devices 2018-11-14 22:13:23 -05:00
Mahmood Ali
85f5007043 Print verbose device in nomad node status -stats 2018-11-14 22:13:23 -05:00
Mahmood Ali
276149eaa7 device stats summary in node status
Sample output with a mock device:

```
Host Resource Utilization
CPU             Memory          Disk
2651/26400 MHz  9.6 GiB/16 GiB  98 GiB/234 GiB

Device Resource Utilization
nomad/file/mock[README.md]    511 bytes
nomad/file/mock[e2e.go]       239 bytes
nomad/file/mock[e2e_test.go]  128 bytes

Allocations
No allocations placed
```
2018-11-14 22:13:23 -05:00
Mahmood Ali
0ed1ab65b4 Merge pull request #4872 from hashicorp/f-devices-client-stats
Expose Device Stats in Clients API
2018-11-14 21:18:23 -05:00
Mahmood Ali
e102b3751f format vendor.json 2018-11-14 20:17:11 -05:00
Alex Dadgar
ab39343b91 Merge pull request #4874 from lunchbag/jen/update-share-img
Update open graph image
2018-11-14 14:36:44 -08:00
Mahmood Ali
c212716dda Add NodeResource Device types in api package 2018-11-14 14:42:36 -05:00
Mahmood Ali
04ecb5c72a Track Node Device attributes and serve them in API 2018-11-14 14:42:29 -05:00
Mahmood Ali
ba3fe15f7e Add Client Device Stats structs in api package 2018-11-14 14:41:19 -05:00
Mahmood Ali
5af9296bb4 Expose Device Stats in /client/stats API endpoint 2018-11-14 14:41:19 -05:00
Mahmood Ali
dd47c590f0 Allow nullable fields in StatValues
In stat values, we need to be able to distinguish between zero values
(e.g. `false`) and unset values (e.g. `nil`).

We could alternatively use protobuf `oneof` and a nested map to ensure
consistency of fields that are set together, but the Go
representation does not express that well, and it would introduce a mismatch
between representations.  Thus, I opted not to use it.
2018-11-14 14:41:19 -05:00
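The distinction drawn in dd47c590f0 between a zero value and an unset value is commonly expressed in Go with pointer fields. The sketch below assumes a simplified `StatValue` with only two nullable fields; the field names and the `describe` helper are illustrative and are not taken from the Nomad source.

```go
package main

import "fmt"

// StatValue is a simplified, hypothetical stat value. Pointer fields
// let a reader tell "set to the zero value" apart from "not set".
type StatValue struct {
	IntVal  *int64
	BoolVal *bool
	Unit    string
}

// describe renders whichever field is set, or "unset" if none is.
func describe(v StatValue) string {
	switch {
	case v.BoolVal != nil:
		return fmt.Sprintf("bool=%t", *v.BoolVal)
	case v.IntVal != nil:
		return fmt.Sprintf("int=%d %s", *v.IntVal, v.Unit)
	default:
		return "unset"
	}
}

func main() {
	off := false
	size := int64(511)

	fmt.Println(describe(StatValue{BoolVal: &off}))               // bool=false
	fmt.Println(describe(StatValue{IntVal: &size, Unit: "bytes"})) // int=511 bytes
	fmt.Println(describe(StatValue{}))                            // unset
}
```

With plain `bool` and `int64` fields, the first and third cases would be indistinguishable.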
Mahmood Ali
2f4c510cb7 Move Stat{Object|Value} to plugins/shared/structs
Moving them as they may be useful for other packages/plugins besides
devices.
2018-11-14 09:01:26 -05:00
Mahmood Ali
df694eb3be Regenerate proto files with protoc-gen-go@v1.2.0 2018-11-14 09:01:26 -05:00
Mahmood Ali
5697cbb183 fix comment typos 2018-11-14 08:36:14 -05:00
Omar Khawaja
3ad663be60 AWS sandbox environment upgrade (#4873)
* upgrade Nomad from 0.8.4 to 0.8.6

* update deprecated nomad and vault commands

* update AMI ID

* add ingress rule for default fabio port and fabio UI

* upgrade Consul and Vault versions

* update AMI ID in README.md and terraform.tfvars
2018-11-13 23:21:01 -05:00
Danielle Tomlinson
f16d96bdd8 Merge pull request #4869 from hashicorp/b-executor-stdout
executor: Fix stdout stderr copy/paste
2018-11-13 19:22:37 -08:00
Jen
14dab7a67f Update open graph image 2018-11-13 17:36:13 -05:00
Danielle Tomlinson
0925bfe618 scheduler: Allow comparisons of nil values
This commit allows the ConstraintChecker to test values that do not exist.
This is useful when you want to _exclude_ given nodes from executing a
job. For example, if you give canary nodes an attribute and do not want to
run critical services on them, you can specify something like the
below without having to tag all other nodes with the inverse.

```hcl
constraint {
  attribute = "${node.attr.canary}"
  operator = "!="
  value = "1"
}
```

This also requires all constraint checkers to allow for nil target
values, as they will no longer be short circuited by resolving a target.
2018-11-13 13:36:51 -08:00
Mahmood Ali
851b275afc Merge pull request #4858 from hashicorp/b-fix-master-20181109
Fix some tests in master
2018-11-13 16:08:26 -05:00
Alex Dadgar
895fdb79f1 Merge pull request #4867 from hashicorp/b-deployment-progress-deadline
Blocked evaluation fixes
2018-11-13 10:29:03 -08:00
Alex Dadgar
6d0cd01a6a Merge pull request #4868 from hashicorp/b-plugin-ctx
Plugin client's handle plugin dying
2018-11-13 10:26:53 -08:00
Mahmood Ali
d575e1df29 Ignore apt-get update failures in CI
We run with ~120 apt sources, and apt-get update fails if any of them is
down.

True errors would be raised again at the install phase, as fetching the
actual dependencies would fail.
2018-11-13 10:21:40 -05:00
Mahmood Ali
9d6a362b94 Use materialized duration fields for driver config 2018-11-13 10:21:40 -05:00
Mahmood Ali
5c906aa085 convert all config durations to strings in tests 2018-11-13 10:21:40 -05:00
Mahmood Ali
9254bdb92e Avoid downloading image if present locally 2018-11-13 10:21:40 -05:00
Mahmood Ali
179cdc6277 Address review comments 2018-11-13 10:21:40 -05:00
Mahmood Ali
5fe433efe7 avoid setting resource limit on rkt command
Was accidentally modified in 5b14d24bf4.
2018-11-13 10:21:40 -05:00
Mahmood Ali
572a1a205b fix plugin test 2018-11-13 10:21:40 -05:00
Mahmood Ali
7ca535aa90 increase timeout to 30 minutes
nomad/client tests take very long and sometimes exceed 15m:

In https://travis-ci.org/hashicorp/nomad/jobs/452990197 :

```
panic: test timed out after 15m0s

goroutine 4739 [running]:
testing.(*M).startAlarm.func1()
	/home/travis/.gimme/versions/go1.11.2.linux.amd64/src/testing/testing.go:1296 +0xfd
....
goroutine 4665 [select]:
github.com/hashicorp/nomad/vendor/google.golang.org/grpc.newClientStream.func5(0xc0003dd500, 0xc000420120, 0x2b3f86295588, 0xc000496810)
	/home/travis/gopath/src/github.com/hashicorp/nomad/vendor/google.golang.org/grpc/stream.go:287 +0xd7
created by github.com/hashicorp/nomad/vendor/google.golang.org/grpc.newClientStream
	/home/travis/gopath/src/github.com/hashicorp/nomad/vendor/google.golang.org/grpc/stream.go:286 +0x842
FAIL	github.com/hashicorp/nomad/client/driver	900.036s
```
2018-11-13 10:21:40 -05:00
Mahmood Ali
9933f4a45c Fix docker log fetching in tests
We no longer use syslog for tracking logs, so track them explicitly
here.
2018-11-13 10:21:40 -05:00
Mahmood Ali
9d8a71dc44 killing should be done with wait client
Incidentally changed in 5b14d24bf4
2018-11-13 10:21:40 -05:00
Mahmood Ali
6b8c6836a9 Prioritize checking consumer context cancellation
Tests expect the eventer to shut down immediately on context
cancellation, but Go does not guarantee priority when multiple
channels are ready in a select statement.
2018-11-13 10:21:40 -05:00
Mahmood Ali
f9295631c4 Set clean config for mock driver
The default job here contains some exec task config (for setting
command and args) that isn't used by the mock driver.  The alloc
runner now seems stricter about validating fields and errors on unexpected
fields.

Updating configs in tests so we can have an explicit task config
whenever driver is set explicitly.
2018-11-13 10:21:40 -05:00
Mahmood Ali
2357e886ce mark and skip failing consul tests 2018-11-13 10:21:40 -05:00
Mahmood Ali
73077e36fe Update Docker name parsing lookup
The `ParseNamed` function changed in e9f3f2cfee,
where it became `ParsedNormalizedName` with extra checks.
2018-11-13 10:21:40 -05:00
Mahmood Ali
8ccb80bcea pull alpine image needed for test
The test requires the image to be present locally, so importing it as
part of setup.
2018-11-13 10:21:40 -05:00
Mahmood Ali
7049b43471 Adjust streaming duration
This test expects 11 repeats of the same message emitted at intervals of
200ms, so we need more than 2 seconds to allow for sleep-time
variations and the like.  Raising it to 3s here, which should be
enough.
2018-11-13 10:21:40 -05:00
Mahmood Ali
416b5240f4 Handle time.Duration in mock
Mock driver config uses `time.Duration` fields but we initialize them
inconsistently, as time.Duration sometimes and as duration strings other
times.  Previously, `mapstructure` handled it and did the right thing.

This is no longer the case with MsgPack.  I could not find a good way to
bring back old behavior without too much complexity.  `MsgPack` extended
types weren't ideal here as we lose type information (e.g. int64 vs
string), and the input is a generic map and not a MsgPack serialization
of duration.

As such, I went with the simple solution of declaring the config field
as a duration string, and panicking if the test doesn't pass a valid
string.

I found this to cause the smallest change in tests, but we can
alternatively force all to be int64 instead.
2018-11-13 10:21:40 -05:00
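The approach described in 416b5240f4 (declare the field as a duration string and panic on invalid input) can be sketched with the standard library's `time.ParseDuration`. The `mockConfig` type, its `RunFor` field, and the `runFor` method are hypothetical names chosen for illustration and do not reproduce the mock driver's actual config.

```go
package main

import (
	"fmt"
	"time"
)

// mockConfig is a hypothetical test-only config whose duration is
// carried as a string such as "500ms" or "2s".
type mockConfig struct {
	RunFor string
}

// runFor parses the duration string. Because this is meant for test
// configs only, an invalid value is treated as a programming error
// and causes a panic.
func (c mockConfig) runFor() time.Duration {
	if c.RunFor == "" {
		return 0 // assumption: an empty string means "no duration set"
	}
	d, err := time.ParseDuration(c.RunFor)
	if err != nil {
		panic(fmt.Sprintf("invalid duration %q: %v", c.RunFor, err))
	}
	return d
}

func main() {
	cfg := mockConfig{RunFor: "500ms"}
	fmt.Println(cfg.runFor()) // prints 500ms
}
```

A test that previously passed a `time.Duration` value would instead pass its string form, for example the result of calling `String()` on the duration.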