Commit Graph

12662 Commits

Author SHA1 Message Date
Alex Dadgar
2be2650a75 Diff 2018-10-08 17:02:58 -07:00
Alex Dadgar
e47ddbdd9f parse devices 2018-10-08 16:09:41 -07:00
Alex Dadgar
8b9b77dd59 Define device request structs 2018-10-08 15:38:03 -07:00
Alex Dadgar
6e369c653c Merge pull request #4750 from hashicorp/f-allocated-resources
Split Resource struct
2018-10-08 14:50:40 -07:00
Chris Baker
430388b361 Merge pull request #4743 from hashicorp/doc-rpc-port-discussion
docs: make explicit the communication pattern on RPC port (4647)
2018-10-08 16:01:41 -04:00
Chris Baker
d9c31e8950 Merge pull request #4755 from hashicorp/f-update-go-version-to-1.11
Vagrant: Update go version to 1.11
2018-10-08 15:33:24 -04:00
Alex Dadgar
a027e5cbd4 Fix example drain API request 2018-10-08 10:06:39 -07:00
Alex Dadgar
7972221db9 nvidia package restructure + build non-linux 2018-10-05 13:56:04 -07:00
Chris Baker
0ce5a020c6 renamed vagrant script to accurately reflect non-privileged requirement 2018-10-05 10:07:05 -04:00
Chris Baker
bde95d81c3 vagrant: updated go_version to 1.11 in vagrant-linux go provisioning script 2018-10-04 19:06:35 -04:00
Chris Baker
dcf5c83405 vagrant: modified UI provisioning script to run as non-privileged 2018-10-04 19:06:35 -04:00
Omar Khawaja
3817ce3dd5 editing monitoring.html (#4754) 2018-10-04 18:40:13 -04:00
Omar Khawaja
d836fa06cf editing lb guide (#4753) 2018-10-04 18:26:51 -04:00
Alex Dadgar
343e06c60f Merge pull request #4638 from oleksii-shyman/nvidia-plugin
WIP :: Nvidia Plugin
2018-10-04 15:24:36 -07:00
Alex Dadgar
e30b20e65e renames 2018-10-04 14:57:25 -07:00
oleksii.shyman
a7e04f1520 Introduce nvidia-plugin reserve
- added reserve functionality that returns OCI compliant env variables
  specifying GPU IDs to be injected inside the container
2018-10-04 14:55:34 -07:00
Alex Dadgar
0f2f4797cb fixing tests 2018-10-04 14:26:19 -07:00
Omar Khawaja
b8c3a1d02d Monitoring and Alerting Guide with Prometheus [WIP] (#4706)
* add prometheus configuration guide

* fixing sub navigation issue

* Add detail to Next Steps

* add alerting component to guide

* update

* change docker image name and shorten job templates

* re-arrange to fix broken links
2018-10-04 17:15:10 -04:00
Omar Khawaja
dd3601979e Load Balancing with Fabio Guide (#4445)
* add load-balancing guide

* restructure load balancing section

* defining consul lb strategies inline and giving fabio its own bullet point

* update docker image name and shorten job template

* changing system scheduler link to relative link and moving load balancing navigation link to right above Web UI
2018-10-04 16:18:52 -04:00
oleksii.shyman
9c8c67e948 Introduce Nvidia-plugin stats
- created go-nvml wrapper for stats
- added stats feature to nvidia-plugin
2018-10-03 15:12:05 -07:00
oleksii.shyman
63f4fbf273 Introduce nvidia-plugin fingerprinting
- created go-nvml wrapper for fingerprinting
- added fingerprinting feature to nvidia-plugin
2018-10-03 15:11:56 -07:00
Alex Dadgar
49c2d4f775 Scheduler uses allocated resources 2018-10-02 17:08:25 -07:00
Chris Baker
738cf3ad62 docs: amended description per @dadgar suggestions in https://github.com/hashicorp/nomad/pull/4743 2018-10-02 13:02:56 -04:00
Chris Baker
4448d4951a docs: make explicit the communication pattern on RPC port (4647) 2018-10-02 12:19:37 -04:00
Alex Dadgar
018819e5cf allocated resources structs 2018-09-29 18:47:28 -07:00
Alex Dadgar
f969298854 Node reserved resources 2018-09-29 18:44:55 -07:00
Alex Dadgar
b310a54aa6 Node resources on client 2018-09-29 17:23:41 -07:00
Alex Dadgar
95d9286ad1 changelog 2018-09-26 14:53:15 -07:00
Alex Dadgar
025c5d4455 Merge pull request #4723 from hashicorp/b-autopilot-cli
Fix autopilot set enable custom upgrades flag
2018-09-25 13:53:52 -07:00
Alex Dadgar
9688161a54 Fix autopilot set enable custom upgrades flag 2018-09-25 13:49:35 -07:00
Alex Dadgar
589e67202b Merge pull request #4720 from hashicorp/b-jet-fixes
Series of scheduler fixes / debugging enhancements
2018-09-25 13:25:11 -07:00
Alex Dadgar
088f51a330 skip e2e/vault if integration isn't set 2018-09-25 11:29:09 -07:00
Alex Dadgar
bcb1a67015 Merge pull request #4712 from hashicorp/b-failed-trigger-reason
Add a missing eval trigger reason
2018-09-25 10:50:16 -07:00
Alex Dadgar
b3e85557f0 fix logging 2018-09-25 10:49:55 -07:00
Preetha Appan
47e22f6b7c Add failed follow up to the list of allowed eval trigger reasons
needs unit test
2018-09-25 10:49:55 -07:00
Preetha Appan
e9c7dc1286 Added logging around nacked evals in the scheduler worker 2018-09-25 10:49:02 -07:00
Alex Dadgar
454c1d0e84 Merge pull request #4717 from barda999/master
changed ${nomad.class} to ${node.class}
2018-09-24 16:51:27 -07:00
barda999
c09cb9f08d changed ${nomad.class} to ${node.class}
I guess that was an unintentional mistake
2018-09-24 16:48:06 -07:00
Alex Dadgar
086b1266c6 Merge pull request #4698 from hashicorp/t-vault-matrix
Vault test matrix
2018-09-24 16:34:35 -07:00
Alex Dadgar
668a90102f proper variable capture 2018-09-24 16:34:15 -07:00
Alex Dadgar
029a7f617e Merge pull request #4716 from hashicorp/f-no-reuse-triggerby
Unique TriggerBy for blocked evals
2018-09-24 16:08:31 -07:00
Alex Dadgar
302a6940af Merge branch 'b-plan' into b-jet-fixes 2018-09-24 16:07:29 -07:00
Alex Dadgar
b8ec297263 Merge pull request #4709 from hashicorp/b-deployments
Fix deployment watcher index usage
2018-09-24 16:05:02 -07:00
Alex Dadgar
ed53038e04 Unique TriggerBy for blocked evals
Give blocked evals a unique triggerby reason to make debugging a chain
of evaluations easier.
2018-09-24 14:47:49 -07:00
Alex Dadgar
4c40d62f68 test allocs fit 2018-09-24 13:59:01 -07:00
Alex Dadgar
06920ee46c Better comment on snapshotindex 2018-09-24 13:53:43 -07:00
Alex Dadgar
82889c432e Denormalize jobs in plan and ignore resources of terminal allocs
Denormalize jobs in AppendAllocs:
AppendAlloc was originally only ever called for inplace upgrades and new
allocations. Both these code paths would remove the job from the
allocation. Now we use this to also add fields such as FollowupEvalID
which did not normalize the job. This is only a performance enhancement.

Ignore terminal allocs:
Failed allocations are annotated with the followup Eval ID when one is
created to replace the failed allocation. However, in the plan applier,
when we check if allocations fit, these terminal allocations were not
filtered. This could result in the plan being rejected if the node would
be overcommitted if the terminal allocations' resources were considered.
2018-09-24 13:53:43 -07:00
Alex Dadgar
9d4ff89eaf Fix other instances of blocking queries 2018-09-24 13:52:39 -07:00
Preetha Appan
1a9c18f9df update changelog 2018-09-24 11:19:51 -05:00
Preetha
21f7198835 Merge pull request #4702 from hashicorp/b-non-voter-boostrap
Do not bootstrap with non voters
2018-09-24 11:14:36 -05:00