Michael Schurter
c1cf162d83
Merge pull request #3105 from hashicorp/f-876-restart-unhealthy
Restart unhealthy tasks
2017-09-17 19:38:32 -07:00
epipho
56d591b119
Fix incorrect docker stats
2017-09-16 00:43:03 -04:00
Michael Schurter
fa836d8ca5
Name const after what it represents
2017-09-15 14:57:18 -07:00
Michael Schurter
cde908e35d
Cleanup and test restart failure code
2017-09-15 14:54:37 -07:00
Michael Schurter
80147622c5
Add comments
2017-09-15 14:34:36 -07:00
Michael Schurter
a508bb9709
Fold SetFailure into SetRestartTriggered
2017-09-14 16:48:39 -07:00
Michael Schurter
10dc1c7900
DRY up restart handling a bit.
All 3 error/failure cases share restart logic, but 2 of them have
special cased conditions.
2017-09-14 16:48:39 -07:00
Michael Schurter
f8e872c855
RestartDelay isn't needed as checks are re-added on restarts
@dadgar made the excellent observation in #3105 that TaskRunner removes
and re-registers checks on restarts. This means checkWatcher doesn't
need to do *any* internal restart tracking. Individual checks can just
remove themselves and be re-added when the task restarts.
2017-09-14 16:48:39 -07:00
Michael Schurter
568b963270
Remove unused lastStart field
2017-09-14 16:47:41 -07:00
Michael Schurter
526528c7c9
Removed partially implemented allocLock
2017-09-14 16:47:41 -07:00
Michael Schurter
3db835cb8f
Improve check watcher logging and add tests
Also expose a mock Consul Agent to allow testing ServiceClient and
checkWatcher from TaskRunner without actually talking to a real Consul.
2017-09-14 16:47:41 -07:00
Michael Schurter
c2d895d47a
Add comments and move delay calc to TaskRunner
2017-09-14 16:46:54 -07:00
Michael Schurter
ebbf87f979
Use existing restart policy infrastructure
2017-09-14 16:46:54 -07:00
Michael Schurter
1608e59415
Add check watcher for restarting unhealthy tasks
2017-09-14 16:46:54 -07:00
Alex Dadgar
98c47c72d0
changelog and feedback
2017-09-14 14:08:58 -07:00
Alex Dadgar
f23ac5f083
Non-locked accessors to common Node fields
This PR removes locking around commonly accessed node attributes that do
not need to be locked. The locking could cause nodes to TTL as the
heartbeat code path was acquiring a lock that could be held for an
excessively long time. An example of this is when Vault is inaccessible,
since the fingerprint is run with a lock held but the Vault
fingerprinter makes the API calls with a large timeout.
Fixes https://github.com/hashicorp/nomad/issues/2689
2017-09-14 14:08:26 -07:00
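The locking change described above can be illustrated with a minimal sketch. It assumes a hypothetical `client` type (not Nomad's actual `Client`) in which the node ID and datacenter are set once at construction and never mutated, so they can be read without taking the mutex that guards the fingerprinted attributes. Which fields are really safe to read unlocked is an assumption made here for illustration only.

```go
package main

import (
	"fmt"
	"sync"
)

// client is a hypothetical stand-in used only for illustration.
type client struct {
	// Written once in newClient and never changed afterwards,
	// so concurrent reads need no lock.
	nodeID     string
	datacenter string

	// mu guards attributes, which fingerprinters may update at any time.
	mu         sync.RWMutex
	attributes map[string]string
}

func newClient(nodeID, datacenter string) *client {
	return &client{
		nodeID:     nodeID,
		datacenter: datacenter,
		attributes: make(map[string]string),
	}
}

// NodeID reads an immutable field and therefore never blocks on mu.
func (c *client) NodeID() string { return c.nodeID }

// Datacenter reads an immutable field and therefore never blocks on mu.
func (c *client) Datacenter() string { return c.datacenter }

// SetAttribute mutates shared state and must hold the write lock.
func (c *client) SetAttribute(key, value string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.attributes[key] = value
}

func main() {
	c := newClient("node-1", "dc1")

	// Simulate a slow fingerprinter that holds the lock for a long time.
	c.mu.Lock()
	// A heartbeat-style read still succeeds immediately.
	fmt.Println(c.NodeID(), c.Datacenter())
	c.mu.Unlock()
}
```

The design point is that a heartbeat path which only needs immutable identifiers never waits behind a long-running fingerprint that holds the lock.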
Chelsea Komlo
0fff99488f
Merge pull request #3191 from hashicorp/b-tagged-metrics-panic
Fix panic in emitting tagged allocation metrics
2017-09-11 14:28:50 -04:00
Armon Dadgar
c038c410b3
Merge pull request #3185 from hashicorp/f-acl-reset
Add ability to reset ACL bootstrap process
2017-09-11 10:47:17 -07:00
Armon Dadgar
3fc9dce13b
Address @dadgar feedback
2017-09-11 10:30:59 -07:00
Alex Dadgar
1bed4b41d9
Merge pull request #3187 from hashicorp/b-windows-docker
Fix MemorySwappiness on Windows Docker
2017-09-11 09:56:49 -07:00
Alex Dadgar
49c4189758
Merge pull request #3184 from hashicorp/b-docker-logging
Fix docker user specified syslogging
2017-09-11 09:31:33 -07:00
Chelsea Holland Komlo
1ecfb687bf
fix panic in emitting tagged metrics
2017-09-11 15:32:37 +00:00
Alex Dadgar
7148b65306
Fix MemorySwappiness on Windows Docker
Fixes https://github.com/hashicorp/nomad/issues/3181
2017-09-10 17:46:45 -07:00
Alex Dadgar
db261cd0c7
Fix invalid CPU stats on Windows
This PR fixes an issue introduced in Nomad 0.6.0 due to
https://github.com/shirou/gopsutil/issues/420 . The issue arose from the
fact that the Windows stats from gopsutil report CPU usage in
percentages where we expected ticks.
2017-09-10 15:30:48 -07:00
Alex Dadgar
9206105e69
Fix docker user specified syslogging
2017-09-10 14:57:48 -07:00
James Nugent
3a5082022d
client: Guard against "NaN" values from floats
This commit protects against finding `0.NaN` tokens in JSON streams
because of infinity representation on serialization.
2017-09-08 16:21:07 -05:00
Alex Dadgar
d8d4fb877b
Merge pull request #3148 from clinta/purge-stopped
Always purge stopped containers
2017-09-05 17:18:05 -07:00
Alex Dadgar
4cab1781f6
Fix repo name passed to docker credential helpers
This PR fixes the server url passed to docker credential helpers and
fixes stderr capture.
Fixes https://github.com/hashicorp/nomad/issues/2957
2017-09-05 16:43:21 -07:00
Alex Dadgar
b9f51ce61c
Parse Docker mounts correctly (#3163)
* Parse Docker mounts correctly
This PR fixes the parsing of Docker mounts and adds testing to ensure no
regressions.
Fixes https://github.com/hashicorp/nomad/issues/3156
* Review feedback
2017-09-05 14:02:57 -07:00
Chelsea Holland Komlo
68686cd69a
final code review fixups
2017-09-05 18:47:44 +00:00
Chelsea Holland Komlo
ef4aef0223
fix Travis test failure caused by a race condition
2017-09-05 15:04:59 +00:00
Chelsea Holland Komlo
681a3f337a
fixups from code review
2017-09-05 14:13:34 +00:00
Chelsea Holland Komlo
3c0710074c
labels depend on full setup of client beforehand
2017-09-05 14:13:34 +00:00
Chelsea Holland Komlo
fce72a1bc9
refactor to use baseLabels
2017-09-05 14:13:34 +00:00
Chelsea Holland Komlo
a6eeede7e2
pass in commonly used values
2017-09-05 14:13:34 +00:00
Chelsea Holland Komlo
50ab667799
create base labels to be used in every metric
2017-09-05 14:13:34 +00:00
Chelsea Holland Komlo
7a96f92290
emit metrics using labels, add option for backwards compatibility
2017-09-05 14:12:57 +00:00
Chelsea Holland Komlo
5efb4fb214
add metrics options to client config
2017-09-05 14:12:57 +00:00
Armon Dadgar
e9790c63b4
ACL RPCs allow stale reads for scalability
2017-09-04 13:07:44 -07:00
Armon Dadgar
33f640dc38
client: fixing policy resolution after ACL endpoint enforcement
2017-09-04 13:05:53 -07:00
Armon Dadgar
0fcf618dfc
Add ErrPermissionDenied, rename TokenNotFound
2017-09-04 13:05:53 -07:00
Armon Dadgar
bda7b36da3
Address @dadgar feedback
2017-09-04 13:05:53 -07:00
Armon Dadgar
5b43ea4bff
client: adding token resolution logic
2017-09-04 13:05:36 -07:00
Armon Dadgar
fb118b2dfb
client: adding token cache for ACL resolution
2017-09-04 13:05:36 -07:00
Armon Dadgar
1da443f29a
client: create ACL and Policy cache
2017-09-04 13:05:35 -07:00
Armon Dadgar
0dc6a1a9c7
agent: thread ACL config to client
2017-09-04 13:04:45 -07:00
Clint Armstrong
786c09f7e4
Always purge stopped containers
2017-08-31 14:28:48 -04:00
Clint Armstrong
cfffce07a0
fix logging re-init
2017-08-30 12:36:31 -04:00
Michael Schurter
7a84cbe02a
Squelch logspam when unable to get disk usage stats
To reproduce logspam:
```
$ docker plugin install --grant-all-permissions vieux/sshfs
$ nomad agent -dev
...
2017/08/25 17:09:03.282868 [WARN] client: error fetching host disk usage stats for /var/lib/docker/plugins/a8b4a69b07e5180f828d19e1e9e102ccc0e26f9c9939eaef85357260c30b20a7/rootfs/mnt/volumes: permission denied
... repeats every collection period ...
```
2017-08-28 12:04:32 -07:00
Alex Dadgar
ba1eecbf7f
Merge pull request #3073 from clinta/docker-500
Allow retry of 500 API errors to be handled by restart policies
2017-08-24 16:57:36 -07:00