Chelsea Holland Komlo
c6cd78db59
only initialize docker clients if they are nil
2018-04-09 14:13:07 -04:00
Chelsea Holland Komlo
4c1c88a91c
refactoring simplification from code review
2018-04-09 10:34:17 -04:00
Chelsea Holland Komlo
d251199432
group similar functions; update comments
...
health check timeout should be 1 minute
2018-04-05 16:19:02 -04:00
Chelsea Holland Komlo
dee4fc4555
remove do once block when creating a new docker client
...
only set cached connections upon no error
2018-04-05 16:19:02 -04:00
Chelsea Holland Komlo
45d09d1ef9
use client with shorter timeouts for health checks
2018-04-05 16:19:02 -04:00
Chelsea Holland Komlo
9092439107
refactor docker clients method to be able to extend to creating new clients
2018-04-05 16:19:02 -04:00
Preetha
ff006877de
Merge pull request #4101 from hashicorp/b-rescheduling-edge-fixes
...
Fixes edge cases around timing/task finish time being set more than once
2018-04-04 16:18:21 -05:00
Preetha Appan
e81886d588
remove outdated commented out test code
2018-04-04 15:03:24 -05:00
Preetha Appan
8b6143f272
Remove old comment
2018-04-04 15:01:48 -05:00
Preetha Appan
7fa7655ebe
Moves setting finishedAt to the right place and adds two unit tests.
2018-04-04 14:38:15 -05:00
Preetha Appan
d8e975510a
Add comment
2018-04-03 19:49:03 -05:00
Preetha Appan
aa4a0cff50
Fixes edge cases around timing and task finish time being set more than once
2018-04-03 16:34:59 -05:00
Alex Dadgar
16ec4481e3
Improve Vault error handling
2018-04-03 14:29:22 -07:00
Alex Dadgar
1a66631eff
remove generated files
2018-03-30 16:52:49 -07:00
Alex Dadgar
702a3be41e
Generated files
2018-03-30 16:14:40 -07:00
Michael Schurter
2ee0426985
test: don't rely on alloc runner update count
...
We were incorrectly relying on the count of alloc updates in a number of
tests. Since alloc updates are async, their number is non-deterministic
and largely meaningless.
This should fix quite a few flaky tests in Travis and prevent future
mistaken assumptions in tests.
2018-03-30 09:34:33 -07:00
Michael Schurter
53a504c69c
Merge pull request #4069 from hashicorp/f-hashealth
...
add HasHealth helper for nil checks
2018-03-29 17:03:20 -07:00
Alex Dadgar
357a10bcf4
Always capture the finish time
2018-03-29 11:27:22 -07:00
Michael Schurter
d09b0b62ba
add HasHealth helper for nil checks
...
We performed the DeploymentStatus nil checks a couple different ways, so
hopefully this helper will consolidate them and make it more clear what
the code is doing.
2018-03-29 09:29:19 -07:00
Chelsea Komlo
00b358553d
Merge pull request #4065 from hashicorp/emit-node-event-on-first-health-change
...
Emit first node event after initialization on health status change
2018-03-29 11:23:25 -04:00
Chelsea Holland Komlo
aeb744d930
add clarifying comment
2018-03-29 10:58:39 -04:00
Michael Schurter
35f42b1fca
Merge pull request #4059 from hashicorp/b-drain-health-svc-only
...
only service allocs should have health watched
2018-03-28 16:49:22 -07:00
Michael Schurter
12dd17affe
only service allocs should have health watched
2018-03-28 16:20:11 -07:00
Chelsea Holland Komlo
dff03f6a91
emit first node event
2018-03-28 17:26:53 -04:00
Chelsea Komlo
b26031b90d
Merge pull request #4057 from hashicorp/specify-docker-msg
...
Specify docker name in driver health messages
2018-03-28 13:32:36 -04:00
Preetha
6f870b8bd7
Merge pull request #4052 from hashicorp/f-specify-total-memory
...
Allow to specify total memory on agent configuration
2018-03-28 12:28:41 -05:00
Chelsea Holland Komlo
cdfeac13a1
specify driver health messages
2018-03-28 11:35:21 -04:00
Preetha Appan
30d104251b
Code review feedback and unit test
2018-03-28 10:07:15 -05:00
Charlie Voiselle
33e57bf5a3
rkt: logging enhancements (#4044)
...
* Added extra debug logging; extended timeout; added jitter.
* small log changes
* increase timeout
* remove unnecessary uuid
2018-03-27 17:30:06 -07:00
Michael Schurter
079f425c32
client: always mark exited sys/svc allocs as failed
...
When restarts.attempts=0 was set in a jobspec a system or service alloc
that exited with 0 status would be marked as `completed` instead of
`failed`. Since system and service jobs are intended to run until
stopped or updated, they should always be marked as failed when they
exit even in cases where the exit code is 0.
2018-03-27 14:30:19 -07:00
Mildred Ki'Lya
d31105c69e
Allow to specify total memory on agent configuration
...
Allow to set the total memory of an agent in its configuration file. This
can be used in case the automatic detection doesn't work or in specific
environments when memory overcommit (using swap for example) can be
desirable.
2018-03-27 15:46:18 -05:00
Chelsea Holland Komlo
041786360e
use time.Time for node events for compatibility
2018-03-27 15:43:57 -04:00
Alex Dadgar
d10e155e0f
Fix alloc watcher snapshot streaming
2018-03-27 11:14:53 -07:00
Alex Dadgar
31b317b6ee
drop stats fetching log
2018-03-23 12:01:50 -07:00
Chelsea Komlo
9f74c6a378
Merge pull request #4030 from hashicorp/health-check-ux
...
UX improvements to driver health checks
2018-03-23 09:46:50 -04:00
Alex Dadgar
95a7e1a90a
Driver Info output
2018-03-22 17:18:32 -07:00
Chelsea Holland Komlo
bf3b7d8588
ux improvements to driver health checks
2018-03-22 18:38:29 -04:00
Michael Schurter
3a7a3f32d5
Merge pull request #4022 from hashicorp/f-more-executor-logging
...
executor: increase level for helpful log lines
2018-03-22 15:21:20 -07:00
Michael Schurter
b58a22c2e9
remove spurious TODOs and FIXMEs
2018-03-21 16:55:22 -07:00
Michael Schurter
50a94d73c9
test: try to prevent flakiness on travis
2018-03-21 16:51:45 -07:00
Michael Schurter
1537061ebc
alloc_runner: watch health for deployed batch jobs
2018-03-21 16:51:45 -07:00
Michael Schurter
3ca9cdfadc
client: don't monitor health of non-service jobs
...
Also fix system job draining; won't work without deadline fixes
2018-03-21 16:51:44 -07:00
Alex Dadgar
3fe3c6eff7
Improve DeadlineTime helper
2018-03-21 16:51:44 -07:00
Alex Dadgar
48d637dad1
RPC, FSM, State Store for marking DesiredTransition
...
fix build tag
2018-03-21 16:49:48 -07:00
Michael Schurter
91e8fd098f
mock_driver: improve Kill() logging
2018-03-21 16:49:48 -07:00
Michael Schurter
95b3b6eb02
drain: initial drainv2 structs and impl
2018-03-21 16:49:48 -07:00
Chelsea Holland Komlo
eb3a53efa2
always set initial health status for every driver
2018-03-21 15:15:26 -04:00
Chelsea Holland Komlo
127b2c6ef7
set driver to unhealthy once if it cannot be detected in periodic check
2018-03-21 15:15:26 -04:00
Alex Dadgar
b59bea98b0
Docker driver doesn't return errors but injects into the DriverInfo
2018-03-21 15:15:26 -04:00
Alex Dadgar
ffe9292e24
Only run health check if driver is detected
2018-03-21 15:15:26 -04:00