Commit Graph

374 Commits

Author SHA1 Message Date
Michael Schurter
ec43315e13 Fix regression by returning error on unknown alloc 2017-11-01 15:16:38 -05:00
Michael Schurter
fb3a780b7a Trigger GCs after alloc changes
GC much more aggressively by triggering GCs when allocations become
terminal as well as after new allocations are added.
2017-11-01 15:16:38 -05:00
Michael Schurter
9c1e595e2e Fix GC'd alloc tracking
The Client.allocs map now contains all AllocRunners again, not just
un-GC'd AllocRunners. Client.allocs is only pruned when the server GCs
allocs.

Also stops logging "marked for GC" twice.
2017-11-01 15:16:38 -05:00
Alex Dadgar
05bb446323 Node access is done using locked Node copy
Fixes https://github.com/hashicorp/nomad/issues/3454

Reliably reproduced the data race before by having a fingerprinter
change the nodes attributes every millisecond and syncing at the same
rate. With fix, did not ever panic.
2017-10-27 13:27:24 -07:00
Michael Schurter
0d535aea95 base64 migrate token
HTTP header values must be ASCII.

Also constant time compare tokens and test the generate and compare
helper functions.
2017-10-13 10:59:13 -07:00
Chelsea Holland Komlo
76b2c50dbc fix up build warnings 2017-10-11 17:11:57 -07:00
Chelsea Holland Komlo
2368068355 fixing up code review comments 2017-10-11 17:09:20 -07:00
Chelsea Holland Komlo
fba1653057 Add functionality for authenticated volumes 2017-10-11 17:09:20 -07:00
Michael Schurter
04b8f8e7fc Remove structs import from api
Goes a step further and removes structs import from api's tests as well
by moving GenerateUUID to its own package.
2017-09-29 10:36:08 -07:00
Alex Dadgar
a9e3a41407 Enable more linters 2017-09-26 15:26:33 -07:00
Chelsea Holland Komlo
8943a29428 Move setGaugeForAllocationStats to emitClientMetrics 2017-09-25 16:05:49 +00:00
Alex Dadgar
98c47c72d0 changelog and feedback 2017-09-14 14:08:58 -07:00
Alex Dadgar
f23ac5f083 Non-locked accessors to common Node fields
This PR removes locking around commonly accessed node attributes that do
not need to be locked. The locking could cause nodes to TTL as the
heartbeat code path was acquiring a lock that could be held for an
excessively long time. An example of this is when Vault is inaccessible,
since the fingerprint is run with a lock held but the Vault
fingerprinter makes the API calls with a large timeout.

Fixes https://github.com/hashicorp/nomad/issues/2689
2017-09-14 14:08:26 -07:00
Chelsea Holland Komlo
1ecfb687bf fix panic in emitting tagged metrics 2017-09-11 15:32:37 +00:00
Chelsea Holland Komlo
68686cd69a final code review fixups 2017-09-05 18:47:44 +00:00
Chelsea Holland Komlo
681a3f337a fixups from code review 2017-09-05 14:13:34 +00:00
Chelsea Holland Komlo
3c0710074c labels depend on full setup of client beforehand 2017-09-05 14:13:34 +00:00
Chelsea Holland Komlo
fce72a1bc9 refactor to use baseLabels 2017-09-05 14:13:34 +00:00
Chelsea Holland Komlo
a6eeede7e2 pass in commonly used values 2017-09-05 14:13:34 +00:00
Chelsea Holland Komlo
50ab667799 create base labels to be used in every metric 2017-09-05 14:13:34 +00:00
Chelsea Holland Komlo
7a96f92290 emit metrics using labels, add option for backwards compatibility 2017-09-05 14:12:57 +00:00
Armon Dadgar
bda7b36da3 Address @dadgar feedback 2017-09-04 13:05:53 -07:00
Armon Dadgar
fb118b2dfb client: adding token cache for ACL resolution 2017-09-04 13:05:36 -07:00
Armon Dadgar
1da443f29a client: create ACL and Policy cache 2017-09-04 13:05:35 -07:00
Michael Schurter
85b9dd9cce Move migrating state into prevAllocWatcher 2017-08-14 16:02:28 -07:00
Michael Schurter
8c1811911e switch from alloc blocker to new interface
interface has 3 implementations:

1. local for blocking and moving data locally
2. remote for blocking and moving data from another node
3. noop for allocs that don't need to block
2017-08-11 16:21:35 -07:00
Michael Schurter
0f584a0143 initial attempt at refactoring blocked/migrating 2017-08-11 16:21:35 -07:00
Alex Dadgar
da82a6e814 initial watcher 2017-07-07 12:07:08 -07:00
Michael Schurter
2b97f61ac0 Consistently quote alloc ids in client logs 2017-07-06 10:24:52 -07:00
Michael Schurter
4794de99fd Tiny client race condition fix
Plus some logging improvements that may help with #2563
2017-07-05 16:15:19 -07:00
Michael Schurter
e71673e24b Suggest wiping out alloc dir too 2017-07-03 12:29:21 -07:00
Michael Schurter
6b9af8fcc3 Add more logging to restore state errors 2017-07-03 11:58:41 -07:00
Mark Mickan
e418b3e468 Add tests for migrating symlinks in alloc and local directories 2017-06-04 15:56:22 +09:30
Mark Mickan
9e984f429c Include symlinks in snapshots when migrating disks
Fixes #2685
2017-06-04 00:36:18 +09:30
Alex Dadgar
b5fbb1f4cf Fix deadlock 2017-05-31 14:05:47 -07:00
Michael Schurter
236ef21489 Merge pull request #2636 from hashicorp/f-gc-alloc-limit
Add new gc_max_allocs tuneable
2017-05-30 16:14:09 -07:00
Michael Schurter
aac319cd16 Merge pull request #2654 from hashicorp/f-env-consul
Add envconsul-like support and refactor environment handling
2017-05-30 14:40:14 -07:00
Alex Dadgar
f3218378cd Fix perms to just set exec bit 2017-05-25 14:44:13 -07:00
Michael Schurter
a96fb5dbb0 Move task env into execcontext
Also inject PATH into rkt commands since we're no longer appending host
env vars for it.
2017-05-23 13:53:34 -07:00
Michael Schurter
fb72f20bb1 gc_max_allocs should include blocked & migrating 2017-05-12 16:03:22 -07:00
Michael Schurter
cc11d9a563 Add new gc_max_allocs tuneable
More than gc_max_allocs may be running on a node, but terminal allocs
will be garbage collected to try to keep the total number below the
limit.
2017-05-11 17:18:02 -07:00
Alex Dadgar
7fb1b37e09 Fix vet errors 2017-05-11 13:08:08 -07:00
Alex Dadgar
3f1ccf7278 Respond to comments 2017-05-09 10:50:24 -07:00
Alex Dadgar
e22393aeb8 Restore state + upgrade path 2017-05-02 18:21:49 -07:00
Alex Dadgar
85a81f47de Revert "metrics"
This reverts commit 4d6a012c6f.
2017-05-02 09:28:11 -07:00
Alex Dadgar
1d20b11297 Async and sync saving of client state 2017-05-01 16:16:53 -07:00
Alex Dadgar
7dee8ae534 perf 2017-05-01 16:01:50 -07:00
Alex Dadgar
4d6a012c6f metrics 2017-05-01 14:51:27 -07:00
Alex Dadgar
7614feddbd boltDB database for client state 2017-05-01 14:50:34 -07:00
Michael Schurter
10cb924b2c Refactor Consul Syncer into new ServiceClient
Fixes #2478 #2474 #1995 #2294

The new client only handles agent and task service advertisement. Server
discovery is mostly unchanged.

The Nomad client agent now handles all Consul operations instead of the
executor handling task related operations. When upgrading from an
earlier version of Nomad existing executors will be told to deregister
from Consul so that the Nomad agent can re-register the task's services
and checks.

Drivers - other than qemu - now support an Exec method for executing
abritrary commands in a task's environment. This is used to implement
script checks.

Interfaces are used extensively to avoid interacting with Consul in
tests that don't assert any Consul related behavior.
2017-04-19 12:42:47 -07:00