Commit Graph

1230 Commits

Author SHA1 Message Date
Alex Dadgar
c19cd2e5cf loader and singleton 2019-01-22 15:11:57 -08:00
Alex Dadgar
b9f36134dc move catalog + grpcutils 2019-01-22 15:11:57 -08:00
Mahmood Ali
a5d60fd31c api: remove MockJob from exported functions
`api.MockJob` is a test utility that's only used by the `command/agent`
package. This moves it to that package and removes it from the public
API.
2019-01-18 14:51:31 -05:00
Michael Schurter
3a491ab2b4 Merge pull request #5187 from hashicorp/test-consul
Port a bunch of pre-0.9 Consul tests to 0.9
2019-01-15 07:41:50 -08:00
Alex Dadgar
109c5ef650 Merge pull request #5173 from hashicorp/b-log-levels
Plugins use parent loggers
2019-01-14 16:14:30 -08:00
Michael Schurter
ceee126241 Remove old comment; it's been fixed! 2019-01-14 09:56:53 -08:00
Preetha Appan
72dead7448 Refactor statedb factory config to set it directly in client config 2019-01-12 10:38:20 -06:00
Preetha Appan
80919bf713 Modified destroy failure handling to rely on allocrunner's destroy method
Added a unit test with a custom statedb implementation that errors,
used to verify destroy errors
2019-01-12 10:37:12 -06:00
Alex Dadgar
f171a723cb Enable json logs 2019-01-11 11:36:37 -08:00
Preetha Appan
7cdaf6e37d Make spread weight a pointer with default value if unset 2019-01-11 10:31:21 -06:00
Chris Baker
b12e24ec99 Merge branch 'master' of github.com:hashicorp/nomad into f-1157-validate-node-meta-variables 2019-01-09 18:56:49 +00:00
Chris Baker
8a7c09aaab increased config validation coverage for dev mode 2019-01-09 18:56:40 +00:00
Chris Baker
1dc25090a5 move if dev check into config validation, to support dev-mode
validation in the future
2019-01-08 22:21:48 +00:00
Chris Baker
cdcc1db700 refactored config validation into a new method, modified Meta.Client
tests appropriately
2019-01-08 15:07:36 +00:00
Mahmood Ali
c0162fab35 move cstructs.DeviceNetwork to drivers pkg 2019-01-08 09:11:47 -05:00
Chris Baker
64f591875d moved interp key regex out to a helper function 2019-01-08 00:11:47 +00:00
Chris Baker
baa16f72fc gofmt to make check happy 2019-01-07 18:01:59 +00:00
Chris Baker
31f82895a9 added validation on client metadata keys 2019-01-07 17:16:38 +00:00
Nick Ethier
39ca1b00dd client/drivermanager: add driver manager
The driver manager is modeled after the device manager and is started by the client.
It's responsible for handling driver lifecycle and reattachment state, as well as
processing the incoming fingerprint and task events from each driver. The manager
exposes a method for registering event handlers for task events that is used by the
task runner to update the server when a task has been updated with an event.

Since driver fingerprinting has been implemented by the driver manager, it is no
longer needed in the fingerprint manager and has been removed.
2018-12-18 22:55:18 -05:00
Alex Dadgar
ed4f8eac6e Add plugin API versioning to plugin loader and plugins 2018-12-18 16:48:00 -08:00
Alex Dadgar
aa59ea6ac7 fix iops bug and increase test matrix coverage 2018-12-11 15:28:21 -08:00
Mahmood Ali
51707199a6 Merge pull request #4975 from hashicorp/fix-master-20181209
Some test fixes and remedies
2018-12-11 18:00:21 -05:00
Alex Dadgar
f42c060d35 Merge pull request #4970 from hashicorp/f-no-iops
Deprecate IOPS
2018-12-11 12:51:22 -08:00
Mahmood Ali
06a4b4add2 tests: prevent indefinite blocking in some tests
Noticed a few places where tests seem to block indefinitely and panic
after the test run reaches the test package timeout.

I intend to follow up with the proper fix later, but timing out is much
better than indefinitely blocking.
2018-12-11 09:35:26 -05:00
Alex Dadgar
f555dc3f67 Warn if IOPS is being used 2018-12-06 16:17:09 -08:00
Alex Dadgar
0953d913ed Deprecate IOPS
IOPS has been modelled as a resource since Nomad 0.1 but has never
actually been detected and there is no plan in the short term to add
detection. This is because IOPS is a bit simplistic of a unit to define
the performance requirements from the underlying storage system. In its
current state it adds unnecessary confusion and can be removed without
impacting any users. This PR leaves IOPS defined at the jobspec parsing
level and in the api/ resources since these are the two public uses of
the field. These should be considered deprecated and only exist to allow
users to stop using them during the Nomad 0.9.x release. In the future,
there should be no expectation that the field will exist.
2018-12-06 15:09:26 -08:00
Michael Schurter
383c85ae6f consul: add ScriptExecutor context wrapper
Since d335a82859 ScriptExecutors now take
a timeout duration instead of a context. This broke the script check
removal code which used context cancelation propagation to remove
script checks while they were executing.

This commit adds a wrapper around ScriptExecutors that obeys context
cancelation again. The only downside is that it leaks a goroutine until
the underlying Exec call completes or times out.

Since check removal is relatively rare, check timeouts usually low, and
scripts usually fast, the risk of leaking a goroutine seems very small.
2018-12-03 20:26:31 -08:00
Michael Schurter
104bbf78d9 consul: fix script checks exiting after 1 run
Fixes a regression caused in d335a82859

The removal of the inner context made the remaining cancels cancel the
outer context and cause script checks to exit prematurely.
2018-12-03 18:50:02 -08:00
Nick Ethier
bff6484df3 Merge pull request #4906 from hashicorp/f-metric-prefix-master
Port metric prefix filtering to master
2018-11-29 22:27:47 -05:00
Nick Ethier
69e6b0ea21 nomad: fix hclog usage 2018-11-29 22:27:39 -05:00
Alex Dadgar
429c5bb885 Device hook and devices affect computed node class
This PR introduces a device hook that retrieves the device mount
information for an allocation. It also updates the computed node class
computation to take into account devices.

TODO Fix the task runner unit test. The environment variable is being
lost even though it is being properly set in the prestart hook.
2018-11-27 17:25:33 -08:00
Nick Ethier
19c260a4a5 command/agent: additional tests for telemetry config parsing 2018-11-19 23:22:33 -05:00
Nick Ethier
af3f535f0a agent: support filter_default telemetry option 2018-11-19 23:21:48 -05:00
Nick Ethier
4182e3e141 nomad: add flag to disable publishing of job_summary metrics for dispatched jobs 2018-11-19 23:21:19 -05:00
Preetha Appan
3cf22d2903 Pass service metadata "external-source" for consul UI integration 2018-11-16 11:28:56 -06:00
Mahmood Ali
f9295631c4 Set clean config for mock driver
The default job here contains some exec task config (for setting
command and args) that isn't used by the mock driver.  Now, the alloc
runner seems stricter about validating fields and errors on unexpected
fields.

Updating configs in tests so we can have an explicit task config
whenever driver is set explicitly.
2018-11-13 10:21:40 -05:00
Mahmood Ali
2357e886ce mark and skip failing consul tests 2018-11-13 10:21:40 -05:00
Preetha Appan
3eeb229116 change path to v1/scheduler/configuration 2018-11-12 15:57:45 -06:00
Preetha Appan
2ec4c235be Fix failing test 2018-11-10 19:53:47 -06:00
Preetha Appan
fe41b5addc Smaller methods, and added tests for RPC layer 2018-11-10 17:37:33 -06:00
Preetha Appan
1fe9203aa6 Use response object/querymeta/writemeta in scheduler config API 2018-11-10 10:31:10 -06:00
Alex Dadgar
08b75d4120 Merge pull request #4842 from hashicorp/b-deployment-progress-deadline
Fix multiple bugs with progress deadline handling
2018-11-08 13:31:54 -08:00
Alex Dadgar
57f40c7e3e Device manager
Introduce a device manager that manages the lifecycle of device plugins
on the client. It fingerprints, collects stats, and forwards Reserve
requests to the correct plugin. The manager also handles device plugins
failing and validates their output.
2018-11-07 10:43:15 -08:00
Michael Schurter
8122c76cd6 Merge pull request #4828 from hashicorp/b-restore
Implement client agent restarting
2018-11-05 18:50:15 -06:00
Alex Dadgar
8615b1d558 Fix multiple tgs with progress deadline handling
Fix an issue in which the deployment watcher would fail the deployment
based on the earliest progress deadline of the deployment regardless of
whether the task group has finished.

Further fix an issue where the blocked eval optimization would make it
so no evals were created to progress the deployment. To reproduce this
issue, prior to this commit, you can create a job with two task groups.
The first group has count 1 and resources such that it can not be
placed. The second group has count 3, max_parallel=1, and can be placed.
Run this first and then update the second group to do a deployment. It
will place the first of three, but never progress since there exists a
blocked eval. However, that doesn't capture the fact that there are two
groups being deployed.
2018-11-05 16:06:17 -08:00
Michael Schurter
d2e48e35c0 tests: get consul integration tests building 2018-11-05 12:32:05 -08:00
Preetha Appan
f2b027797b Fix return type in tests after refactor 2018-10-30 11:10:46 -05:00
Preetha Appan
88005852e3 Introduce a response object for scheduler configuration 2018-10-30 11:06:32 -05:00
Preetha Appan
6966e3c3e8 Make preemption config a struct to allow for enabling based on scheduler type 2018-10-30 11:06:32 -05:00
Preetha Appan
784b96c104 Support for new scheduler config API, first use case is to disable preemption 2018-10-30 11:06:32 -05:00