Commit Graph

17929 Commits

Author SHA1 Message Date
Michael Lange
8808b2b475 Merge pull request #7685 from hashicorp/ui/upgrade-lint-staged
UI: Upgrade lint-staged and husky
2020-04-21 17:42:12 -07:00
Tim Gross
25944297bc changelog entries for 0.11.1 bugfixes (#7763) 2020-04-21 10:04:13 -04:00
Mahmood Ali
15b7474850 Merge pull request #7762 from hashicorp/b-in-place-update-deviceids
Preserve device ids in in-place alloc updates
2020-04-21 09:31:10 -04:00
Mahmood Ali
dacb634489 add changelog
[ci skip]
2020-04-21 09:27:40 -04:00
Mahmood Ali
04000c9ba9 Ensure that alloc updates preserve device offers
When an alloc is updated in-place, ensure that the allocated devices are
preserved and carried over to the new alloc.
2020-04-21 08:57:15 -04:00
Mahmood Ali
53dac68400 test for allocated devices on job in-place update
When an alloc is updated in-place, test that the allocated devices are
preserved in new alloc struct.
2020-04-21 08:56:05 -04:00
Buck Doyle
dd2e387074 Docs: correct search API (#7756)
This closes #7718. It corrects some inaccuracies and adds
an explanation of the truncations block.
2020-04-21 07:33:24 -05:00
Tim Gross
9f5156a81b csi: nil-check allocs for VolumeDenormalize and claim methods (#7760) 2020-04-21 08:32:24 -04:00
Charlie Voiselle
2c1dcc8cd2 Use ExternalID in NodeStageVolume RPC (#7754) 2020-04-20 17:13:46 -04:00
Michael Dwan
c364732c0f fix panic while deleting CSI plugins for missing job (#7758) 2020-04-20 17:13:33 -04:00
Seth Hoenig
0017601bec Merge pull request #7691 from hashicorp/docs-some-connect-bugs
docs: add bugfix notes for #7690 #7397 #7684 #7683 to changelog
2020-04-20 10:27:18 -06:00
Seth Hoenig
13b4fa9ef1 docs: add bugfix notes for #7690 #7397 #7684 #7683 to changelog 2020-04-20 10:25:57 -06:00
Seth Hoenig
a6cb4a04e5 Merge pull request #7690 from hashicorp/b-inspect-proxy-output
two fixes for inspect on connect proxy
2020-04-20 10:17:54 -06:00
Seth Hoenig
d4ebc73de9 Merge pull request #7705 from hashicorp/docs-remove-connect-limitation
fixup references in connect docs
2020-04-20 10:15:50 -06:00
Mahmood Ali
5abc59284f Merge pull request #7704 from hashicorp/b-agent-shutdown-order
agent: shutdown agent http server last
2020-04-20 10:37:26 -04:00
Mahmood Ali
c4c6b2758a Merge pull request #7748 from hashicorp/b-noisy-http-logs
agent: route http logs through hclog
2020-04-20 10:37:15 -04:00
Mahmood Ali
5a5354a86f update changelog
[ci skip]
2020-04-20 10:36:39 -04:00
Mahmood Ali
360e0a1669 agent: route http logs through hclog
Pipe the http server log to hclog, so that it uses the same logging format
as the rest of the nomad logs.  This also supports emitting them as json logs
when json formatting is set.

The http server logs are emitted at Trace level, as they typically
represent HTTP client errors (e.g. failed tls handshakes, invalid headers,
etc).

Though, Panic logs represent server errors and are relayed as Error
level.
2020-04-20 10:33:40 -04:00
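The routing described in the commit above can be sketched with the standard library alone. This is an illustration, not the actual Nomad change: `levelWriter`, `classify`, and `newHTTPErrorLog` are hypothetical names, the `emit` callback stands in for a leveled logger such as hclog, and the "contains panic" rule is an assumed way to tell server errors from client errors.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"strings"
)

// levelWriter is a hypothetical adapter: it receives each line written by the
// standard library logger and forwards it, with a level, to emit.
type levelWriter struct {
	emit func(level, msg string)
}

// Write implements io.Writer so levelWriter can back a *log.Logger.
func (w *levelWriter) Write(p []byte) (int, error) {
	msg := strings.TrimSpace(string(p))
	w.emit(classify(msg), msg)
	return len(p), nil
}

// classify is an assumed rule: panic lines are server errors, everything else
// (failed TLS handshakes, invalid headers, ...) is treated as client noise.
func classify(msg string) string {
	if strings.Contains(msg, "panic") {
		return "ERROR"
	}
	return "TRACE"
}

// newHTTPErrorLog builds a logger suitable for http.Server.ErrorLog.
func newHTTPErrorLog(emit func(level, msg string)) *log.Logger {
	return log.New(&levelWriter{emit: emit}, "", 0)
}

func main() {
	emit := func(level, msg string) { fmt.Printf("[%s] %s\n", level, msg) }
	srv := &http.Server{ErrorLog: newHTTPErrorLog(emit)}

	// Simulate two lines the http server might write to its error log.
	srv.ErrorLog.Print("http: TLS handshake error from 10.0.0.1:1234: EOF")
	srv.ErrorLog.Print("http: panic serving 10.0.0.1:1234: boom")
}
```

Because the adapter only needs an `io.Writer`, the same shape works with any leveled logger that can accept a message and a level.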
Mahmood Ali
68d0a9ef4c Merge pull request #7749 from hashicorp/b-docker-panic
driver/docker: protect against nil container
2020-04-20 10:31:46 -04:00
Mahmood Ali
7f29912a02 add changelog
[ci skip]
2020-04-20 10:31:09 -04:00
Jeffrey 'jf' Lim
482804dac5 demo/vagrant/Vagrantfile: Update Nomad version (0.11.0) (#7579) 2020-04-20 09:29:12 -04:00
Anthony Scalisi
e1287846ae fix spelling errors (#6985) 2020-04-20 09:28:19 -04:00
Charles Z
a4621a7c89 label csi as beta from 0.11 release notes (#7745) 2020-04-20 08:48:04 -04:00
Mahmood Ali
9db46fde84 driver/docker: protect against nil container
Protect against a panic when we attempt to start a container with a name
that conflicts with an existing one.  If the existing one is being
deleted while nomad first attempts to create the container, the
createContainer will fail with `container already exists`, but we get
nil container reference from the `containerByName` lookup, and cause a
crash.

I'm not certain how we get into this state, except for being very
unlucky.  I suspect that this case may be the result of a concurrent
restart or the docker engine API not being fully consistent (e.g. an
earlier call purged the container, but docker hasn't yet freed up the
resources needed to create a new container with the same name).

If that's the case, then re-attempting creation will hopefully succeed,
or we'd at least fail enough times for the alloc to be rescheduled to
another node.
2020-04-19 15:34:45 -04:00
Jeffrey 'jf' Lim
71744bcc2d Fix/improve "job plan" messaging (#7580) 2020-04-17 15:53:16 -04:00
Yishan Lin
095c2a9890 Merge pull request #7741 from hashicorp/yishan/docs-rebased-preemption-update
docs: update preemption page
2020-04-17 11:03:27 -07:00
Yishan Lin
ffddd697c4 docs: update preemption page
This page has not been updated (yet) to reflect support for all 3 job types (service, batch, system), which shipped in 0.9.2.

The current page implies that preemption is only available for system jobs.

This is early preparation for Nomad 0.12, where we plan to move Preemption from Enterprise feature suite to OSS for all.
2020-04-17 09:34:07 -07:00
Brandon Romano
417f50f925 Merge pull request #7717 from hashicorp/website-alert
website: Adjust the website alert to point to the blog post
2020-04-14 11:36:43 -07:00
Brandon Romano
41fe302923 Adjust the website alert to point to the blog post 2020-04-14 11:17:06 -07:00
Michael Schurter
81db9e9156 Merge pull request #7682 from hashicorp/b-comment-fix
core: fix comment on system stack
2020-04-13 15:13:23 -07:00
Seth Hoenig
eef81c3b4f structs: fix compatibility between api and nomad/structs proxy definitions
The field names within the structs representing the Connect proxy
definition were not the same (nomad/structs/ vs api/), causing the
values to be lost in translation for the 'nomad job inspect' command.

Since the field names already shipped in v0.11.0 we cannot simply
fix the names. Instead, use the json struct tag on the structs/ structs
to remap the name to match the publicly exposed api/ package on json
encoding.

This means existing jobs from v0.11.0 will continue to work, and
the JSON API for job submission will remain backwards compatible.
2020-04-13 15:59:45 -06:00
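The struct-tag remapping described in the commit above can be illustrated with `encoding/json`. The field names below are hypothetical and chosen only for the example; they are not the actual names in `nomad/structs` or `api`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// apiProxy mimics the public api/ shape (field names are illustrative).
type apiProxy struct {
	LocalServiceAddress string
	LocalServicePort    int
}

// internalProxy mimics an internal struct whose Go field names differ.
// The json tags remap them on encoding so the wire format matches apiProxy,
// without renaming the Go fields themselves.
type internalProxy struct {
	LocalAddr string `json:"LocalServiceAddress"`
	LocalPort int    `json:"LocalServicePort"`
}

// roundTrip encodes the internal struct and decodes it as the api struct,
// which is roughly what happens between server and CLI.
func roundTrip(in internalProxy) (apiProxy, error) {
	var out apiProxy
	b, err := json.Marshal(in)
	if err != nil {
		return out, err
	}
	err = json.Unmarshal(b, &out)
	return out, err
}

func main() {
	out, err := roundTrip(internalProxy{LocalAddr: "127.0.0.1", LocalPort: 8080})
	fmt.Printf("%+v %v\n", out, err)
}
```

Without the tags, the encoded keys would be `LocalAddr` and `LocalPort`, and the decoder would silently leave the api fields at their zero values, which is the "lost in translation" symptom the commit describes.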
Seth Hoenig
dc11226763 jobspec: correctly parse proxy fields from jobspec
Before, the proxy stanza did not parse non-object fields
`local_service_port` and `local_service_address` from the
connect `proxy` stanza. This change fixes that.
2020-04-13 15:59:45 -06:00
Chris Baker
a8ed119496 documents the scaling block in the JSON Job docs (#7706)
* documents the scaling block in the JSON Job docs

resolves #7656

* add task-specific restart to JSON Job docs

companion to #7603

* [docs] improved and corrected scaling docs

* Update website/pages/api-docs/json-jobs.mdx

Co-Authored-By: Michael Schurter <mschurter@hashicorp.com>

Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
2020-04-13 16:33:49 -05:00
Chris Baker
bea905264b update restart documentation (#7603)
* update `restart` documentation

#7288 added support for task-specific `restart` policy. this PR updates the docs to reflect that.

* added an explicit example of task-specific restart policy

* Update website/pages/docs/job-specification/restart.mdx
2020-04-13 16:29:43 -05:00
Drew Bailey
aab5233116 Merge pull request #7663 from hashicorp/b-taskrunner-shutdown_delay
Run task shutdown_delay regardless of service registration
2020-04-13 13:27:24 -04:00
Drew Bailey
f207abdc63 Update CHANGELOG.md 2020-04-13 12:41:13 -04:00
Seth Hoenig
9332a87eb7 docs: add a link to the Connect w/ACLs guide
... from the docs/integration/consul-connect page.
2020-04-13 10:05:20 -06:00
Seth Hoenig
d8520cdf4f docs: update connect limitations (acls & checks now supported) 2020-04-13 09:51:17 -06:00
Mahmood Ali
d89687d014 agent: shutdown agent http server last
Shutdown http server last, after nomad client/server components
terminate.

Before this change, if the agent is taking an unexpectedly long time to
shutdown, the operator cannot query the http server directly: they
cannot access agent specific http endpoints and need to query another
agent about the troublesome agent.

Unexpectedly long shutdown can happen in normal cases, e.g. a client
might hang if one of the allocs it is running has a long
shutdown_delay.

Here, we switch to ensuring that the http server is shutdown last.

I believe this doesn't require extra care in agent shutting down logic
while operators may be able to submit write http requests.  We already
need to cope with operators submitting these http requests to another
agent or by servers updating the client allocations.
2020-04-13 10:50:07 -04:00
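The ordering described in the commit above, reduced to a minimal sketch. The `agent`, `component`, `recorder`, and `newAgent` names are hypothetical; the real agent has far more state, and this only shows the http server being stopped after the client/server components.

```go
package main

import "fmt"

// component is a hypothetical interface for anything the agent can stop.
type component interface {
	Shutdown()
}

// recorder appends its name to a shared log when shut down, so the order of
// shutdown calls can be observed.
type recorder struct {
	name string
	log  *[]string
}

func (r recorder) Shutdown() { *r.log = append(*r.log, r.name) }

// agent holds the core components and the http server separately so that the
// http server can be stopped last.
type agent struct {
	core []component // client and/or server
	http component
}

// Shutdown stops the core components first. The http server stays up during
// that (possibly slow) phase, so operators can still query this agent directly.
func (a *agent) Shutdown() {
	for _, c := range a.core {
		c.Shutdown()
	}
	a.http.Shutdown()
}

func newAgent(log *[]string) *agent {
	return &agent{
		core: []component{recorder{"client", log}, recorder{"server", log}},
		http: recorder{"http", log},
	}
}

func main() {
	var order []string
	newAgent(&order).Shutdown()
	fmt.Println(order)
}
```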
Tim Gross
092456b93f refactor: consolidate private methods for CSI RPC (#7702)
Follow-up for a method missed in the refactor for #7688. The
`volAndPluginLookup` method is only ever called from the server's `CSI`
RPC and never the `ClientCSI` RPC, so move it into that scope.
2020-04-13 10:46:43 -04:00
Tim Gross
cafdcc9216 e2e: testing reliability (#7701)
* pin CSI plugin versions
* ensure failing CSI tests clean up
* allow NOMAD_SHA env var to override makefile
2020-04-13 10:25:24 -04:00
Mahmood Ali
5fbb837556 Merge pull request #7693 from greut/bump-testify
api: testify v1.5.1
2020-04-11 09:09:44 -04:00
Yoan Blanc
def809b807 api: testify v1.5.1
Signed-off-by: Yoan Blanc <yoan@dosimple.ch>
2020-04-11 13:55:10 +02:00
Tim Gross
09abe0c702 refactor: make nodeForControllerPlugin private to ClientCSI (#7688)
The current design of `ClientCSI` RPC requires that callers in the
server know about the free-standing `nodeForControllerPlugin`
function. This makes it difficult to send `ClientCSI` RPC messages
from subpackages of `nomad` and adds a bunch of boilerplate to every
server-side caller of a controller RPC.

This changeset makes it so that the `ClientCSI` RPCs will populate and
validate the controller's client node ID if it hasn't been passed by
the caller, centralizing the logic of picking and validating
controller targets into the `nomad.ClientCSI` struct.
2020-04-10 16:47:21 -04:00
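The centralization described in the commit above can be sketched as a single method that fills in or validates the controller node. Everything here is an assumption for illustration: `controllerRequest`, `clientCSI`, `resolveNode`, and the "first healthy node" selection rule are hypothetical and do not reflect the real `nomad.ClientCSI` internals.

```go
package main

import (
	"errors"
	"fmt"
)

// controllerRequest is a hypothetical controller RPC request.
type controllerRequest struct {
	PluginID         string
	ControllerNodeID string // may be left empty by the caller
}

// clientCSI is a hypothetical endpoint that knows which nodes run a healthy
// controller for each plugin.
type clientCSI struct {
	healthy map[string][]string // plugin ID -> node IDs
}

// resolveNode populates the node ID when the caller did not pass one, and
// validates it when the caller did, so callers need no selection logic.
func (c *clientCSI) resolveNode(req *controllerRequest) error {
	nodes := c.healthy[req.PluginID]
	if len(nodes) == 0 {
		return errors.New("no healthy controller for plugin " + req.PluginID)
	}
	if req.ControllerNodeID == "" {
		req.ControllerNodeID = nodes[0] // assumed rule: take the first one
		return nil
	}
	for _, id := range nodes {
		if id == req.ControllerNodeID {
			return nil
		}
	}
	return fmt.Errorf("node %q does not run a controller for plugin %q",
		req.ControllerNodeID, req.PluginID)
}

func main() {
	c := &clientCSI{healthy: map[string][]string{"ebs": {"node-a", "node-b"}}}
	req := &controllerRequest{PluginID: "ebs"}
	fmt.Println(c.resolveNode(req), req.ControllerNodeID)
}
```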
Seth Hoenig
47dfa762b3 Merge pull request #7684 from hashicorp/b-connect-sidecar-name
connect: enable configuring sidecar_task.name
2020-04-10 10:04:25 -06:00
Seth Hoenig
ce3b57e100 Merge pull request #7683 from hashicorp/b-no-sidecar-panic
connect: correctly handle missing sidecar_service task stanza
2020-04-10 09:49:59 -06:00
Seth Hoenig
0eb2844900 connect: extract common task keys 2020-04-10 09:49:19 -06:00
Drew Bailey
e2886582fb changelog 2020-04-10 11:14:39 -04:00
Drew Bailey
3af2d05f6b Run task shutdown_delay regardless of service registration
task shutdown_delay will currently only run if there are registered
services for the task. This implementation detail isn't explicitly stated
anywhere and is defined outside of the service stanza.

This change moves shutdown_delay to be evaluated after prekill hooks are
run, outside of any task runner hooks.

just use time.sleep
2020-04-10 11:06:26 -04:00
Michael Lange
11cbd282e0 Remove now superfluous lint-staged arguments 2020-04-09 20:46:32 -07:00