20494 Commits

Author SHA1 Message Date
Kris Hicks
2cd7136bc7 Fix some errcheck errors (#9811)
* Throw away result of multierror.Append

When given a *multierror.Error, Append mutates it, so the return
value is not needed.

* Simplify MergeMultierrorWarnings, use StringBuilder

* Hash.Write() never returns an error

* Remove error that was always nil

* Remove error from Resources.Add signature

When this was originally written it could return an error, but that was
refactored away, and callers of it as of today never handle the error.

* Throw away results of io.Copy during Bridge

* Handle errors when computing node class in test
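
As an illustration of the `Hash.Write()` item above: Go's `hash.Hash` interface documents that `Write` never returns an error, so discarding its results is safe. The sketch below is a minimal example under that assumption; `digest` is a hypothetical helper and is not taken from the Nomad code base.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// digest is a hypothetical helper used only for illustration.
// It hashes the concatenation of parts and returns the hex digest.
func digest(parts ...string) string {
	h := sha256.New()
	for _, p := range parts {
		// hash.Hash documents that Write never returns an error,
		// so both results are discarded explicitly to satisfy errcheck.
		_, _ = h.Write([]byte(p))
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	fmt.Println(digest("node", "class"))
}
```

Assigning to `_, _` keeps the intent visible to readers and to linters, rather than silently dropping the return values.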
2021-01-14 12:46:35 -08:00
Kris Hicks
35f831c07d csi: Return error when deleting node (#9803)
In this change we'll properly return the error in the
CSIPluginTypeMonolith case (which is the type given in DeleteNode()),
and also return the error when the given ID is not found.

This was found via errcheck.
2021-01-14 12:44:50 -08:00
Kris Hicks
c52e0bbf41 gatedwriter: Fix race condition (#9791)
If one thread calls `Flush()` on a gatedwriter while another thread attempts to
`Write()` new data to it, strange things will happen.

The test I wrote shows that at the very least you can write _while_ flushing,
and the call to `Write()` will happen during the internal writes of the
buffered data, which is maybe not what is expected. (i.e. the `Write()`'d data
will be inserted somewhere in the middle of the data being `Flush()`'d)

It's also the case that, because `Write()` only has a read lock, if you had
multiple threads trying to write ("read") at the same time you might have data
loss because the `w.buf` that was read would not necessarily be up-to-date by
the time `p2` was appended to it and it was re-assigned to `w.buf`. You can see
this if you run the new gatedwriter tests with `-race` against the old implementation:

```
WARNING: DATA RACE
Read at 0x00c0000c0420 by goroutine 11:
  runtime.growslice()
      /usr/lib/go/src/runtime/slice.go:125 +0x0
  github.com/hashicorp/nomad/helper/gated-writer.(*Writer).Write()
      /home/hicks/workspace/nomad/helper/gated-writer/writer.go:41 +0x2b6
  github.com/hashicorp/nomad/helper/gated-writer.TestWriter_WithMultipleWriters.func1()
      /home/hicks/workspace/nomad/helper/gated-writer/writer_test.go:90 +0xea
```

This race condition is fixed in this change.
2021-01-14 12:43:14 -08:00
Kris Hicks
054e53afbf Refactor Job.Scale() (#9771) 2021-01-14 12:40:42 -08:00
Kris Hicks
360a6500e6 Add missing sink.Cancel() in fsm (#9818) 2021-01-14 12:39:20 -08:00
Drew Bailey
893777a8fd bump website version (#9820) 2021-01-14 15:12:39 -05:00
Drew Bailey
9198ee3456 Release 1.0.2 (#9819)
* changelog for release 1.0.2

* Generate files for 1.0.2 release

* Release v1.0.2

* rm generated files, update changelog for next release

* checkout bindata_assetfs

* bump version

Co-authored-by: Nomad Release bot <nomad@hashicorp.com>
2021-01-14 15:08:28 -05:00
Brandon Romano
42159636c3 Merge pull request #9805 from hashicorp/br.stack-menu
Website StackMenu updates for 1/14
2021-01-14 09:31:54 -08:00
Mahmood Ali
77c4cff622 ci: only read/modify GO_TAGS field (#9815)
Only look up the GO_TAGS variable, and avoid false positives where GO_TAGS is the suffix of another variable name.
2021-01-14 08:16:58 -05:00
Drew Bailey
69b19679e8 changelogfmt (#9807) 2021-01-13 15:21:17 -05:00
Seth Hoenig
f8bbb679eb Merge pull request #9809 from hashicorp/f-use-jobspec2-in-e2eutil
e2e: use jobspec2 Parse for parsing jobfile in e2e utils
2021-01-13 14:14:34 -06:00
Seth Hoenig
74c1828431 e2e: use jobspec2 Parse for parsing jobfile in e2e utils
We parse job files directly in e2eutil, but currently use the jobspec
package. Instead, use the Parse method from the jobspec2 package so
we can parse job files with new features.
2021-01-13 14:00:40 -06:00
Brandon Romano
6342d6b68d Website StackMenu updates for 1/14 2021-01-13 10:21:55 -08:00
Tim Gross
adf3b410b7 lifecycle: successful prestart tasks should not fail deployments
In 492d62d we prevented poststop tasks from contributing to allocation health
status, which fixed a bug where poststop tasks would prevent a deployment from
ever being marked successful. The patch introduced a regression where prestart
tasks that complete are causing the allocation to be marked unhealthy. This
changeset restores the previous behavior for prestart tasks.
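
The rule described above can be modeled roughly as follows. This is a simplified sketch and not the allocation health tracker itself; the `Task` type, its fields, and `allocHealthy` are assumptions made for illustration.

```go
package main

import "fmt"

// Task is a hypothetical, pared-down view of a task's lifecycle and state.
type Task struct {
	Name    string
	Hook    string // "", "prestart", or "poststop"
	Sidecar bool
	State   string // "pending", "running", or "dead"
	Failed  bool
}

// allocHealthy reports whether every task that counts toward health is running.
func allocHealthy(tasks []Task) bool {
	for _, t := range tasks {
		switch {
		case t.Hook == "poststop":
			// Poststop tasks have not run yet during a deployment; ignore them.
			continue
		case t.Hook == "prestart" && !t.Sidecar && t.State == "dead" && !t.Failed:
			// A prestart task that completed successfully is expected to be dead.
			continue
		case t.State != "running":
			return false
		}
	}
	return true
}

func main() {
	tasks := []Task{
		{Name: "init", Hook: "prestart", State: "dead"},
		{Name: "web", State: "running"},
		{Name: "cleanup", Hook: "poststop", State: "pending"},
	}
	fmt.Println(allocHealthy(tasks))
}
```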
2021-01-13 11:40:21 -05:00
Luiz Aoqui
9cee828673 Merge pull request #9801 from hashicorp/docs-fix-broken-link-in-hcl2
docs: fix broken link
2021-01-13 11:32:18 -05:00
Luiz Aoqui
39ed3593bf docs: fix broken link 2021-01-13 11:25:48 -05:00
Luiz Aoqui
8b039fd06f Merge pull request #9799 from hashicorp/docs-fix-hcl2-codeblock
docs: fix HCL2 doc page code block
2021-01-13 11:16:38 -05:00
Luiz Aoqui
5eacec0265 docs: fix HCL2 doc page code block 2021-01-13 11:10:45 -05:00
Mahmood Ali
852fc10e8e build binaries with UI enabled (#9796)
Have build-binary bundle the UI by default. This makes it easier to get "alpha pre-releases" out for support without compiling locally, and improves the engineering experience with e2e test clusters.
2021-01-13 10:56:25 -05:00
Dave May
332a195ed4 nomad agent-info: Add json/gotemplate formatting (#9788)
* nomad agent-info: Add json/gotemplate formatting
* Add CHANGELOG entry
* update docs
2021-01-13 09:42:46 -05:00
Tim Gross
26bf0257de docs: podman FSIsolation is image
As of podman 0.2.0, podman correctly advertises its filesystem isolation as
`FSIsolationImage`.
2021-01-13 09:05:19 -05:00
Tim Gross
d8f1e811f7 docs: remove remaining references to network_speed config 2021-01-13 08:52:25 -05:00
Drew Bailey
1d2ac5bbb9 tmp remove darwin arm build (#9786) 2021-01-12 15:52:30 -05:00
Jasmine Dahilig
75d5de98d0 changelog for #9361 (#9783) 2021-01-12 15:12:49 -05:00
Kris Hicks
f4dd0e4aa2 makefile: Set CC explicitly in go build (#9784)
This is required because Go does not pull CC from the make variable. This uses
Go's default CC unless CC is overridden, as it is for the ARM targets.

This also makes it easier to build Nomad on a native ARM device, via:

```
make CC= pkg/linux_arm/nomad
```
2021-01-12 12:09:40 -08:00
Michael Lange
cb277dc715 Merge pull request #9780 from hashicorp/d/changelog-9733
Changelog entry for 9733
2021-01-12 10:34:52 -08:00
Seth Hoenig
7b1a2119d0 Merge pull request #9770 from hashicorp/docs-update-cl
docs: update cl with graviton fix
2021-01-12 12:30:01 -06:00
Seth Hoenig
70b5302f58 Merge pull request #9779 from apollo13/fix_9776
Properly detect unloaded dynamic modules on RHEL derivatives. Fixes #9776
2021-01-12 12:25:30 -06:00
Drew Bailey
492d62d3df ignore poststop task in alloc health tracker (#9548), fixes #9361
* investigating where to ignore poststop task in alloc health tracker

* ignore poststop when setting latest start time for allocation

* clean up logic

* lifecycle: isolate mocks for poststop deployment test

* lifecycle: update comments in tracker

Co-authored-by: Jasmine Dahilig <jasmine@dahilig.com>
2021-01-12 10:03:48 -08:00
Michael Lange
34ebaacf03 Changelog entry for 9733 2021-01-12 09:56:02 -08:00
Florian Apolloner
deb835792c Properly detect unloaded dynamic modules on RHEL derivatives. Fixes #9776
The modules.dep file on RHEL includes .xz for compressed kernel modules.
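
A minimal sketch of the kind of matching this implies, assuming `modules.dep` entries have the form `path/to/name.ko[.xz]: deps...`. The helper `hasModule` is hypothetical and is not the function changed in this commit.

```go
package main

import (
	"bufio"
	"fmt"
	"path"
	"strings"
)

// hasModule reports whether the given modules.dep content lists a module
// with the given name, accepting both "name.ko" and "name.ko.xz" entries.
// Hypothetical helper for illustration only.
func hasModule(modulesDep, name string) bool {
	sc := bufio.NewScanner(strings.NewReader(modulesDep))
	for sc.Scan() {
		// Each entry is "<module path>: <dependency paths>".
		fields := strings.SplitN(sc.Text(), ":", 2)
		if len(fields) < 2 {
			continue
		}
		base := path.Base(strings.TrimSpace(fields[0]))
		base = strings.TrimSuffix(base, ".xz") // compressed modules on RHEL
		base = strings.TrimSuffix(base, ".ko")
		if base == name {
			return true
		}
	}
	return false
}

func main() {
	dep := "kernel/net/bridge/bridge.ko.xz: kernel/net/802/stp.ko.xz kernel/net/llc/llc.ko.xz\n" +
		"kernel/net/llc/llc.ko:\n"
	fmt.Println(hasModule(dep, "bridge"), hasModule(dep, "llc"), hasModule(dep, "nf_nat"))
}
```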
2021-01-12 18:28:00 +01:00
Seth Hoenig
efe573bec1 docs: update cl with graviton fix 2021-01-11 12:07:05 -06:00
James Rasell
0514e8d3af Merge pull request #9767 from hashicorp/f-e2e-job-scaling-suite
e2e: add job scaling test suite.
2021-01-11 18:35:07 +01:00
Tim Gross
d650863dda safely handle existing net namespace in default network manager
When a client restarts, the network_hook's prerun will call
`CreateNetwork`. Drivers that don't implement their own network manager will
fall back to the default network manager, which doesn't safely handle the case
where the network namespace is being recreated. This results in an error and
the task being restarted for `exec` tasks with `network` blocks (this also
impacts the community `containerd` and probably other community task drivers).

If we get an error when attempting to create the namespace and that error is
because the file already exists and is locked by its process, then we'll
return a `nil` error with the `created` flag set to false, just as we do with
the `docker` driver.
2021-01-11 11:31:03 -05:00
Seth Hoenig
556f9c54dd Merge pull request #9765 from hashicorp/f-bump-connect-examples
command: bump connect examples to v3
2021-01-11 10:22:58 -06:00
Seth Hoenig
b552f08dd8 Merge pull request #9766 from hashicorp/f-bump-cni-plugins-version
cni: bump CNI plugins version to v0.9.0
2021-01-11 09:59:43 -06:00
Tim Gross
f52cab8be6 e2e: remove deprecated terraform syntax
Also bumps patch versions of some TF modules
2021-01-11 08:25:22 -05:00
James Rasell
20ea8c64a6 e2e: add job scaling test suite. 2021-01-11 11:34:19 +01:00
Seth Hoenig
143af9b67f cni: bump CNI version to v0.9.0
https://github.com/containernetworking/plugins/releases/tag/v0.9.0

Also make the copy-paste install instructions work with arm64 for
a better OOTB experience (AWS Graviton, Pi 4's).
2021-01-10 18:03:27 -06:00
Seth Hoenig
d1a8468b5a docs: update countdash examples to v3 2021-01-10 17:19:39 -06:00
Seth Hoenig
a92cc30e47 command: generate bindata assetfs 2021-01-10 17:09:08 -06:00
Seth Hoenig
4cf3ad161f command: bump connect examples to v3
Nomad v1.0+ combined with Consul 1.9+ supports launching Envoy v1.16+,
which is the first version of Envoy to support arm64 platforms out
of the box.

By rebuilding our example docker containers for connect to be multiplatform
between amd64 and arm64, Nomad can provide a nicer user experience for
those trying out Connect on arm64 machines (e.g. AWS Graviton instances
or Raspberry Pi 4's).

This has been done for the countdash examples at v3.

https://hub.docker.com/layers/hashicorpnomad/counter-dashboard/v3/images/sha256-94e323587bc372ba1b6ca5c58dc23e291e9d26787b50e71025f1c8967dfbcd07?context=repo
https://hub.docker.com/layers/hashicorpnomad/counter-api/v3/images/sha256-16a9e9e08082985a635c9edd0f258b084153c6c7831a9b41d34bde78c308b65c?context=repo

The connect-native examples are now also multiplatform at v5, but we
don't have them built into `job init`.
2021-01-10 16:54:31 -06:00
Chris Baker
d234000e87 Merge pull request #9761 from hashicorp/b-9758-enforce-policy-on-scale
in Job.Scale, ensure that new count is within [min,max] configured in scaling policy
2021-01-08 15:49:38 -06:00
Chris Baker
cb1c0181be nicer error message 2021-01-08 21:13:29 +00:00
Jeff Escalante
cf4534c2a2 update dependencies (#9760) 2021-01-08 15:46:31 -05:00
Buck Doyle
017b47dfb4 Add documentation for exec websocket (#9679) 2021-01-08 14:01:06 -06:00
Chris Baker
59271c668e appease the linter and fix an incorrect test 2021-01-08 19:38:25 +00:00
Chris Baker
9087e0be99 changelog for 9761 2021-01-08 19:26:42 +00:00
Chris Baker
ebd28c527c in Job.Scale, ensure that new count is within [min,max] configured in scaling policy
resolves #9758
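
A minimal sketch of such a bounds check, assuming an inclusive `[min,max]` range. The `ScalingPolicy` type and `validateCount` function here are hypothetical and pared down for illustration; they are not the types used by `Job.Scale`.

```go
package main

import "fmt"

// ScalingPolicy is a hypothetical, pared-down policy holding only the bounds.
type ScalingPolicy struct {
	Min int64
	Max int64
}

// validateCount returns an error when count falls outside the policy's
// inclusive [Min, Max] range. A nil policy imposes no bounds.
func validateCount(count int64, p *ScalingPolicy) error {
	if p == nil {
		return nil
	}
	if count < p.Min {
		return fmt.Errorf("new count %d is less than the scaling policy minimum %d", count, p.Min)
	}
	if count > p.Max {
		return fmt.Errorf("new count %d is greater than the scaling policy maximum %d", count, p.Max)
	}
	return nil
}

func main() {
	policy := &ScalingPolicy{Min: 1, Max: 3}
	for _, c := range []int64{0, 2, 5} {
		fmt.Println(c, validateCount(c, policy))
	}
}
```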
2021-01-08 19:24:36 +00:00
Drew Bailey
9c3ce6b6dc persist shared ports during inplace updates (#9736)
AllocatedSharedResources were not being copied over to the new
allocation struct the scheduler makes during inplace updates. This
caused downstream issues after the plan was applied, namely the shared
ports were dropped causing issues with service
registration/deregistration.

test that shared ports are preserved

change log, also carry over shared network

copy networks
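
To make the failure mode concrete, here is a simplified sketch of carrying shared resources over when building the updated allocation. All types and the `inplaceUpdate` function are hypothetical stand-ins, not the scheduler's actual structs.

```go
package main

import "fmt"

// Port, SharedResources, and Allocation are hypothetical, pared-down types.
type Port struct {
	Label string
	Value int
}

type SharedResources struct {
	Ports    []Port
	Networks []string
}

type Allocation struct {
	ID     string
	Shared SharedResources
}

// inplaceUpdate builds the new allocation struct for an in-place update and
// copies the shared ports and networks from the existing allocation. Omitting
// these copies is the kind of bug described above: the ports are silently dropped.
func inplaceUpdate(existing *Allocation) *Allocation {
	updated := &Allocation{ID: existing.ID}
	// Copy the slices so the new allocation does not alias the old one.
	updated.Shared.Ports = append([]Port(nil), existing.Shared.Ports...)
	updated.Shared.Networks = append([]string(nil), existing.Shared.Networks...)
	return updated
}

func main() {
	old := &Allocation{
		ID: "a1",
		Shared: SharedResources{
			Ports:    []Port{{Label: "http", Value: 8080}},
			Networks: []string{"bridge"},
		},
	}
	fmt.Printf("%+v\n", inplaceUpdate(old).Shared)
}
```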
2021-01-08 09:00:41 -05:00