22925 Commits

Author SHA1 Message Date
James Rasell
61ec5f0456 autopilot: correctly return errors within state functions. (#12714) 2022-04-21 08:54:50 +02:00
Luiz Aoqui
7f1b838abb ui: fix bug that prevented files streaming (#12719)
During the Ember dependency upgrade work,
https://github.com/hashicorp/nomad/commit/ce8c039f4ce7359d60ede5dee36b9cef82
moved the `isSupported` method from using Ember's `reopenClass` to a
getter, but `reopenClass` creates a static method, so the getter must be
static as well.
2022-04-20 14:39:18 -04:00
Gowtham
f601cc39b1 Add Concurrent Download Support for artifacts (#11531)
* add concurrent download support - resolves #11244

* format imports

* mark `wg.Done()` via `defer`

* added tests for successful and failure cases and resolved some goleak

* docs: add changelog for #11531

* test typo fixes and improvements

Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
2022-04-20 10:15:56 -07:00
James Rasell
8eb569faf4 job_hooks: add implicit constraint when using Consul for services. (#12602) 2022-04-20 14:09:13 +02:00
James Rasell
4c55339cc6 client: add NOMAD_SHORT_ALLOC_ID allocation env var. (#12603) 2022-04-20 10:30:48 +02:00
Tim Gross
aafcf97984 E2E: provide options for reverse proxy for web UI (#12671)
Our E2E test environment is deployed with mTLS, but it's impractical
for us to use mTLS in headless browsers for automated testing (or even
in manual testing). Provide certificates for proxying the web UI via
Nginx. This proxy uses client certs for proxying to the HTTP endpoint
and a self-signed cert for the browser-facing endpoint. We can accept
certificate errors in the automated tests we'll be adding in the next
step of this work.
2022-04-19 16:55:05 -04:00
Tim Gross
e2a8d45f2d E2E: terraform provisioner upgrades (#12652)
While working on infrastructure for testing the UI in E2E, we needed
to upgrade the certificate provider. Performing a provider upgrade via
the TF `init -upgrade` brought in updates for the file and AWS
providers as well. These updates include deprecating the use of
`sensitive_content` fields, removing CA algorithm parameters that can
be inferred from keys, and removing the requirement to manually
specify AWS assume role parameters in the provider config if they're
available in the calling environment's AWS config file (as they are
via doormat or our E2E environment).
2022-04-19 14:27:14 -04:00
Seth Hoenig
19c9779d57 Merge pull request #12604 from hashicorp/b-fixup-chroot-test
ci: fixup task runner chroot test
2022-04-19 12:58:03 -05:00
Seth Hoenig
121d959745 Merge pull request #12622 from hashicorp/b-fix-docker-logger-test
ci: fix docker logger not supported test
2022-04-19 12:57:47 -05:00
Seth Hoenig
a6f345c8f5 ci: fixup task runner chroot test
This PR is 2 fixes for the flaky TestTaskRunner_TaskEnv_Chroot test.

And also the TestTaskRunner_Download_ChrootExec test.

- Use TinyChroot to stop copying gigabytes of junk, which causes GHA
to fail to create the environment in time.

- Pre-create cgroups on V2 systems. Normally the cgroup directory is
managed by the cpuset manager, but that is not active in taskrunner tests,
so create it by hand in the test framework.
2022-04-19 10:37:46 -05:00
Seth Hoenig
cbb09f31a5 ci: fix docker logger not supported test
This test checks for behavior when asking for logs of a docker task
configured with a log driver that does not support streaming logs.

Previously this was using the 'gelf' log driver, but it seems that no
longer returns an error as expected. Instead we can just use the 'none'
log driver, which has the desired effect.

2022-04-19T10:23:19.129-0500 [ERROR] docklog/docker_logger.go:133: log streaming ended with terminal error: error="API error (501): configured logging driver does not support reading"
2022-04-19 10:27:01 -05:00
Luiz Aoqui
2319a081b9 changelog: fix entry for #11927 (#12577) 2022-04-19 10:46:25 -04:00
Luiz Aoqui
997252622d changelog: add entry for #11944 (#12578) 2022-04-19 10:46:11 -04:00
Seth Hoenig
3e394fce69 Merge pull request #12586 from hashicorp/f-local-si-token
connect: create SI tokens in local scope
2022-04-19 07:53:01 -05:00
Seth Hoenig
dd2724a8ab cl: add missing prefix 2022-04-19 07:48:56 -05:00
Derek Strickland
9d7ea218bb consul-template: revert function_denylist logic (#12071)
* consul-template: replace config rather than append
Co-authored-by: Seth Hoenig <seth.a.hoenig@gmail.com>
2022-04-18 13:57:56 -04:00
chavacava
334c25834a QueryOptions.SetTimeToBlock should take pointer receiver
Fixes a bug where blocking queries that are retried don't have their blocking 
timeout reset, resulting in them running longer than expected.
2022-04-18 10:41:27 -04:00
Tim Gross
5628caee53 CI: build binaries for UI branches (#12594)
Build binaries for every code change, not just backend code
changes. This means that we'll have up-to-date compiled assets for
every commit available in CircleCI artifacts.
2022-04-18 10:29:20 -04:00
Seth Hoenig
b2a2f77d40 docs: update documentation with connect acls changes
This PR updates the changelog, adds notes to the 1.3 upgrade guide, and
updates the connect integration docs with documentation about the new
requirement on Consul ACL policies of Consul agent default anonymous ACL
tokens.
2022-04-18 08:22:33 -05:00
Jorge Marey
7bfb482b1e Change consul SI tokens to be local 2022-04-18 08:22:33 -05:00
Shishir
c86642bae4 Add os to NodeListStub struct. (#12497)
* Add os to NodeListStub struct.

Signed-off-by: Shishir Mahajan <smahajan@roblox.com>

* Add os as a query param to /v1/nodes.

Signed-off-by: Shishir Mahajan <smahajan@roblox.com>

* Add test: os as a query param to /v1/nodes.

Signed-off-by: Shishir Mahajan <smahajan@roblox.com>
2022-04-15 17:22:45 -07:00
Tim Gross
0c2732ddce CSI: replace structs->api with serialization extension (#12583)
The CSI HTTP API has to transform the CSI volume to redact secrets,
remove the claims fields, and to consolidate the allocation stubs into
a single slice of alloc stubs. This was done manually in #8590 but
this is a large amount of code that has proven very bug prone
(see #8659, #8666, #8699, #8735, and #12150) and requires updating
lots of code every time we add a field to volumes or plugins.

In #10202 we introduced encoding improvements for the `Node` struct
that allow a more minimal transformation. Apply this same approach to
serializing `structs.CSIVolume` to API responses.

Also, the original reasoning behind #8590 for plugins no longer holds
because the counts are now denormalized within the state store, so we
can simply remove this transformation entirely.
2022-04-15 14:29:34 -04:00
Tim Gross
acc6baa5fd CSI: fix volume status prefix matching in CLI (#12584)
The API for `CSIVolume.List` sorts by created index and not by ID,
which breaks the logic for prefix matching in the `volume status`
output when the prefix is also an exact match. Ensure that we're
handling this case correctly.
2022-04-15 14:16:30 -04:00
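Illustration for acc6baa5fd above: when a list is not sorted by ID, an exact match for the prefix may appear anywhere in the results, so checking only the first entry is not enough. The sketch below shows one way to handle this, assuming a plain slice of IDs; `resolve` and the two error values are hypothetical names, and this is not the CLI's actual implementation.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Hypothetical sentinel errors for the example.
var (
	errNotFound  = errors.New("no ID matches prefix")
	errAmbiguous = errors.New("prefix matches multiple IDs")
)

// resolve returns the ID selected by prefix. An exact match wins no
// matter where it appears in ids; otherwise the prefix must match
// exactly one ID.
func resolve(prefix string, ids []string) (string, error) {
	var matches []string
	for _, id := range ids {
		if id == prefix {
			return id, nil // exact match takes priority
		}
		if strings.HasPrefix(id, prefix) {
			matches = append(matches, id)
		}
	}
	switch len(matches) {
	case 0:
		return "", errNotFound
	case 1:
		return matches[0], nil
	default:
		return "", errAmbiguous
	}
}

func main() {
	// Ordered by creation, not by ID: the exact match comes last.
	ids := []string{"vol-2", "vol-1", "vol"}
	id, err := resolve("vol", ids)
	fmt.Println(id, err) // prints: vol <nil>
}
```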
Kevin Wang
dbf269f6bd chore: redirects (#12560)
Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
2022-04-15 13:13:40 -04:00
Derek Strickland
b43de99ac1 heartbeat: Handle transitioning from disconnected to down (#12559) 2022-04-15 09:47:45 -04:00
Derek Strickland
f5de802993 system_scheduler: support disconnected clients (#12555)
* structs: Add helper method for checking if alloc is configured to disconnect
* system_scheduler: Add support for disconnected clients
2022-04-15 09:31:32 -04:00
Tim Gross
fd21cebec7 CSI: handle per-alloc volumes in alloc status -verbose CLI (#12573)
The Nomad client's `csi_hook` interpolates the alloc suffix with the
volume request's name for CSI volumes with `per_alloc = true`, turning
`example` into `example[1]`. We need to do this same behavior in the
`alloc status` output so that we show the correct volume.
2022-04-15 09:26:19 -04:00
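Illustration for fd21cebec7 above: the name transformation described in the message (`example` becoming `example[1]` when `per_alloc = true`) can be sketched as below. `perAllocName` is a hypothetical helper written for this example, not a function from the Nomad code base.

```go
package main

import "fmt"

// perAllocName is a hypothetical helper. When perAlloc is true it
// appends the allocation index as a bracketed suffix; otherwise it
// returns the name unchanged.
func perAllocName(name string, perAlloc bool, allocIndex uint) string {
	if !perAlloc {
		return name
	}
	return fmt.Sprintf("%s[%d]", name, allocIndex)
}

func main() {
	fmt.Println(perAllocName("example", true, 1))  // prints example[1]
	fmt.Println(perAllocName("example", false, 1)) // prints example
}
```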
Seth Hoenig
04f6b0aa43 Merge pull request #12579 from hashicorp/ci-missing-packages-oss
ci: ensure package coverage of test-core
2022-04-15 08:11:41 -05:00
Lars Lehtonen
f1ec08cb28 command/agent: check err before close (#12574) 2022-04-15 08:54:03 -04:00
Seth Hoenig
72a4677415 ci: ensure package coverage of test-core 2022-04-14 19:04:06 -05:00
Michael Schurter
19bac3caa8 docs: add plan for node rejected details and more (#12564)
- Moved federation docs to the bottom since *everyone* is potentially
  affected by the other sections on the page, but only users of
  federation are affected by it.
- Added section on the plan for node rejected bug since it is fairly
  easy to diagnose and removing affected nodes is a fairly reliable
  workaround.
- Mention 5s cliff for wait_for_index.
- Remove the lie that we do not have job status metrics! How old was
  that?!
- Reinforce the importance of monitoring basic system resources
2022-04-14 16:09:33 -07:00
Tim Gross
4ca980311c E2E: add debugging outputs for disconnected clients test (#12572)
This test has a failure that's happening only occasionally and not
very reproducibly. Print out the allocation status on test failure so
that we can do some post-mortem debugging of the test on nightly.
2022-04-14 17:03:57 -04:00
Tim Gross
33cc69cdda ui: remove beta tag from gutter menu for CSI (#12570) 2022-04-14 14:56:04 -04:00
Tim Gross
d2aab5d53d fix data race in dynamic plugin registry tests (#12554)
These tests have a data race where the test assertion is reading a
value that's being set in the `listenFunc` goroutines that are
subscribing to registry update events. Move the assertion into the
subscribing goroutine to remove the race. This bug was discovered
in #12098 but does not impact production Nomad code.
2022-04-14 14:55:56 -04:00
Seth Hoenig
6e0e423b98 Merge pull request #12543 from idrennanvmware/add-allocid-to-sidecar
Add alloc_id to sidecar bootstrap
2022-04-14 13:27:09 -05:00
Luiz Aoqui
6cb520cee0 ci: fix backport target branch pattern (#12571) 2022-04-14 14:12:41 -04:00
Seth Hoenig
f2ea1fab5a connect: prefix tag with nomad.; merge into envoy_stats_tags; update docs
This PR expands on the work done in #12543 to
- prefix the tag, so it is now "nomad.alloc_id" to be more consistent with Consul tags
- merge into pre-existing envoy_stats_tags fields
- update the upgrade guide docs
- update changelog
2022-04-14 12:52:52 -05:00
Ian Drennan
5ca35cf49d Add alloc_id to sidecar bootstrap 2022-04-14 11:46:06 -05:00
Michael Schurter
29af9891f8 test: test the buffered pipe used by nsd (#12563)
Nomad Service Discovery uses an in-memory buffered pipe implementation
to connect consul-template to the Nomad API.

This adds a basic test for that helper functionality.
2022-04-14 08:38:25 -07:00
James Rasell
281ce5ed21 jobspec: add max_client_disconnect to hcl1 group parsing. (#12568) 2022-04-14 14:56:58 +02:00
Derek Strickland
8f7abae89f Update E2E terraform output command (#12561) 2022-04-13 16:46:09 -04:00
James Rasell
281a0fb38e service discovery: add pagination and filtering support to info requests (#12552)
* services: add pagination and filter support to info RPC.
* cli: add filter flag to service info command.
* docs: add pagination and filter details to services info API.
* paginator: minor updates to comment and func signature.
2022-04-13 07:41:44 +02:00
claire labry
36c89f61bb updates for backport assistant (#12311) 2022-04-12 14:01:19 -04:00
Tim Gross
9d5b3bcc53 CSI: fix data race in plugin manager (#12553)
The plugin manager for CSI hands out instances of a plugin for callers
that need to mount a volume. The `MounterForPlugin` method accesses
the internal instances map without a lock, and can be called
concurrently from outside the plugin manager's main run-loop.

The original commit for the instances map included a warning that it
needed to be accessed only from the main loop but that comment was
unfortunately ignored shortly thereafter, so this bug has existed in
the code for a couple years without being detected until we ran tests
with `-race` in #12098. Lesson learned here: comments make for lousy
enforcement of invariants!
2022-04-12 12:18:04 -04:00
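Illustration for 9d5b3bcc53 above: a map read from multiple goroutines while another goroutine writes to it is a data race. One common way to enforce the invariant in code rather than in a comment is to guard the map with a mutex. The commit message does not say which fix was applied, so the sketch below is only an assumption about one possible approach; `registry`, `set`, and `get` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a hypothetical, simplified stand-in for a manager that
// tracks plugin instances in a map.
type registry struct {
	mu        sync.RWMutex // guards instances
	instances map[string]string
}

func newRegistry() *registry {
	return &registry{instances: make(map[string]string)}
}

// set stores an instance under the write lock.
func (r *registry) set(id, inst string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.instances[id] = inst
}

// get looks up an instance under the read lock, so it is safe to call
// concurrently with set.
func (r *registry) get(id string) (string, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	inst, ok := r.instances[id]
	return inst, ok
}

func main() {
	r := newRegistry()
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			id := fmt.Sprintf("plugin-%d", i)
			r.set(id, "instance")
			r.get(id)
		}(i)
	}
	wg.Wait()
	_, ok := r.get("plugin-3")
	fmt.Println(ok) // prints true
}
```

Running a program like this with `go run -race` is the same kind of check that surfaced the original bug; removing the lock calls makes the race detector report the concurrent map access.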
Luiz Aoqui
8dec033bd6 add some godocs for the API pagination tokenizer options (#12547) 2022-04-12 10:27:22 -04:00
Tim Gross
247e20e10b scripts: fix interpreter for bash (#12549)
Many of our scripts have a non-portable interpreter line for bash and
use bash-specific variables like `BASH_SOURCE`. Update the interpreter
line to be portable between various Linuxes and macOS without
complaint from posix shell users.
2022-04-12 10:08:21 -04:00
Tim Gross
86ca8f7e73 E2E: fix flaky event stream test (#12548)
This changeset fixes two sources of flakiness in the event stream test.

First, the stream request gets the event *closest* to the index, not
the exact match. Although events are written before raft entries
they're written asynchronously, so it's possible to race and get a
raft index from this query higher than the current head of the event
buffer. Ensure the job is running before we try to get the index, so
that we've given the event enough time to land in the buffer.

Second, the assertion that the found index is greater than the start
index is only true if the `PlanResult` event manages to land before we
do the second registration. Although it should now, with the first fix
above, it's not a correct assertion for what we're testing.
2022-04-12 08:35:39 -04:00
Luiz Aoqui
8bde164eaa ci: change notification channel to feed-nomad-releases (#12550) 2022-04-11 19:12:58 -04:00
claire labry
5a0a8f606f move nomad.service out of etc (#12541) 2022-04-11 18:26:10 -04:00
Seth Hoenig
24eb703e74 Merge pull request #12532 from greut/feat/remove-consul-lib
feat: remove dependency to consul/lib
2022-04-11 13:52:05 -05:00