Add an upgrade test workload for that continuously writes to a Nomad
Variable. In order to run this workload, we'll need to deploy a
Workload-Associated ACL policy. So this extends the `run_workloads` module to
allow for a "pre script" to be run before a given job is deployed. We can use
that as a model for other test workloads.
Ref: https://hashicorp.atlassian.net/browse/NET-12217
Add an upgrade test workload for Consul service mesh with transparent
proxy. Note this breaks from the "countdash" demo. The dashboard application
only can verify the backend is up by making a websocket connection, which we
can't do as a health check, and the health check it exposes for that purpose
only passes once the websocket connection has been made. So replace the
dashboard with a minimal nginx reverse proxy to the count-api instead.
Ref: https://hashicorp.atlassian.net/browse/NET-12217
Getting the CSI test to work with AWS EFS or EBS has proven to be awkward
because we're having to deal with external APIs with their own consistency
guarantees, as well as challenges around teardown. Make the CSI test entirely
self-contained by using a userland NFS server and the rocketduck CSI plugin.
Ref: https://hashicorp.atlassian.net/browse/NET-12217
Ref: https://gitlab.com/rocketduck/csi-plugin-nfs
Add an upgrade test workload for CSI with the AWS EFS plugin. In order to
validate this workload, we'll need to deploy the plugin job and then register a
volume with it. So this extends the `run_workloads` module to allow for "pre
scripts" and "post scripts" to be run before and after a given job has been
deployed. We can use that as a model for other test workloads.
Ref: https://hashicorp.atlassian.net/browse/NET-12217
We're using `set -eo pipefail` everywhere in the Enos scripts, several of the
scripts used for checking assertions didn't take advantage of pipefail in such a
way that we could avoid early exits from transient errors. This meant that if a
server was slightly late to come back up, we'd hit an error and exit the whole
script instead of polling as expected.
While fixing this, I've made a number of other improvements to the shell scripts:
* I've changed the design of the polling loops so that we're calling a function
that returns an exit code and sets `last_error` value, along with any global
variables required by downstream functions. This makes the loops more readable
by reducing the number of global variables, and helped identify some places
where we're exiting instead of returning into the loop.
* Using `shellcheck -s bash` I fixes some unused variables and undefined
variables that we were missing because they were only used on the error paths.
* func: add initial enos skeleton
* style: add headers
* func: change the variables input to a map of objects to simplify the workloads creation
* style: formating
* Add tests for servers and clients
* style: separate the tests in diferent scripts
* style: add missing headers
* func: add tests for allocs
* style: improve output
* func: add step to copy remote upgrade version
* style: hcl formatting
* fix: remove the terraform nomad provider
* fix: Add clean token to remove extra new line added in provision
* fix: Add clean token to remove extra new line added in provision
* fix: Add clean token to remove extra new line added in provision
* fix: add missing license headers
* style: hcl fmt
* style: rename variables and fix format
* func: remove the template step on the workloads module and chop the noamd token output on the provide module
* fix: correct the jobspec path on the workloads module
* fix: add missing variable definitions on job specs for workloads
* style: formatting
* fix: rename variable in health test