* e2e: add tests for using private registry with podman driver
This PR adds e2e tests that stands up a private docker registry
and has a podman tasks run a container from an image in that private
registry.
Tests
- user:password set in task config
- auth_soft_fail works for public images when auth is set in driver
- credentials helper is set in driver auth config
- config auth.json file is set in driver auth config
* packer: use nomad-driver-podman v0.5.0
* e2e: eliminate unnecessary chmod
Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
* cr: no need to install nomad twice
* cl: no need to install docker twice
---------
Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
the windows docker install script stopped working.
after trying various things to fix the script,
I opted instead for a base image that comes with
docker already installed.
error output during build was:
Installing Docker.
WARNING: Cannot find path 'C:\Users\Administrator\AppData\Local\Temp\DockerMsftProvider\DockerDefault_DockerSearchIndex.json' because it does not exist.
WARNING: Cannot bind argument to parameter 'downloadURL' because it is an empty string.
WARNING: The property 'AbsoluteUri' cannot be found on this object. Verify that the property exists.
WARNING: The property 'RequestMessage' cannot be found on this object. Verify that the property exists.
Failed to install Docker.
Install-Package : No match was found for the specified search criteria and package name 'docker'.
* e2e: cleanup podman installation in jammy image
The original steps were copied over from the bionic image and does a lot
of hoop jumping we do not need anymore.
For the moment just hard-code installing the v0.4.2 version of the driver,
but I may follow up and modify hc-install to support installing @latest
like go itself.
* use releases for hc-install
This changeset provides a matrix test of ACL enforcement across several
dimensions:
* anonymous vs bogus vs valid tokens
* permitted vs not permitted by policy
* request sent to server vs sent to client (and forwarded)
This PR configures
- server nodes with a systemd unit running the agent as the nomad service user
- client nodes with a root owned nomad data directory
Add an Elastic Network Interface (ENI) to each Linux host, on a secondary subnet
we have provisioned in each AZ. Revise security groups as follows:
* Split out client security groups from servers so that we can't have clients
accidentally accessing serf addresses or other unexpected cross-talk.
* Add new security groups for the secondary subnet that only allows
communication within the security group so we can exercise behaviors with
multiple IPs.
This changeset doesn't include any Nomad configuration changes needed to take
advantage of the extra network interface. I'll include those with testing for
PR #16217.
This PR modifies the disconnect helper job to run as root, which is necesary
for manipulating iptables as it does. Also re-organizes the final test logic
to wait for client re-connect before looking for the replacement (3rd) allocation
in case that client was needed to run the alloc (also giving the sheduler more
time to do its thing).
Skips the other 3 tests, which fail and I cannot yet figure out what is going on.
In order to add an E2E test to cover token expiration, the server
config has been updated to include a low minimum allowed TTL
value. For ease of reading, the max value is also set.
Our E2E test environment is deployed with mTLS, but it's impractical
for us to use mTLS in headless browsers for automated testing (or even
in manual testing). Provide certificates for proxying the web UI via
Nginx. This proxy uses client certs for proxying to the HTTP endpoint
and a self-signed cert for the browser-facing endpoint. We can accept
certificate errors in the automated tests we'll be adding in the next
step of this work.
While working on infrastructure for testing the UI in E2E, we needed
to upgrade the certificate provider. Performing a provider upgrade via
the TF `init -upgrade` brought in updates for the file and AWS
providers as well. These updates include deprecating the use of
`sensitive_content` fields, removing CA algorithm parameters that can
be inferred from keys, and removing the requirement to manually
specify AWS assume role parameters in the provider config if they're
available in the calling environment's AWS config file (as they are
via doormat or our E2E environment).
Many of our scripts have a non-portable interpreter line for bash and
use bash-specific variables like `BASH_SOURCE`. Update the interpreter
line to be portable between various Linuxes and macOS without
complaint from posix shell users.
Concurrent E2E runs can collide when provisioning policies on HCP
Consul and HCP Vault. Namespace these by the test run name, as we do
for most everything else.
Use HCP Consul and HCP Vault for the Consul and Vault clusters used in E2E testing. This has the following benefits:
* Without the need to support mTLS bootstrapping for Consul and Vault, we can simplify the mTLS configuration by leaning on Terraform instead of janky bash shell scripting.
* Vault bootstrapping is no longer required, so we can eliminate even more janky shell scripting
* Our E2E exercises HCP, which is important to us as an organization
* With the reduction in configurability, we can simplify the Terraform configuration and drop the complicated `provision.sh`/`provision.ps1` scripts we were using previously. We can template Nomad configuration files and upload them with the `file` provisioner.
* Packer builds for Linux and Windows become much simpler.
tl;dr way less janky shell scripting!
The `Metrics` suite uses prometheus to scrape Nomad metrics so that
we're testing the full user experience of extracting metrics from
Nomad. With the addition of mTLS, we need to make sure prometheus also
has mTLS configuration because the metrics endpoint is protected.
Update the Nomad client configuration and prometheus job to bind-mount
the client's certs into the task so that the job can use these certs
to scrape the server. This is a temporary solution that gets the job
passing; we should give the job its own certificates (issued by
Vault?) when we've done some of the infrastructure rework we'd like.
This allows us to spin up e2e clusters with mTLS configured for all HashiCorp services, i.e. Nomad, Consul, and Vault. Used it for testing #11089 .
mTLS is disabled by default. I have not updated Windows provisioning scripts yet - Windows also lacks ACL support from before. I intend to follow up for them in another round.
Ease spinning up a cluster, where binaries are fetched from arbitrary
urls. These could be CircleCI `build-binaries` job artifacts, or
presigned S3 urls.
Co-authored-by: Tim Gross <tgross@hashicorp.com>
Add a new driver capability: RemoteTasks.
When a task is run by a driver with RemoteTasks set, its TaskHandle will
be propagated to the server in its allocation's TaskState. If the task
is replaced due to a down node or draining, its TaskHandle will be
propagated to its replacement allocation.
This allows tasks to be scheduled in remote systems whose lifecycles are
disconnected from the Nomad node's lifecycle.
See https://github.com/hashicorp/nomad-driver-ecs for an example ECS
remote task driver.