Commit Graph

344 Commits

Author SHA1 Message Date
Drew Bailey
057241d7ae enables audit log on full-cluster (#9315) 2020-11-11 08:33:01 -05:00
Tim Gross
3854916a72 e2e: Windows provisioning improvements (#9246)
Small changes to the Windows 2016 Packer build for debuggability of
provisioning:

* improve verbosity of powershell error handling
* remove unused "tools" installation
* use ssh communicator for Packer to improve Packer build times and eliminate
  deprecated winrm remote access (unavailable from current macOS)
2020-11-09 13:29:40 -05:00
Drew Bailey
5718115938 append custom path to custom_config_files (#9289)
* append custom path to custom_config_files

* remove config_path variable
2020-11-06 11:16:13 -05:00
Tim Gross
d6255129a3 E2E: switch packer build files to HCL2 (#9219)
Build configuration files need comments, and JSON is also just the worst, isn't it?
Upgrade our E2E packer configs to use the new HCL2 syntax.
2020-10-29 10:03:39 -04:00
Tim Gross
0c14f9e610 e2e: provide precedence for version variables (#9216)
The `nomad_sha`, `nomad_version`, and `nomad_local_binary` variables for the
Nomad provisioning module assumed that only one would be set. By having the
override each other with an explicit precedence, it makes it easier to avoid
problems with Terraform's implicit variables behavior.

Set the expected default values in the `terraform.full.tfvars` to avoid
shadowing by any future changes to the `terraform.tfvars` file.

Update the Makefile to put the `-var` and `-var-file` in the correct order.
2020-10-29 09:15:22 -04:00
Tim Gross
fa812aa516 E2E: AMI software version bumps and cleanup (#9213)
* remove unused vault installation from Windows AMI
* match Windows and Linux Consul versions
* bump AMI base Nomad to current stable
2020-10-29 08:27:50 -04:00
Tim Gross
8a267735f1 e2e: set default version for dev cluster (#9208) 2020-10-28 16:50:20 -04:00
Tim Gross
ac49cb2127 e2e: reduce risk of flaky Ubuntu AMI build (#9207)
The base Ubuntu AMI modifies apt sources during cloud-init. But the Packer
build can potentially start the setup script before that work is done,
resulting in errors trying to install base system dependencies like
`dnsmasq`. Delay the setup long enough to lose the race with cloud-init.
2020-10-28 15:13:44 -04:00
Tim Gross
38e23b62a7 e2e: use more specific names for OS/distros (#9204)
We intend to expand the nightly E2E test to cover multiple distros and
platforms. Change the naming structure for "Linux client" to the more precise
"Ubuntu Bionic", and "Windows" to "Windows 2016" to make it easier to add new
targets without additional refactoring.
2020-10-28 12:58:00 -04:00
Tim Gross
595b45095d e2e: make dev cluster the default Terraform vars file (#9202)
Most of the time that a human is running the TF provisioning, they want the
"dev cluster" which is going to deploy an OSS sha, with fewer targets and
configuration alternatives. But the default `terraform.tfvars` is the nightly
E2E run. Because the nightly run is automated, there's no reason we can't have
it pick a non-default `terraform.full.tfvars` file and have the default be the
dev cluster.
2020-10-28 10:01:42 -04:00
Tim Gross
9e9dac4a2c Revert "e2e: fix destination of templates in VaultSecrets test (#9146)" (#9163)
This reverts commit 8aed53c177.
2020-10-23 09:01:25 -04:00
Tim Gross
8a90b7eb16 artifact/template: make destination path absolute inside taskdir (#9149)
Prior to Nomad 0.12.5, you could use `${NOMAD_SECRETS_DIR}/mysecret.txt` as
the `artifact.destination` and `template.destination` because we would always
append the destination to the task working directory. In the recent security
patch we treated the `destination` absolute path as valid if it didn't escape
the working directory, but this breaks backwards compatibility and
interpolation of `destination` fields.

This changeset partially reverts the behavior so that we always append the
destination, but we also perform the escape check on that new destination
after interpolation so the security hole is closed.

Also, ConsulTemplate test should exercise interpolation
2020-10-22 15:47:49 -04:00
Tim Gross
8aed53c177 e2e: fix destination of templates in VaultSecrets test (#9146)
The `$NOMAD_SECRETS_DIR` environment variable is rendered as `/secrets`, which
prior to the recent security patch would unintentionally escape the file
sandbox and get dropped in a directory named `/secrets` where the Nomad client
binary was running. The `VaultSecrets` test was accidentally relying on this
behavior and that causes the test to fail.
2020-10-22 13:00:08 -04:00
Tim Gross
d552b4e8d1 e2e: path fixes for local_binary uploads (#9137)
When uploading a local binary for provisioning, the location that we pass into
the provisioning script needs to be where we uploaded it to, not the source on
our laptop. Also, the null_resource for uploading needs to read in the private
key, not its path.
2020-10-21 10:20:22 -04:00
Drew Bailey
659943a18b adds two base event stream e2e tests (#9126)
* adds two base event stream e2e tests

test evaluation filter keys are included

* Apply suggestions from code review

Co-authored-by: Tim Gross <tgross@hashicorp.com>

* gc aftereach

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2020-10-20 08:26:21 -04:00
Tim Gross
2b11e52fe9 e2e: add reporting to flaky spread test (#9115)
The spread test is infrequently flaky and it's hard to extract what's actually
happening. If the test fails, dump all the allocation metrics so that we can
debug the behavior.
2020-10-16 11:01:07 -04:00
Tim Gross
3e827ccb43 e2e: fix flaky TaskEventsTest (#9114)
Assert that we get at least N task events, rather than exactly N. When a
task within an allocation dies, a sibling task can get an Allocation Unhealthy
event after it's also killed, even though it's not the origin of the event.
2020-10-16 10:22:40 -04:00
Tim Gross
c47815ae8e e2e: networking test job needs to outlast assert (#9113)
The `e2ejob` utility asserts that a job is running for 5s, but with a sleep
time of 5s, the networking job can race with that check. Sleeping for a longer
period should guarantee that we're running long enough to pass the assert.

Also constrains the job to Linux because our Windows test targets don't yet
support Docker (LCOW), and expand the set of DCs we can safely land on.
2020-10-16 10:13:16 -04:00
Chris Baker
38171b0dd6 Merge pull request #9089 from hashicorp/b-explicit-rune
fix go 1.15 pickiness
2020-10-14 10:37:36 -05:00
Tim Gross
b63d3ffc54 e2e: eliminate race condition causing rescheduling test flake (#9085)
The autorevert test checks for reverted allocations to be placed and running
before checking the deployment status, but the deployment can be completed and
marked "successful" before we check it for "running" status. Instead, just
wait for it to be marked "successful" and assert we have the expected count of
deployment statuses.
2020-10-14 11:35:30 -04:00
Tim Gross
554bdcc0ca e2e: use AMI filter for Ubuntu packer image (#9086)
Instead of hard-coding the base AMI for our Packer image for Ubuntu, use the
latest from Canonical so that we always have their current kernel patches.
2020-10-14 11:22:33 -04:00
Chris Baker
3245adb807 fix go 1.15 pickiness 2020-10-14 15:19:54 +00:00
Nick Ethier
5ddb88227e e2e/networking: use correct dc (#9088) 2020-10-14 11:14:09 -04:00
Tim Gross
b8cd187a41 e2e: add flag to opt-in to creating EBS/EFS volumes (#9082)
For everyday developer use, we don't need volumes for testing CSI. Providing a
flag to opt-in speeds up deploying dev clusters and slightly reduces infra costs.

Skip CSI test if missing volume specs.
2020-10-14 10:29:33 -04:00
Tim Gross
4314e81e78 E2E: vault secrets (#9081)
* rename vault API compatibility test for clarity
* exercise vault secrets lease renewal
2020-10-14 08:43:28 -04:00
Nick Ethier
756aa11654 client: add NetworkStatus to Allocation (#8657) 2020-10-12 13:43:04 -04:00
Yoan Blanc
c14c616194 use allow/deny instead of the colored alternatives (#9019)
Signed-off-by: Yoan Blanc <yoan@dosimple.ch>
2020-10-12 08:47:05 -04:00
Tim Gross
31747e9c44 e2e: extend ConsulTemplate test and fix flakiness (#8997)
Add service discovery integration to the existing consul-template E2E test,
and verify both service and key updates force re-rendering. Fixes flakiness by
using the longer default wait config we use elsewhere.

Removes our last direct dependency on gomega.
2020-10-05 10:51:55 -04:00
Tim Gross
5f87acf6cf e2e: bootstrap vault and provision Nomad with vault tokens (#9010)
Provisions vault with the policies described in the Nomad Vault integration
guide, and drops a configuration file for Nomad vault server configuration
with its token. The vault root token is exposed to the E2E runner so that
tests can write additional policies to vault.
2020-10-05 09:28:37 -04:00
Tim Gross
966d47da46 e2e: tfvars.dev file must override default tfvars file (#9005)
The `-var-file` flag for loading variables into Terraform overlays the default
variables file if present. This means that variables that are set in the
default variables file will take precedence if the overlay file does not have
them set.

Set `nomad_acls` and `nomad_enteprise` to `false` in the dev cluster.
2020-10-02 08:02:37 -04:00
Tim Gross
5621407cef e2e: ensure tests are constrained to Linux (#8990)
Until we have LCOW support in the E2E environment (which requires a Windows
2019 test target), we need to constrain E2E tests to the appropriate kernel
2020-09-30 09:43:30 -04:00
Tim Gross
8cf583fe48 e2e: cleanup errors should use assert, not require (#8989)
The E2E framework wraps testify's `require` so that by default we can stop
tests on errors, but the cleanup functions should use `assert` so that we
continue to try to cleanup the test environment even if there's a failure.
2020-09-30 09:00:37 -04:00
Tim Gross
dd0333c759 e2e: rework rescheduling progress deadline test (#8958)
Eliminate sources of randomness in the progress deadline test and clarify the
purpose of the test to check for progress deadline updates.
2020-09-29 11:02:16 -04:00
Tim Gross
c9d5048e2f e2e: namespace support for CLI helpers (#8978)
Required to support tests for namespaces and other ENT features.
2020-09-28 16:37:34 -04:00
Tim Gross
181f30d33e e2e: ENT placeholder for namespace/quotas tests (#8973) 2020-09-28 11:23:37 -04:00
Tim Gross
83a569dfc0 e2e: test for host volumes and Docker volumes (#8972)
Exercises host volume and Docker volume functionality for the `exec` and `docker`
task driver, particularly around mounting locations within the container and
how this can be used with `template`.
2020-09-28 11:14:13 -04:00
Tim Gross
7b00a118f5 e2e: add flag to bootstrap Nomad ACLs (#8961)
Adds a `nomad_acls` flag to our Terraform stack that bootstraps Nomad ACLs via
a `local-exec` provider. There's no way to set the `NOMAD_TOKEN` in the Nomad
TF provider if we're bootstrapping in the same Terraform stack, so instead of
using `resource.nomad_acl_token`, we also bootstrap a wide-open anonymous
policy. The resulting management token is exported as an environment var with
`$(terraform output environment)` and tests that want stricter ACLs will be
able to write them using that token.

This should also provide a basis to do similar work with Consul ACLs in the
future.
2020-09-28 09:22:36 -04:00
Tim Gross
932a340410 e2e: remove unused migrations test (#8955)
The areas of the code this test exercised were merged in with the node
drain tests.
2020-09-23 14:50:15 -04:00
Tim Gross
b491b4a7de e2e: use more recent instance type (#8954)
Newer EC2 instances are both cheaper and have generally better
performance.

The dnsmasq configuration had a hard-coded interface name, so in order to
accomodate instances with more recent networking that result in so-called
predictable interface names, the dnsmasq configuration needs to be replaced at
runtime with userdata to select the default interface.
2020-09-23 14:27:52 -04:00
Tim Gross
926cebce0e e2e: add flags for provisioning Nomad Enterprise (#8929) 2020-09-23 10:39:04 -04:00
Tim Gross
2c71452234 e2e: node drain tests (#8906)
Exercise the `nomad node drain` features, driving them via the new CLI helpers.
2020-09-21 11:52:11 -04:00
Tim Gross
e9f8ad737e e2e: reschedule tests should check for non-zero rescheduled allocs (#8927)
The conditional around some of the rescheduling tests was backwards, where we
were waiting for allocations to be rescheduled but testing for a count of
0. The test was passing but flaky because if the check happened quickly enough
before the scheduler rescheduled the allocations, it would pass.
2020-09-21 08:17:24 -04:00
Tim Gross
fb170f37a0 make sure dev-cluster has the option to run windows config (#8928) 2020-09-18 16:41:35 -04:00
Tim Gross
daf9ed3f9c e2e: remove unused framework provisioning code (#8908) 2020-09-18 11:46:47 -04:00
Tim Gross
6914be56f5 e2e: test script for Terraform logic (#8907) 2020-09-18 11:46:40 -04:00
Tim Gross
ad2ca7385c e2e: provision cluster entirely through Terraform (#8748)
Have Terraform run the target-specific `provision.sh`/`provision.ps1` script
rather than the test runner code which needs to be customized for each
distro. Use Terraform's detection of variable value changes so that we can
re-run the provisioning without having to re-install Nomad on those specific
hosts that need it changed.

Allow the configuration "profile" (well-known directory) to be set by a
Terraform variable. The default configurations are installed during Packer
build time, and symlinked into the live configuration directory by the
provision script. Detect changes in the file contents so that we only upload
custom configuration files that have changed between Terraform runs
2020-09-18 11:27:24 -04:00
Tim Gross
6a9f322a31 e2e: documentation and minor tweaks to configs (#8912)
* remove outdated references to envchain in documentation
* add new host volume locations in userdata
* don't exit the entire script during provisioning, just return
2020-09-17 09:20:18 -04:00
Tim Gross
2ec1eb4ec6 e2e: refactor CLI utils out of rescheduling test (#8905)
The CLI helpers in the rescheduling test were intended for shared use, but
until some other tests were written we didn't want to waste time making them
generic. This changeset refactors them and adds some new helpers associated
with the node drain tests (under separate PR).
2020-09-16 16:10:06 -04:00
Tim Gross
04ee35b13f e2e: constrain rescheduling test workloads to Linux (#8872)
The rescheduling test workloads were created before we had Windows targets in
the E2E nightly run. When these were recently ported to the e2e framework they
were missing the constraint to Linux machines.

Also added a little extra time to polling to avoid some flakiness on the first
run, and a minor readability adjustment to the job names.
2020-09-11 09:21:28 -04:00
Tim Gross
ec2f1ecaa1 Merge pull request #8860 E2E: rescheduling tests 2020-09-10 13:43:55 -04:00