
255 Commits

Author SHA1 Message Date
Drew Bailey
40441df3ed base podman e2e test and provisioning updates (#8104)
* initial setup for terraform to install podman task driver

podman

* Update e2e provisioning to support root podman

Excludes setup for rootless podman. Updates source AMI to Ubuntu 18.04.
Installs podman and configures podman varlink.

base podman test

ensure client status running

revert terraform directory changes

* back out random go-discover go mod change

* include podman varlink docs

* address comments
2020-06-03 14:06:58 -04:00
Seth Hoenig
051e387d0c build: use hashicorp hclfmt
We have been using fatih/hclfmt which is long abandoned. Instead, switch
to HashiCorp's own hclfmt implementation. There are some trivial changes in
behavior around whitespace.
2020-05-24 18:31:57 -05:00
Tim Gross
89972866d3 e2e: upgrade CNI to 0.8.6 (#7956) 2020-05-14 09:29:11 -04:00
Seth Hoenig
fd17d7edf5 e2e: upgrade consul in packer setup to 1.7.3 from 1.6.1
There have been a number of bug fixes and features particularly around
Connect that will help us in Nomad's e2e tests. Upgrade Consul in our
packer builder so e2e can make use of the new version.
2020-05-11 11:17:28 -06:00
Seth Hoenig
e54698ed1c e2e: set an expose service check in connect e2e testcase
Make sure exposed checks work in e2e by setting an expose
check on the e2e connect test.
2020-05-07 14:40:03 -06:00
Tim Gross
36e4b74de9 e2e: csi test can purge target job (#7823) 2020-05-01 13:25:50 -04:00
Tim Gross
407e02c723 e2e: add helper to Makefile for local file deployments (#7822) 2020-04-28 16:15:58 -04:00
Tim Gross
cafdcc9216 e2e: testing reliability (#7701)
* pin CSI plugin versions
* ensure failing CSI tests clean up
* allow NOMAD_SHA env var to override makefile
2020-04-13 10:25:24 -04:00
Mahmood Ali
71d0652a21 fixup! e2e: add a convenient creation script 2020-04-09 11:04:26 -04:00
Mahmood Ali
a2875eb963 e2e: add a convenient creation script
Add a convenience Makefile for creating an e2e environment for manual
debugging.
2020-04-09 10:54:30 -04:00
Lang Martin
e096820136 e2e: csi: wait for volume write claims to be released before starting read jobs (#7641) 2020-04-07 07:40:44 -04:00
Tim Gross
53b1272eff e2e: csi tests can only run on linux (#7635) 2020-04-06 11:57:59 -04:00
Tim Gross
f858d4e141 e2e/csi: add waiting for alloc stop 2020-04-06 10:15:55 -04:00
Tim Gross
2468b3853c e2e: improve test reliability for CSI (#7616)
This changeset:

* adds eval status to the error messages emitted when we have
  placement failure in tests. The implementation here isn't quite
  perfect but it's a lot better than "condition not met".
* enforces the ordering of teardown of the CSI test
* doesn't pass the purge flag to one of the two CSI tests, so that we
  exercise both code paths.
2020-04-03 15:52:58 -04:00
Tim Gross
17ffbde52c e2e: remove gometa from e2eutils (#7610) 2020-04-03 10:22:22 -04:00
Tim Gross
ab8c0e718d e2e: have TF write-out HCL for CSI volume registration (#7599) 2020-04-02 12:16:43 -04:00
Seth Hoenig
92e75ad622 e2e: minimize Consul ACL policies used in e2e tests
Issue #7523 documents the Consul ACLs used in each Consul interface
used by Nomad. Minimize the policies used in e2e tests so that we
are setting a good example.
2020-03-30 12:53:40 -06:00
Tim Gross
ccbc219609 csi: e2e tests for EBS and EFS plugins (#7343)
This changeset provides two basic e2e tests for CSI plugins targeting
common AWS use cases.

The EBS test launches the EBS plugin (controller + nodes) and registers
an EBS volume as a Nomad CSI volume. We deploy a job that writes to
the volume, stop that job, and reuse the volume for another job which
should be able to read the data written by the first job.

The EFS test launches the EFS plugin (nodes-only) and registers an EFS
volume as a Nomad CSI volume. We deploy a job that writes to the
volume, stop that job, and reuse the volume for another job which
should be able to read the data written by the first job.

The writer jobs mount the CSI volume at a location within the alloc
dir.
2020-03-23 13:59:18 -04:00
Mahmood Ali
c290a97069 e2e: use unique CSI token
Use a unique per-cluster EFS creation token, as described at https://www.terraform.io/docs/providers/aws/r/efs_file_system.html#creation_token.

Using a static value prevents having multiple test clusters.

[ci skip]
2020-03-15 21:55:26 -04:00
Tim Gross
951fb027a0 e2e: add EBS and EFS volumes for testing CSI (#7266)
This changeset adds volumes but does not mount them to instances so
that we can test the mounting ("staging") via CSI plugins. The CSI
plugins themselves will be installed as Nomad jobs.

In order to ensure we can always mount the EFS volume, this changeset
pins the deployment of the cluster to a specific subnet. In future
work we should spread the cluster out among several AZs and test that
behavior explicitly.
2020-03-04 10:44:51 -05:00
Mahmood Ali
88ab2c8d7a e2e: avoid parsing Args in pkg init
Golang 1.13 introduced a change in test flag parsing:

> testing
> ...
> Testing flags are now registered in the new Init function, which is invoked by the generated main function for the test. As a result, testing flags are now only registered when running a test binary, and packages that call flag.Parse during package initialization may cause tests to fail.

https://golang.org/doc/go1.13#testing

Here, we ensure that e2e framework flag parsing occurs in TestMain, by only
initializing Framework at Run invocation.
2020-03-02 14:13:54 -05:00
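The idea behind 88ab2c8d7a can be pictured with a minimal sketch. This is not the actual Nomad e2e framework code: the `Framework` type, the `env` flag, and `getFramework` are hypothetical names chosen for illustration. The point is only that registering a flag at package load is harmless, while `flag.Parse` is deferred until `Run` is first called.

```go
package main

import (
	"flag"
	"fmt"
)

// Framework is a hypothetical stand-in for an e2e framework that reads flags.
type Framework struct {
	env string
}

// Registering a flag at package load is fine; only flag.Parse has to wait.
// The "env" flag is an assumption made for this sketch.
var envFlag = flag.String("env", "local", "hypothetical environment name")

// pkgFramework stays nil until the first call to Run.
var pkgFramework *Framework

// getFramework builds the framework lazily, so no flag parsing happens
// during package initialization.
func getFramework() *Framework {
	if pkgFramework == nil {
		if !flag.Parsed() {
			flag.Parse()
		}
		pkgFramework = &Framework{env: *envFlag}
	}
	return pkgFramework
}

// Run is the entry point a TestMain (or a test) would call.
func Run() string {
	return getFramework().env
}

func main() {
	fmt.Println(Run())
}
```

In a real test binary, `Run` would be called from `TestMain` or from a test function, after the testing package has registered its own flags.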
Michael Schurter
eecc8600b2 test: explicitly pass vars vs enclosing them 2020-02-14 11:10:33 -08:00
Michael Schurter
69bee220cc test: remove errgroup to take advantage of vet
go vet would have prevented the bug fixed in
6362e32161 but our use of errgroup
prevented that.

Rip out errgroup to take advantage of vet, and remove download limiting
now that we're downloading far fewer binaries overall.
2020-02-14 10:53:54 -08:00
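The pattern that 69bee220cc and the neighbouring "capture url" and "explicitly pass vars" commits refer to can be sketched as follows. This is an illustrative example using plain goroutines and `sync.WaitGroup`, not the actual test code; `fetchAll` and its fake "download" are hypothetical. Loop variables are passed to the goroutine as arguments instead of being captured by the closure, so each goroutine works on its own copy.

```go
package main

import (
	"fmt"
	"sync"
)

// fetchAll pretends to download each URL concurrently. The loop variables
// are passed explicitly as arguments, so every goroutine gets its own copy
// rather than sharing the variables of the enclosing loop.
func fetchAll(urls []string) []string {
	results := make([]string, len(urls))
	var wg sync.WaitGroup
	for i, u := range urls {
		wg.Add(1)
		go func(i int, u string) {
			defer wg.Done()
			// Each goroutine writes to a distinct index, so no lock is needed.
			results[i] = "fetched " + u
		}(i, u)
	}
	wg.Wait()
	return results
}

func main() {
	for _, r := range fetchAll([]string{"a", "b", "c"}) {
		fmt.Println(r)
	}
}
```

Since Go 1.22 each loop iteration gets a fresh variable, so the capture bug no longer occurs on current toolchains, but passing the values explicitly remains valid and makes the data flow obvious.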
Michael Schurter
63791d645c test: sort vault tests by version 2020-02-14 10:33:17 -08:00
Michael Schurter
6362e32161 test: capture url to fix flaky test 2020-02-14 10:32:58 -08:00
Michael Schurter
e4832653b3 test: only test latest Z of each X.Y.Z release 2020-02-14 08:41:45 -08:00
Michael Schurter
3a01ad4892 Merge pull request #7102 from hashicorp/test-limits
Fix some race conditions and flaky tests
2020-02-13 10:19:11 -08:00
Michael Schurter
1ef1889c0f test: simplify code 2020-02-07 15:50:53 -08:00
Tim Gross
8d173664cb e2e: add --quiet flag to s3 copy to reduce log spam (#7085) 2020-02-06 09:24:20 -05:00
Seth Hoenig
729e0c20a5 Merge pull request #7071 from hashicorp/b-e2e-cacls-wait-longer
e2e: wait 2m rather than 10s after disabling consul acls
2020-02-04 14:05:10 -06:00
Drew Bailey
84cc906968 simplify job, better error 2020-02-04 13:59:39 -05:00
Drew Bailey
8bf5016880 fix check 2020-02-04 12:16:20 -05:00
Drew Bailey
39c9c20e88 rm unused field 2020-02-04 12:02:01 -05:00
Drew Bailey
3609e3adc1 clean up 2020-02-04 11:59:28 -05:00
Drew Bailey
5c2075e463 get test passing, new util func to wait for not pending 2020-02-04 11:56:37 -05:00
Drew Bailey
756f5c7d79 add e2e test for system sched ineligible nodes 2020-02-04 11:56:33 -05:00
Seth Hoenig
0f2d9ea915 e2e: wait 2m rather than 10s after disabling consul acls
Pretty sure Consul / Nomad clients are often not ready yet after
the ConsulACLs test disables ACLs, by the time the next test starts
running.

Running locally things tend to work, but in TeamCity this seems to
be a recurring problem. However, when running locally I do sometimes
see in the "show status" step after disabling ACLs that some nodes are
still initializing, suggesting we're right on the border of not waiting
long enough:

    nomad node status
    ID        DC   Name              Class   Drain  Eligibility  Status
    0e4dfce2  dc1  EC2AMAZ-JB3NF9P   <none>  false  eligible     ready
    6b90aa06  dc2  ip-172-31-16-225  <none>  false  eligible     ready
    7068558a  dc2  ip-172-31-20-143  <none>  false  eligible     ready
    e0ae3c5c  dc1  ip-172-31-25-165  <none>  false  eligible     ready
    15b59ed6  dc1  ip-172-31-23-199  <none>  false  eligible     initializing

Going to try waiting a full 2 minutes after disabling ACLs, hopefully that
will help things Just Work. In the future, we should probably be parsing the
output of the status checks and actually confirming all nodes are ready.

Even better, maybe that's something shipyard will have built-in.
2020-02-04 10:51:03 -06:00
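The follow-up suggested in 0f2d9ea915, parsing the status output and confirming that all nodes are ready, might look roughly like the sketch below. It is only an illustration built on the sample output above: `notReady` is a hypothetical helper, and it assumes the first non-empty line is the header and that Status is the last whitespace-separated column.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// sample is the `nomad node status` output quoted in the commit message.
const sample = `ID        DC   Name              Class   Drain  Eligibility  Status
0e4dfce2  dc1  EC2AMAZ-JB3NF9P   <none>  false  eligible     ready
6b90aa06  dc2  ip-172-31-16-225  <none>  false  eligible     ready
7068558a  dc2  ip-172-31-20-143  <none>  false  eligible     ready
e0ae3c5c  dc1  ip-172-31-25-165  <none>  false  eligible     ready
15b59ed6  dc1  ip-172-31-23-199  <none>  false  eligible     initializing
`

// notReady returns the IDs of nodes whose Status column is not "ready".
// Assumption: the first non-empty line is the header row and the last
// field of every other row is the status.
func notReady(output string) []string {
	var ids []string
	sc := bufio.NewScanner(strings.NewReader(output))
	header := true
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) == 0 {
			continue // skip blank lines
		}
		if header {
			header = false // skip the column titles
			continue
		}
		if fields[len(fields)-1] != "ready" {
			ids = append(ids, fields[0])
		}
	}
	return ids
}

func main() {
	fmt.Println(notReady(sample))
}
```

A test could call such a helper in a retry loop and proceed only once it returns an empty slice, instead of sleeping for a fixed two minutes.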
Tim Gross
ed41d7b590 e2e: rename linux runner to avoid implicit build tag (#7070)
Go implicitly treats files ending with `_linux.go` as build tagged for
Linux only. This broke the e2e provisioning framework on macOS once we
tried importing it into the `e2e/consulacls` module.
2020-02-04 10:55:38 -05:00
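The naming rule behind ed41d7b590 can be approximated with a small sketch. This is a simplified illustration, not the Go toolchain's implementation: `impliedGOOS` is a hypothetical helper, the OS and architecture tables are a deliberately small subset, and the file names in `main` are made up.

```go
package main

import (
	"fmt"
	"strings"
)

// Small illustrative subsets; the real toolchain knows many more values.
var knownOS = map[string]bool{"linux": true, "darwin": true, "windows": true}
var knownArch = map[string]bool{"amd64": true, "arm64": true, "386": true}

// impliedGOOS returns the operating system a Go file name implicitly
// restricts the file to, or "" if the name implies no OS constraint.
// It handles the *_GOOS, *_GOOS_GOARCH, and optional _test suffix forms.
func impliedGOOS(filename string) string {
	name := strings.TrimSuffix(filename, ".go")
	name = strings.TrimSuffix(name, "_test")
	parts := strings.Split(name, "_")
	if len(parts) < 2 {
		return "" // no underscore-separated suffix, so no implicit tag
	}
	last := parts[len(parts)-1]
	if knownOS[last] {
		return last
	}
	if knownArch[last] && len(parts) >= 3 && knownOS[parts[len(parts)-2]] {
		return parts[len(parts)-2]
	}
	return ""
}

func main() {
	for _, f := range []string{"runner_linux.go", "linuxrunner.go", "runner_linux_amd64_test.go"} {
		fmt.Printf("%s -> %q\n", f, impliedGOOS(f))
	}
}
```

A file that must build on every platform therefore needs a name that does not end in one of these suffixes, which is why renaming the file was enough to fix the macOS build.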
Tim Gross
15a2acc741 e2e: improve provisioning defaults and documentation (#7062)
This changeset improves the ergonomics of running the Nomad e2e test
provisioning process by defaulting to a blank `nomad_sha` in the
Terraform configuration. By default, a user will now need to pass in
one of the Nomad version flags. But they won't have to manually edit
the `provisioning.json` file for the common case of deploying a
released version of Nomad, and won't need to put dummy values for
`nomad_sha`.

Includes general documentation improvements.
2020-02-04 10:37:00 -05:00
Seth Hoenig
a2ee80402d e2e: turn no-ACLs connect tests back on
Also clean up more missed debugging things >.>
2020-02-03 20:46:36 -06:00
Mahmood Ali
c7eb60bbac Merge pull request #7055 from hashicorp/r-dev-tweaks-20200203
Grab bag of dev tweaks
2020-02-03 14:25:06 -05:00
Mahmood Ali
2e0f98c97a run "make hclfmt" 2020-02-03 12:15:53 -05:00
Seth Hoenig
9ccaa92ba1 e2e: remove leftover debug println statement 2020-02-03 11:15:38 -06:00
Seth Hoenig
04b526662c e2e: setup consul ACLs a little more correctly 2020-01-31 19:06:11 -06:00
Seth Hoenig
b0e3acac37 e2e: remove redundant extra API call for getting allocs 2020-01-31 19:06:07 -06:00
Seth Hoenig
9fa02763ac e2e: agent token was only being set for server0 2020-01-31 19:06:03 -06:00
Seth Hoenig
8372bfbc8b e2e: use hclfmt on consul acls policy config files 2020-01-31 19:05:59 -06:00
Seth Hoenig
390a7f1c24 e2e: uncomment test case that is not broken 2020-01-31 19:05:55 -06:00
Seth Hoenig
d252bb4e80 e2e: do not use eventually when waiting for allocs
This test is causing panics. Unlike the other similar tests, this
one is using require.Eventually which is doing something bad, and
this change replaces it with a for-loop like the other tests.

Failure:

=== RUN   TestE2E/Connect
=== RUN   TestE2E/Connect/*connect.ConnectE2ETest
=== RUN   TestE2E/Connect/*connect.ConnectE2ETest/TestConnectDemo
=== RUN   TestE2E/Connect/*connect.ConnectE2ETest/TestMultiServiceConnect
=== RUN   TestE2E/Connect/*connect.ConnectClientStateE2ETest
panic: Fail in goroutine after TestE2E/Connect/*connect.ConnectE2ETest has completed

goroutine 38 [running]:
testing.(*common).Fail(0xc000656500)
	/opt/google/go/src/testing/testing.go:565 +0x11e
testing.(*common).Fail(0xc000656100)
	/opt/google/go/src/testing/testing.go:559 +0x96
testing.(*common).FailNow(0xc000656100)
	/opt/google/go/src/testing/testing.go:587 +0x2b
testing.(*common).Fatalf(0xc000656100, 0x1512f90, 0x10, 0xc000675f88, 0x1, 0x1)
	/opt/google/go/src/testing/testing.go:672 +0x91
github.com/hashicorp/nomad/e2e/connect.(*ConnectE2ETest).TestMultiServiceConnect.func1(0x0)
	/home/shoenig/go/src/github.com/hashicorp/nomad/e2e/connect/multi_service.go:72 +0x296
github.com/hashicorp/nomad/vendor/github.com/stretchr/testify/assert.Eventually.func1(0xc0004962a0, 0xc0002338f0)
	/home/shoenig/go/src/github.com/hashicorp/nomad/vendor/github.com/stretchr/testify/assert/assertions.go:1494 +0x27
created by github.com/hashicorp/nomad/vendor/github.com/stretchr/testify/assert.Eventually
	/home/shoenig/go/src/github.com/hashicorp/nomad/vendor/github.com/stretchr/testify/assert/assertions.go:1493 +0x272
FAIL	github.com/hashicorp/nomad/e2e	21.427s
2020-01-31 19:05:47 -06:00
Seth Hoenig
1c9500ab27 e2e: remove forgotten unused field from new struct 2020-01-31 19:05:41 -06:00