Commit Graph

84 Commits

Author SHA1 Message Date
Michael Schurter
eeb1da8a2e test: update tests to properly use AllocDir
Also use t.TempDir when possible.
2021-10-19 10:49:07 -07:00
Dave May
1d30caafad cli: rename paths in debug bundle for clarity (#11307)
* Rename folders to reflect purpose
* Improve captured files test coverage
* Rename CSI plugins output file
* Add changelog entry
* fix test and make changelog message more explicit

Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
2021-10-13 18:00:55 -04:00
Dave May
1bd132f09d debug: Improve namespace and region support (#11269)
* Include region and namespace in CLI output
* Add region and prefix matching for server members
* Add namespace and region API outputs to cluster metadata folder
* Add region awareness to WaitForClient helper function
* Add helper functions for SliceStringHasPrefix and StringHasPrefixInSlice
* Refactor test client agent generation
* Add tests for region
* Add changelog
2021-10-12 16:58:41 -04:00
Dave May
b430bafe90 Add remaining pprof profiles to nomad operator debug (#10748)
* Add remaining pprof profiles to debug dump
* Refactor pprof profile capture
* Add WaitForFilesUntil and WaitForResultUntil utility functions
* Add CHANGELOG entry
2021-06-21 14:22:49 -04:00
Mahmood Ali
122a4cb844 tests: use standard library testing.TB
Glint pulled in an updated version of mitchellh/go-testing-interface
which broke some existing tests because the update added a Parallel()
method to testing.T. This switches to the standard library testing.TB
which doesn't have a Parallel() method.
2021-06-09 16:18:45 -07:00
Charlie Voiselle
d914990e5f Fixup uses of sanity (#10187)
* Fixup uses of `sanity`
* Remove unnecessary comments.

These checks are better explained by earlier comments about
the context of the test. Per @tgross, moved the tests together
to better reinforce the overall shared context.

* Update nomad/fsm_test.go
2021-03-16 18:05:08 -04:00
Dennis Schön
582d3b7092 use os.ErrDeadlineExceeded in tests 2020-12-07 10:40:28 -05:00
Dave May
205b0e7cae nomad operator debug - add client node filtering arguments (#9331)
* operator debug - add client node filtering arguments

* add WaitForClient helper function

* use RPC in WaitForClient to avoid unnecessary imports

* guard against nil values

* move initialization up and shorten test duration

* cleanup nodeLookupFailCount logic

* only display max node notice if we actually tried to capture nodes
2020-11-12 11:25:28 -05:00
Dave May
71a022ad8c Metrics gotemplate support, debug bundle features (#9067)
* add goroutine text profiles to nomad operator debug

* add server-id=all to nomad operator debug

* fix bug from changing metrics from string to []byte

* Add function to return MetricsSummary struct, metrics gotemplate support

* fix bug resolving 'server-id=all' when no servers are available

* add url to operator_debug tests

* removed test section which is used for future operator_debug.go changes

* separate metrics from operator, use only structs from go-metrics

* ensure parent directories are created as needed

* add suggested comments for text debug pprof

* move check down to where it is used

* add WaitForFiles helper function to wait for multiple files to exist

* compact metrics check

Co-authored-by: Drew Bailey <2614075+drewbailey@users.noreply.github.com>

* fix github's silly apply suggestion

Co-authored-by: Drew Bailey <2614075+drewbailey@users.noreply.github.com>
2020-10-14 15:16:10 -04:00
Mahmood Ali
bcc4ec910d gracefully shutdown test server 2020-05-27 08:59:06 -04:00
Mahmood Ali
96a0938b0e tests: wait until leadership loop finishes
Reverts d5c7d6e491 .

We actually need to forward the request to ensure that the leader is
properly configured and that establishedLeadership completes.
2020-03-06 14:41:59 -05:00
Mahmood Ali
d5c7d6e491 avoid forwarding leadership checks in tests
The tests only care if a test server recognizes the leader.
2020-03-02 13:47:43 -05:00
Michael Schurter
e3e1f5cb53 core: add limits to unauthorized connections
Introduce limits to prevent unauthorized users from exhausting all
ephemeral ports on agents:

 * `{https,rpc}_handshake_timeout`
 * `{http,rpc}_max_conns_per_client`

The handshake timeout closes connections that have not completed the TLS
handshake by the deadline (5s by default). For RPC connections this
timeout also separately applies to first byte being read so RPC
connections with TLS enabled have `rpc_handshake_time * 2` as their
deadline.

The connection limit per client prevents a single remote TCP peer from
exhausting all ephemeral ports. The default is 100, but can be lowered
to a minimum of 26. Since streaming RPC connections create a new TCP
connection (until MultiplexV2 is used), 20 connections are reserved for
Raft and non-streaming RPCs to prevent connection exhaustion due to
streaming RPCs.

All limits are configurable and may be disabled by setting them to `0`.

This also includes a fix that closes connections that attempt to create
TLS RPC connections recursively. While only users with valid mTLS
certificates could perform such an operation, it was added as a
safeguard to prevent programming errors before they could cause resource
exhaustion.
2020-01-30 10:38:25 -08:00
Seth Hoenig
94c60b4cfa tests: swap lib/freeport for tweaked helper/freeport
Copy the updated version of freeport (sdk/freeport), and tweak it for use
in Nomad tests. This means staying below port 10000 to avoid conflicts with
the lib/freeport that is still transitively used by the old version of
consul that we vendor. Also provide implementations to find ephemeral ports
of macOS and Windows environments.

Ports acquired through freeport are supposed to be returned to freeport,
which this change now also introduces. Many tests are modified to include
calls to a cleanup function for Server objects.

This should help quite a bit with some flakey tests, but not all of them.
Our port problems will not go away completely until we upgrade our vendor
version of consul. With Go modules, we'll probably do a 'replace' to swap
out other copies of freeport with the one now in 'nomad/helper/freeport'.
2019-12-09 08:37:32 -06:00
Mahmood Ali
7a38784244 acl: check ACL against object namespace
Fix a bug where a millicious user can access or manipulate an alloc in a
namespace they don't have access to.  The allocation endpoints perform
ACL checks against the request namespace, not the allocation namespace,
and performs the allocation lookup independently from namespaces.

Here, we check that the requested can access the alloc namespace
regardless of the declared request namespace.

Ideally, we'd enforce that the declared request namespace matches
the actual allocation namespace.  Unfortunately, we haven't documented
alloc endpoints as namespaced functions; we suspect starting to enforce
this will be very disruptive and inappropriate for a nomad point
release.  As such, we maintain current behavior that doesn't require
passing the proper namespace in request.  A future major release may
start enforcing checking declared namespace.
2019-10-08 12:59:22 -04:00
Mahmood Ali
b748d4d714 tests: attempt to fix TestAutopilot_CleanupStaleRaftServer
Also add a utility function for waiting for stable leadership
2019-09-04 08:49:33 -04:00
Mahmood Ali
75349d647e test helper for registering jobs with acl
Test helper that allows registration of jobs when ACL is activated.
2019-04-30 10:23:56 -04:00
Mahmood Ali
d6250ec0d6 tests: IsTravis() -> IsCI()
Replace IsTravis() references that is intended for more CI environments
rather than for Travis environment specifically.
2019-02-20 08:21:03 -05:00
Mahmood Ali
2af30fb441 tests: expect Docker on AppVeyor
Prepare to run docker on AppVeyor Windows environment
2019-02-20 07:41:47 -05:00
Alex Dadgar
95297c608c goimports 2019-01-22 15:44:31 -08:00
Danielle Tomlinson
3287c3f019 chore: General Cleanup 2019-01-17 18:43:14 +01:00
Michael Schurter
f40d243cbd Update testutil/vault.go
Co-Authored-By: dantoml <dani@tomlinson.io>
2019-01-17 18:43:14 +01:00
Danielle Tomlinson
d81e632157 testutil: Start vault in the same routine as waiting
This is a workaround for the windows process model.

Go os/exec does not pass the parent process handle to the child
processes STARTUPINFO struct, this means that unless we wait in
the _same_ execution context as Starting the process, the
handle will be lost, and we cannot kill it without regaining
a handle.

A better long term solution would be a higher level process
abstraction that uses windows.CreateProcess on windows.
2019-01-17 18:43:13 +01:00
Danielle Tomlinson
d33809b886 fingerprint: Limit vault shutdown waiting
When vault is installed through chocolatey, it also installs a shim that
will not pass kill signals to the child. This means the process will
never actually terminate, and we lose the process handle.

Here, rather than waiting forever, we timeout fast.
2019-01-17 18:43:13 +01:00
Mahmood Ali
cfb68457bc tests: WaitForRunning checks for pending only
WaitForRunning risks a race condition where the allocation succeeds and
completes before WaitForRunning is called (or while it is running).

Here, I made the behavior match the function documentation.

I considered making it stricter, but callers need to account for
allocation terminating immediately after WaitForRunning terminates
anyway.
2019-01-10 15:36:57 -05:00
Mahmood Ali
1b7b70f47e tests: enable and fix tests requiring mock driver 2019-01-10 10:10:11 -05:00
Michael Schurter
4d1a1ac5bb tests: test logs endpoint against pending task
Although the really exciting change is making WaitForRunning return the
allocations that it started. This should cut down test boilerplate
significantly.
2018-10-16 16:56:55 -07:00
Michael Schurter
334f2b496e tests: fix races caused by sharing a buffer
httptest.ResponseRecorder exposes a bytes.Buffer which we were reading
and writing concurrently to test streaming log APIs. This is a race, so
I wrapped the struct in a lock with some helpers.
2018-10-16 16:56:55 -07:00
Alex Dadgar
0ee63c43ea add a vault test matrix 2018-09-19 10:18:10 -07:00
Michael Schurter
c55d166712 client: set host name when migrating over tls
Not setting the host name led the Go HTTP client to expect a certificate
with a DNS-resolvable name. Since Nomad uses `${role}.${region}.nomad`
names ephemeral dir migrations were broken when TLS was enabled.

Added an e2e test to ensure this doesn't break again as it's very
difficult to test and the TLS configuration is very easy to get wrong.
2018-09-05 17:24:17 -07:00
Michael Schurter
29da24b77b test: build with mock_driver by default
`make release` and `make prerelease` set a `release` tag to disable
enabling the `mock_driver`
2018-04-18 14:45:33 -07:00
Alex Dadgar
69bb648b42 Merge pull request #4105 from hashicorp/b-flaky-deadline-tests
Fix flaky deadline tests
2018-04-03 17:21:38 -07:00
Alex Dadgar
4cfe3a5043 Fix flaky deadline tests 2018-04-03 16:51:57 -07:00
Oz Katz
3e743b5f2e Support custom Consul config for TestServer
Adds a Consul field to the TestServerConfig that allows passing in non-default values for e.g. consul address.
This will allow the TestServer to integrate with Consul's testutil/TestServer.
2018-04-04 01:40:11 +03:00
Alex Dadgar
3099ef05e2 Unmark drain when nodes hit their deadline and only batch/system left and add all job type integration test 2018-03-28 17:25:58 -07:00
Alex Dadgar
71a823b70b Remove fake advertise address and fix TestAPI_OperatorAutopilotServerHealth 2018-03-19 15:49:12 -07:00
Michael Schurter
60c98e4356 Skip QEMU graceful shutdown test except on Travis
Hopefully we can reuse the SkipSlow helper elsewhere.
2018-01-31 15:47:26 -08:00
Kyle Havlovitz
c2d0c11f9e Add autopilot functionality based on Consul's autopilot 2017-12-18 14:29:41 -08:00
Alex Dadgar
8accabcd87 move to consul freeport implementation 2017-10-23 16:51:40 -07:00
Alex Dadgar
e1b1465081 Standardize retrieving a free port into a helper package 2017-10-23 16:48:20 -07:00
Michael Schurter
04b8f8e7fc Remove structs import from api
Goes a step further and removes structs import from api's tests as well
by moving GenerateUUID to its own package.
2017-09-29 10:36:08 -07:00
Chelsea Holland Komlo
fedf2d3314 fix access for health check 2017-09-15 22:57:31 +00:00
Armon Dadgar
49f9f0e26b testutil: Allow enabling ACLs 2017-09-04 13:07:44 -07:00
Alex Dadgar
9415ef5b47 Use LookupPath to add the exe extension 2017-07-25 17:42:36 -07:00
Alex Dadgar
8c9234e319 Make test Vault pick random ports 2017-07-25 17:40:59 -07:00
Michael Schurter
55bd0c8df7 Use go-testing-interface instead of testing
This drops the testings stdlib pkg from our dependencies. Saves a
whopping 46kb on our binary (was really hoping for more of a win there),
but also avoids potential ugliness with how testing sets flags.
2017-07-25 15:35:19 -07:00
Alex Dadgar
08c2ba9bc6 Parallel client tests (#2890)
* alloc_runner

* Random tests

* parallel task_runner and no exec compatible check

* Parallel client

* Fail fast and use random ports

* Fix docker port mapping

* Make concurrent pull less timing dependant

* up parallel

* Fixes

* don't build chroots in parallel on travis

* Reduce parallelism on travis with lxc/rkt

* make java test app not run forever

* drop parallelism a little

* use docker ports that are out of the os's ephemeral port range

* Limit even more on travis

* rkt deadline
2017-07-22 19:04:36 -07:00
Alex Dadgar
a56f67b76d Parallel 2017-07-21 16:33:04 -07:00
Alex Dadgar
d72e86cbe3 Use a lock to increment the vault port 2017-05-16 13:25:07 -04:00
Michael Schurter
df30a110fb Create AssertUntil helper func 2017-04-06 17:05:09 -07:00