Commit Graph

16 Commits

Author SHA1 Message Date
Tim Gross
072d3b6b74 cli: ensure -stale flag is respected by nomad operator debug (#11678)
When a cluster doesn't have a leader, the `nomad operator debug`
command can safely use stale queries to gracefully degrade the
consistency of almost all its queries. The query parameter for these
API calls was not being set by the command.

Some `api` package queries do not include `QueryOptions` because
they target a specific agent, but they can potentially be forwarded to
other agents. If there is no leader, these forwarded queries will
fail. Provide methods to call these APIs with `QueryOptions`.
2021-12-15 10:44:03 -05:00
Dave May
6ede4b9285 cli: refactor operator debug capture (#11466)
* debug: refactor Consul API collection
* debug: refactor Vault API collection
* debug: cleanup test timing
* debug: extend test to multiregion
* debug: save cmdline flags in bundle
* debug: add cli version to output
* Add changelog entry
2021-11-05 19:43:10 -04:00
Dave May
1d30caafad cli: rename paths in debug bundle for clarity (#11307)
* Rename folders to reflect purpose
* Improve captured files test coverage
* Rename CSI plugins output file
* Add changelog entry
* fix test and make changelog message more explicit

Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
2021-10-13 18:00:55 -04:00
Dave May
1bd132f09d debug: Improve namespace and region support (#11269)
* Include region and namespace in CLI output
* Add region and prefix matching for server members
* Add namespace and region API outputs to cluster metadata folder
* Add region awareness to WaitForClient helper function
* Add helper functions for SliceStringHasPrefix and StringHasPrefixInSlice
* Refactor test client agent generation
* Add tests for region
* Add changelog
2021-10-12 16:58:41 -04:00
Dave May
b430bafe90 Add remaining pprof profiles to nomad operator debug (#10748)
* Add remaining pprof profiles to debug dump
* Refactor pprof profile capture
* Add WaitForFilesUntil and WaitForResultUntil utility functions
* Add CHANGELOG entry
2021-06-21 14:22:49 -04:00
Dave May
83af6f5785 debug: update defaults to commonly used values 2021-03-09 08:31:38 -05:00
Dave May
d1648243f4 Handle Consul API URL protocol mismatch (#10082) 2021-02-25 08:22:44 -05:00
Dave May
be10568b0e Debug test refactor (#9637)
* debug: refactor test cases
* debug: remove unnecessary syncbuffer resets
* debug: cleaned up test code per suggestions
* debug: clarify note on parallel testing
2020-12-15 13:51:41 -05:00
Dave May
8038641f1b debug: Fix node count bug from GH-9566 (#9625)
* debug: update test to identify bug in GH-9566
* debug: range tests need fresh cmd each iteration
* debug: fix node count bug in GH-9566
2020-12-14 15:02:48 -05:00
Dave May
d757507341 fix AgentHostRequest panic found in GH-9546 (#9554)
* debug: refactor nodeclass test
* debug: add case to track down SIGSEGV on client to server Agent.Host RPC
* verify server to avoid panic on AgentHostRequest RPC call, fixes GH-9546
* simplify Agent.Host RPC lookup logic
2020-12-07 17:34:40 -05:00
Dave May
d8070e99b1 nomad operator debug - add pprof duration / csi details (#9346)
* debug: add pprof duration CLI argument
* debug: add CSI plugin details
* update help text with ACL requirements
* debug: provide ACL hints upon permission failures
* debug: only write file when pprof retrieve is successful
* debug: add helper function to clean bad characters from dynamic filenames
* debug: ensure files are unable to escape the capture directory
2020-12-01 12:36:05 -05:00
Dave May
205b0e7cae nomad operator debug - add client node filtering arguments (#9331)
* operator debug - add client node filtering arguments

* add WaitForClient helper function

* use RPC in WaitForClient to avoid unnecessary imports

* guard against nil values

* move initialization up and shorten test duration

* cleanup nodeLookupFailCount logic

* only display max node notice if we actually tried to capture nodes
2020-11-12 11:25:28 -05:00
Dave May
71a022ad8c Metrics gotemplate support, debug bundle features (#9067)
* add goroutine text profiles to nomad operator debug

* add server-id=all to nomad operator debug

* fix bug from changing metrics from string to []byte

* Add function to return MetricsSummary struct, metrics gotemplate support

* fix bug resolving 'server-id=all' when no servers are available

* add url to operator_debug tests

* removed test section which is used for future operator_debug.go changes

* separate metrics from operator, use only structs from go-metrics

* ensure parent directories are created as needed

* add suggested comments for text debug pprof

* move check down to where it is used

* add WaitForFiles helper function to wait for multiple files to exist

* compact metrics check

Co-authored-by: Drew Bailey <2614075+drewbailey@users.noreply.github.com>

* fix github's silly apply suggestion

Co-authored-by: Drew Bailey <2614075+drewbailey@users.noreply.github.com>
2020-10-14 15:16:10 -04:00
davemay99
e0e3a01404 update deprecated syntax per GH-9027 2020-10-06 09:47:16 -04:00
davemay99
bf8bdc94f8 Add metrics command / output to debug bundle 2020-10-05 22:30:01 -04:00
Lang Martin
b5ef217c90 nomad debug renamed to nomad operator debug (#8602)
* renamed: command/debug.go -> command/operator_debug.go
* website: rename debug -> operator debug
* website/pages/api-docs/agent: name in api docs
2020-08-11 15:39:44 -04:00