Commit Graph

24783 Commits

Author SHA1 Message Date
Luiz Aoqui
13ee343853 core: remove unnecessary call to SetNodes and add DC downgrade test (#17655) 2023-06-22 13:26:14 -04:00
Jai
4da63e3ded ui: create node pool model (#17301)
Co-authored-by: Phil Renaud <phil@riotindustries.com>
Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
2023-06-22 13:11:44 -04:00
Luiz Aoqui
085a0b0883 np: check for license on RPC endpoints (#17656) 2023-06-22 12:52:20 -04:00
Luiz Aoqui
717e1567bb ci: set continue-on-error: true on test-ui (#17646)
Since the matrix exercises different test cases, it's better to allow
all partitions to completely run, even if one of them fails, so it's
easier to catch multiple test failures.
2023-06-22 11:31:49 -04:00
Tim Gross
deae9bb62e client: send node secret with every client-to-server RPC (#16799)
In Nomad 1.5.3 we fixed a security bug that allowed bypass of ACL checks if the
request came through a client node first. But this fix broke (knowingly) the
identification of many client-to-server RPCs. These will now be measured as if
they were anonymous. The reason for this is that many client-to-server RPCs do
not send the node secret and instead rely on the protection of mTLS.

This changeset ensures that the node secret is being sent with every
client-to-server RPC request. In a future version of Nomad we can add
enforcement on the server side, but this was left out of this changeset to
reduce risks to the safe upgrade path.

Sending the node secret as an auth token introduces a new problem during initial
introduction of a client. Clients send many RPCs concurrently with
`Node.Register`, but until the node is registered the node secret is unknown to
the server and will be rejected as invalid. This causes permission denied
errors.

To fix that, this changeset introduces a gate on having successfully made a
`Node.Register` RPC before any other RPCs can be sent (except for `Status.Ping`,
which we need earlier but which also ignores the error because that handler
doesn't do an authorization check). This ensures that we only send requests with
a node secret already known to the server. This also makes client startup a
little easier to reason about because we know `Node.Register` must succeed
first, and it should make for a good place to hook in future plans for secure
introduction of nodes. The tradeoff is that an existing client that has running
allocs will take slightly longer (a second or two) to transition to ready after
a restart, because the transition in `Node.UpdateStatus` is gated at the server
by first submitting `Node.UpdateAlloc` with client alloc updates.
2023-06-22 11:06:49 -04:00
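
The gate described in deae9bb62e can be pictured with a small Go sketch. This is not Nomad's client code: the `registrationGate` type, its methods, and the error value are hypothetical names chosen for illustration, and a real client would more likely block or retry than return an error. The only behavior taken from the commit message is that `Node.Register` and `Status.Ping` pass before registration and every other RPC waits for it.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errNotRegistered is a hypothetical error for RPCs attempted too early.
var errNotRegistered = errors.New("node not registered yet")

// registrationGate holds back RPCs until Node.Register has succeeded once.
// All names in this sketch are illustrative, not taken from Nomad.
type registrationGate struct {
	once       sync.Once
	registered chan struct{}
}

func newRegistrationGate() *registrationGate {
	return &registrationGate{registered: make(chan struct{})}
}

// markRegistered opens the gate. Calling it more than once is safe.
func (g *registrationGate) markRegistered() {
	g.once.Do(func() { close(g.registered) })
}

// allow reports whether the named RPC may be sent right now.
// Node.Register and Status.Ping bypass the gate, as the commit describes.
func (g *registrationGate) allow(method string) error {
	if method == "Node.Register" || method == "Status.Ping" {
		return nil
	}
	select {
	case <-g.registered:
		return nil
	default:
		return errNotRegistered
	}
}

func main() {
	g := newRegistrationGate()
	fmt.Println(g.allow("Node.UpdateAlloc")) // gated before registration
	g.markRegistered()
	fmt.Println(g.allow("Node.UpdateAlloc")) // allowed afterwards
}
```

Closing a channel once gives every waiting goroutine the same signal, which is why the sketch uses it rather than a boolean guarded by a mutex.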
Luiz Aoqui
28206f7210 ci: fix some flaky UI tests (#17648)
These tests would fail depending on the value of the seed used.
2023-06-22 10:51:07 -04:00
Tim Gross
b23fe72fb5 release pipeline: release workflow needs write permissions (#17669)
In #17103 we set read-only permissions on all the workflows. Unfortunately we
missed that the `release` workflow makes git commits and pushes them to the
repository, so it needs to have write permissions.
2023-06-22 10:40:45 -04:00
Luiz Aoqui
876117fadd ui: display mirage scenario in header label (#17649)
This information is useful when switching between different scenarios
for testing.
2023-06-22 10:38:17 -04:00
Seth Hoenig
33ac5ed1df client: do not disable memory swappiness if kernel does not support it (#17625)
* client: do not disable memory swappiness if kernel does not support it

This PR adds a workaround for very old Linux kernels which do not support
the memory swappiness interface file. Normally we write a "0" to the file
to explicitly disable swap. In the case the kernel does not support it,
give libcontainer a nil value so it does not write anything.

Fixes #17448

* client: detect swappiness by writing to the file

* fixup changelog

Co-authored-by: James Rasell <jrasell@users.noreply.github.com>
2023-06-22 09:36:31 -05:00
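
A rough Go sketch of the approach in 33ac5ed1df: probe the swappiness file by writing to it, and hand back a nil pointer when the write fails so that the caller writes nothing. The function name and the demo path are assumptions made for illustration. The real change passes the value to libcontainer, which this sketch does not model.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// swappinessValue returns a pointer to 0 when the file at path accepts a
// write, and nil when it does not (for example because the kernel does not
// expose the file). The name and signature are hypothetical.
func swappinessValue(path string) *uint64 {
	// Open without O_CREATE so a missing file counts as "unsupported".
	f, err := os.OpenFile(path, os.O_WRONLY, 0)
	if err != nil {
		return nil
	}
	defer f.Close()
	if _, err := f.WriteString("0"); err != nil {
		return nil
	}
	zero := uint64(0)
	return &zero
}

func main() {
	// A deliberately missing path, so running the example changes nothing.
	path := filepath.Join(os.TempDir(), "no-such-cgroup", "memory.swappiness")
	if v := swappinessValue(path); v == nil {
		fmt.Println("swappiness unsupported; leaving it unset")
	} else {
		fmt.Println("swappiness set to", *v)
	}
}
```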
Luiz Aoqui
df37f2d022 ui: add tooltips to the Topology labels (#17647)
Add tooltips to labels in nodes and datacenters for the Topology view
page to clarify what each value represents.
2023-06-22 10:33:42 -04:00
Luiz Aoqui
a29048b3e7 ui: remove redundant columns from child job table (#17645)
Namespace, job type, and priority are already available from the parent
job header, so displaying them in the table caused it to be too crowded.
2023-06-22 10:22:41 -04:00
James Rasell
b197a9ee26 variables: remove unused state store functions. (#17660) 2023-06-22 13:54:58 +01:00
James Rasell
c19253215b core: use faster concatenation for alloc name generation. (#17591) 2023-06-22 07:46:28 +01:00
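
For context on c19253215b, here is a hedged Go comparison of formatting a name with `fmt.Sprintf` versus plain concatenation. It assumes an alloc name of the form `<job>.<group>[<index>]`. Both function names are hypothetical and the commit's actual diff is not shown on this page.

```go
package main

import (
	"fmt"
	"strconv"
)

// allocNameSprintf builds the name with fmt.Sprintf, which has to parse the
// format string and box its arguments on every call.
func allocNameSprintf(job, group string, idx uint) string {
	return fmt.Sprintf("%s.%s[%d]", job, group, idx)
}

// allocNameConcat builds the same string with concatenation and strconv,
// avoiding the format-string machinery.
func allocNameConcat(job, group string, idx uint) string {
	return job + "." + group + "[" + strconv.FormatUint(uint64(idx), 10) + "]"
}

func main() {
	fmt.Println(allocNameSprintf("web", "frontend", 3))
	fmt.Println(allocNameConcat("web", "frontend", 3))
}
```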
Luiz Aoqui
f4c7182873 node pools: apply node pool scheduler configuration (#17598) 2023-06-21 20:31:50 -04:00
Phil Renaud
fe49f22247 Moves to the current LTS release of Node for our build and release workflows (#17639) 2023-06-21 15:17:24 -04:00
Phil Renaud
873acf04b9 [ui] General status for steady-state jobs (#17599)
* Degraded vs Healthy etc. status

* Standardize the look of a deploying status panel

* badge styles

* remove job.status from title component in favour of in-panel status

* Remove a redundant check

* re-attrd fail-deployment button considered
2023-06-21 11:57:28 -04:00
VishnuJin
102f73274b fingerprint: added windows os.build attribute to host fingerprint (#17576) 2023-06-21 10:53:50 -04:00
Michael Lange
04042c619d Merge pull request #17626 from hashicorp/f/ui-test-splitting
[UI, CI] Test splitting
2023-06-20 20:21:37 -07:00
Michael Lange
6c53c1e3d7 Tag the GHA run for percy to use
Percy uses this to stitch parallel test runs back together into a single
report.
2023-06-20 15:38:05 -07:00
Michael Lange
d5767accce Simplify workflows
After renovating everything, it's evident that the ember-exam
sub-workflow can be inlined without any pesky duplication.
2023-06-20 15:05:17 -07:00
Michael Lange
167f5bdfb2 Pipe secrets through to exam job 2023-06-20 14:49:57 -07:00
Michael Lange
7d80d0ed37 Rip out the xUnit test reporter
This was used to integrate with Circle CI's deeper test reporting
(failures, flakes, reporting). It's strictly vestigial now that we're on
GHA.
2023-06-20 14:49:56 -07:00
Michael Lange
7bbc51a854 Use a matrix strategy to run exam partitions
This will run partitions and parallel only after linting passes.
2023-06-20 13:51:45 -07:00
Michael Lange
ad2c8d5ab9 Move the ember exam workflow into its own reusable job
This will be called N times by the parent test-ui script.
2023-06-20 13:51:45 -07:00
Michael Lange
8a19375672 New generic exam:parallel yarn script
This is intended to be used like `yarn exam:parallel -- more --options`

This way a split and partition can be provided by CI without CI also
needing to deal with percy details.
2023-06-20 13:51:45 -07:00
Michael Lange
8e2cb4cf2b Merge pull request #17624 from hashicorp/b/ui-audit-workflow
[UI, CI] Bump the ember-test-audit workflow to node 18
2023-06-20 13:48:31 -07:00
Phil Renaud
59d5a4130f [ui] Keyboard shortcuts for Promote Canary and Fail Deployment (#17568) 2023-06-20 15:43:32 -04:00
Phil Renaud
ea8bf8e7cd Specific health_unknown getter that only looks at running allocs (#17566) 2023-06-20 15:03:46 -04:00
Michael Lange
5e4161f18b Bump the ember-test-audit workflow to node 18 2023-06-20 10:31:24 -07:00
Tim Gross
f73b454dee scheduler: tolerate having only one dynamic port available (#17619)
If the dynamic port range for a node is set so that the min is equal to the max,
there's only one port available and this passes config validation. But the
scheduler panics when it tries to pick a random port. Only add the randomness
when there's more than one to pick from.

Adds a test for the behavior but also adjusts the commentary on a couple of the
existing tests that made it seem like this case was already covered if you
didn't look too closely.

Fixes: #17585
2023-06-20 13:29:25 -04:00
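
The failure mode in f73b454dee is easy to reproduce in Go: `rand.Intn` panics when its argument is not positive, so a draw written as `rand.Intn(max-min)` panics when the range holds exactly one port. The sketch below uses a hypothetical `pickDynamicPort` helper, not the scheduler's own function, and skips the random draw in that case.

```go
package main

import (
	"fmt"
	"math/rand"
)

// pickDynamicPort returns a port from the inclusive range [min, max].
// The name and signature are illustrative only.
func pickDynamicPort(min, max int) (int, error) {
	if min > max {
		return 0, fmt.Errorf("invalid dynamic port range %d-%d", min, max)
	}
	if min == max {
		// Exactly one candidate: skip the random draw. A draw written as
		// rand.Intn(max-min) would panic here, because rand.Intn panics
		// when its argument is <= 0.
		return min, nil
	}
	return min + rand.Intn(max-min+1), nil
}

func main() {
	port, err := pickDynamicPort(20000, 20000)
	fmt.Println(port, err)
}
```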
Daniel Bennett
748aea1c61 e2e: fix windows client docker (#17572)
the windows docker install script stopped working.

after trying various things to fix the script,
I opted instead for a base image that comes with
docker already installed.

error output during build was:
  Installing Docker.
  WARNING: Cannot find path 'C:\Users\Administrator\AppData\Local\Temp\DockerMsftProvider\DockerDefault_DockerSearchIndex.json' because it does not exist.
  WARNING: Cannot bind argument to parameter 'downloadURL' because it is an empty string.
  WARNING: The property 'AbsoluteUri' cannot be found on this object. Verify that the property exists.
  WARNING: The property 'RequestMessage' cannot be found on this object. Verify that the property exists.
  Failed to install Docker.
  Install-Package : No match was found for the specified search criteria and package name 'docker'.
2023-06-20 10:17:16 -05:00
Luiz Aoqui
2520cb2f28 test: add MultiregionMinJob mock (#17614) 2023-06-20 10:57:02 -04:00
James Rasell
e4beb12883 state: move variables tests to use must library. (#17609) 2023-06-20 15:46:16 +01:00
James Rasell
9d9cad1686 state: remove vague scaling event schema todo item. (#17610) 2023-06-20 15:22:11 +01:00
Luiz Aoqui
4b707b588e chore: fix typo and copyright header (#17605) 2023-06-20 10:09:47 -04:00
Luiz Aoqui
354e4b2ef2 ci: run 'make check' as reusable workflow (#17600)
Some of the paths ignored by `test-core.yaml` need to be checked by
`make check`. The `checks.yaml` workflow runs on these paths and can also
be used as a reusable workflow.
2023-06-20 08:17:13 +01:00
hashicorp-copywrite[bot]
4e2d131d39 [COMPLIANCE] Add Copyright and License Headers (#17596)
Co-authored-by: hashicorp-copywrite[bot] <110428419+hashicorp-copywrite[bot]@users.noreply.github.com>
2023-06-19 12:23:28 -04:00
Phil Renaud
94cb129da5 [ui, deployments] Promote Canary and Unhealthy Allocations in the deployment status panel (#17547)
* A wild health status appears

* autoPromote notification conditions

* Legend fixes etc

* Acceptance tests for new canary alerts
2023-06-19 12:06:18 -04:00
Luiz Aoqui
6c64847e1b np: scheduler configuration updates (#17575)
* jobspec: rename node pool scheduler_configuration

In HCL specifications we usually call configuration blocks `config`
instead of `configuration`.

* np: add memory oversubscription config

* np: make scheduler config ENT
2023-06-19 11:41:46 -04:00
Dao Thanh Tung
e29ad68c58 terraform: fix syntax in Azure example due to deprecated tf resource arguments (#17497) 2023-06-19 11:26:14 +02:00
dependabot[bot]
219f5dd532 build(deps): bump github.com/stretchr/testify from 1.8.2 to 1.8.4 (#17584) 2023-06-19 08:21:45 +01:00
Bruce Lok
8953e78dc4 fix typo peers.json (#17538) 2023-06-19 07:56:51 +01:00
Michael Lange
a6c8a5621d Merge pull request #17573 from hashicorp/f/legacy-openssl
UI Dev Tools: Use the legacy openssl provider for backcompat
2023-06-17 10:24:53 -07:00
Michael Lange
51d72dbf41 Use the legacy openssl provider for backcompat
Node v18 uses a newer version of openssl than webpack 4 is compatible
with. This is the quickest fix.

The ideal fix would be to upgrade webpack to v5 but the state of Ember,
Storybook, and generally just JS dep management makes this not an
option.
2023-06-16 17:58:40 -07:00
Luiz Aoqui
80e1ad68ba cli: prevent panic if job node pool is nil (#17571)
If the `nomad` CLI is used to access a cluster running a version that
does not include node pools, the command will panic on a `nil` value when trying to
resolve the job's node pool.
2023-06-16 17:08:36 -04:00
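
A minimal Go sketch of the kind of guard 80e1ad68ba describes, assuming the job carries its node pool as an optional pointer field that older servers leave unset. The `job` struct, the `nodePoolLabel` helper, and the `<none>` placeholder are hypothetical and do not reflect the CLI's actual types or output.

```go
package main

import "fmt"

// job is a hypothetical stand-in for an API job whose optional fields are
// pointers. A server without node pools would leave NodePool nil.
type job struct {
	ID       *string
	NodePool *string
}

// nodePoolLabel returns a printable node pool name without dereferencing nil.
func nodePoolLabel(j *job) string {
	if j == nil || j.NodePool == nil {
		return "<none>"
	}
	return *j.NodePool
}

func main() {
	pool := "gpu"
	fmt.Println(nodePoolLabel(&job{NodePool: &pool}))
	fmt.Println(nodePoolLabel(&job{})) // response from an older server
}
```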
Luiz Aoqui
4f7c38b2a7 node pools: namespace integration (#17562)
Add structs and fields to support the Nomad Pools Governance Enterprise
feature of controlling node pool access via namespaces.

Nomad Enterprise allows users to specify a default node pool to be used
by jobs that don't specify one. In order to accomplish this, it's
necessary to distinguish between a job that explicitly uses the
`default` node pool and one that did not specify any.

If the `default` node pool is set during job canonicalization it's
impossible to do this, so this commit allows a job to have an empty node
pool value during registration but sets it to `default` at the admission
controller mutator.

In order to guarantee state consistency the state store validates that
the job node pool is set and exists before inserting it.
2023-06-16 16:30:22 -04:00
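
The ordering described in 4f7c38b2a7 (leave the node pool empty during canonicalization, fill it in at the admission mutator, validate before the state store insert) can be sketched in Go as follows. Every name here is hypothetical, and the `namespaceDefault` parameter is an assumption about how an Enterprise default might be supplied.

```go
package main

import (
	"errors"
	"fmt"
)

const defaultNodePool = "default"

// job is a simplified, hypothetical job with only the fields the sketch needs.
type job struct {
	Name     string
	NodePool string
}

// mutateNodePool plays the role of the admission mutator. It runs after
// canonicalization, so an empty NodePool still means "the user set nothing".
func mutateNodePool(j *job, namespaceDefault string) {
	if j.NodePool != "" {
		return // explicit choice, including an explicit "default"
	}
	if namespaceDefault != "" {
		j.NodePool = namespaceDefault
		return
	}
	j.NodePool = defaultNodePool
}

// validateForInsert mimics the state store check: the node pool must be set
// and must exist before the job is written.
func validateForInsert(j *job, known map[string]bool) error {
	if j.NodePool == "" {
		return errors.New("job node pool is not set")
	}
	if !known[j.NodePool] {
		return fmt.Errorf("node pool %q does not exist", j.NodePool)
	}
	return nil
}

func main() {
	known := map[string]bool{"default": true, "batch": true}
	j := &job{Name: "example"}
	mutateNodePool(j, "batch")
	fmt.Println(j.NodePool, validateForInsert(j, known))
}
```

If the default were filled in during canonicalization instead, the mutator could no longer tell an explicit `default` from an unset value, which is the distinction the commit message calls out.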
Tim Gross
6ea36f248e node pools: support node.pool constraint in scheduler (#17548)
Although most of the time jobs will be assigned to a single node pool, users may
want to set the node pool to "all" and then constrain to a subset of node
pools. Add support for setting a constraint like `${node.pool}`.
2023-06-16 13:31:46 -04:00
Seth Hoenig
f5fcaba1c7 e2e: modernize podman test suite (#17564)
Use the new style of e2e test for the podman suite ... which is all of
one test case that was skipped. Turn the case back on, and we will
add more tests in the near future.
2023-06-16 10:36:17 -05:00
Tim Gross
2d7bead0ad docs: node pool specification (#17553) 2023-06-16 10:37:47 -04:00
Seth Hoenig
6975409386 e2e: cleanup podman installation in jammy image (#17558)
* e2e: cleanup podman installation in jammy image

The original steps were copied over from the bionic image and do a lot
of hoop jumping we do not need anymore.

For the moment just hard-code installing the v0.4.2 version of the driver,
but I may follow up and modify hc-install to support installing @latest
like go itself.

* use releases for hc-install
2023-06-15 18:17:31 -05:00