Commit Graph

4465 Commits

Author SHA1 Message Date
Tim Gross
81f7e53e0d docs: fix broken link in variables spec page (#17554) 2023-06-15 15:57:00 -04:00
Tim Gross
288ff2f0c4 docs: add missing client.allocs metrics (#17540)
The docs were missing counter metrics emitted by the task runner around task
state changes.
2023-06-15 09:18:11 -04:00
Tim Gross
eee2315d5d docs: clarify node pool apply/delete behavior (#17529) 2023-06-14 15:58:53 -04:00
Tim Gross
068d0ea9af node pools: add pool as label on client metrics (#17528)
This changeset adds the node pool as a label anywhere we're already emitting
labels with additional information such as node class or ID about the client.
2023-06-14 15:58:38 -04:00
Tim Gross
0ac85db680 cli: fix missing -quiet flag for var init (#17526)
The `var init` command was intended to have support for a `-quiet` flag but it
was not documented and never parsed.
2023-06-14 14:52:46 -04:00
Tim Gross
6bd1ebed29 docs: note namespace apply/delete behaviors, fix metric (#17527)
This changeset includes some fixes to documentation discovered while working on
node pools, but we didn't want to include in the node pool PRs so they can get
backported easily:

* namespace apply/delete commands are forwarded to the authoritative region
* deleting a namespace requires there are no non-terminal jobs in any of the
  federated regions
* fixed a typo in the name of the `nomad.client.allocated.disk` metric
2023-06-14 14:52:06 -04:00
Tim Gross
0aeeaf1083 node pools: implement node pool init command (#17479)
Implement a `nomad node pool init` command that generates an example spec file
in either HCL or JSON format.
2023-06-13 14:51:29 -04:00
Luiz Aoqui
5db9e64cdd node pool: node pool upsert on multiregion node register (#17503)
When registering a node with a new node pool in a non-authoritative
region we can't create the node pool because this new pool will not be
replicated to other regions.

This commit modifies the node registration logic to only allow automatic
node pool creation in the authoritative region.

In non-authoritative regions, the client is registered, but the node
pool is not created. The client is kept in the `initialing` status until
its node pool is created in the authoritative region and replicated to
the client's region.
2023-06-13 11:28:28 -04:00
Piotr Kazmierczak
be8f04e89f docs: corrections and additional information for OIDC-related concepts (#17470) 2023-06-09 16:50:22 +02:00
Piotr Kazmierczak
c1a9fe93ac docs: add missing login API endpoint documentation (#17467) 2023-06-09 15:59:01 +02:00
Tim Gross
9a6078a2ae node pools: implement support in scheduler (#17443)
Implement scheduler support for node pool:

* When a scheduler is invoked, we get a set of the ready nodes in the DCs that
  are allowed for that job. Extend the filter to include the node pool.
* Ensure that changes to a job's node pool are picked up as destructive
  allocation updates.
* Add `NodesInPool` as a metric to all reporting done by the scheduler.
* Add the node-in-pool the filter to the `Node.Register` RPC so that we don't
  generate spurious evals for nodes in the wrong pool.
2023-06-07 10:39:03 -04:00
Luiz Aoqui
354d741c95 node pool: implement nomad node pool nodes CLI (#17444) 2023-06-07 10:37:27 -04:00
Tim Gross
84e7cf39f6 node pools: implement CLI for node pool jobs command (#17432) 2023-06-06 15:02:26 -04:00
Tim Gross
385dbfb8d1 node pools: implement HTTP API to list jobs in pool (#17431)
Implements the HTTP API associated with the `NodePool.ListJobs` RPC, including
the `api` package for the public API and documentation.

Update the `NodePool.ListJobs` RPC to fix the missing handling of the special
"all" pool.
2023-06-06 11:40:13 -04:00
Luiz Aoqui
f0f4cbb848 node pools: list nodes in pool (#17413) 2023-06-06 10:43:43 -04:00
Luiz Aoqui
637ddf516e node pools: add event stream support (#17412) 2023-06-06 10:14:47 -04:00
Tim Gross
bb0140803e node pools: implement RPC to list jobs in a given node pool (#17396)
Implements the `NodePool.ListJobs` RPC, with pagination and filtering based on
the existing `Job.List` RPC.
2023-06-05 15:36:52 -04:00
KamilCuk
da9ec8ce1e Add group_add docker option (#17313) 2023-06-02 20:26:01 -04:00
Luiz Aoqui
81f0b359dd node pools: register a node in a node pool (#17405) 2023-06-02 17:50:50 -04:00
Luiz Aoqui
c09ca1e765 node pools: implement CLI (#17388) 2023-06-02 15:49:57 -04:00
Samantha
7ef1905333 check: Add support for Consul field tls_server_name (#17334) 2023-06-02 10:19:12 -04:00
Luiz Aoqui
9ee68fc02c node pool: add search support (#17385) 2023-06-01 17:48:14 -04:00
Tim Gross
2d059bbf22 node pools: add node_pool field to job spec (#17379)
This changeset only adds the `node_pool` field to the jobspec, and ensures that
it gets picked up correctly as a change. Without the rest of the implementation
landed yet, the field will be ignored.
2023-06-01 16:08:55 -04:00
Luiz Aoqui
970e998b00 node pools: add CRUD API (#17384) 2023-06-01 15:55:49 -04:00
Luiz Aoqui
2ae7cd384d np: implement ACL for node pools (#17365) 2023-06-01 13:03:20 -04:00
Seth Hoenig
9ff1d927d9 docs: fixup example of readiness check (#17296)
A "readiness" check implies a failing healthcheck will not cause the
deployment of a service to stop - i.e. it is only used as a liveness
probe in the context of service discoverability.

Fix our docs example to reflect that a readiness check is created by
setting on_update to "ignore" (as opposed to "ignore_warnings").
2023-05-23 15:29:10 -05:00
Tim Gross
bd59893956 build: remove 386 builds for Nomad 1.6.0 (#17239)
The 32-bit Intel builds (aka "386") are not tested and likely have bugs
involving platform-sized integers when operated at any non-trivial scale. Remove
these builds from the upcoming Nomad 1.6.0 and provide recommendations in the
upgrade notes for those users who might have hobbyist boards running 32-bit
ARM (this will primarily be the RaspberryPi Zero or older spins of the RaspPi).

DO NOT BACKPORT TO 1.5.x OR EARLIER!
2023-05-22 13:27:17 -04:00
Lance Haig
7e93f150b5 cli: tls certs not created with correct SANs (#16959)
The `nomad tls cert` command did not create certificates with the correct SANs for
them to work with non default domain and region names. This changset updates the
code to support non default domains and regions in the certificates.
2023-05-22 09:31:56 -04:00
Tim Gross
87ea64c583 document which fields can be updated by volume register (#17249)
The `volume register` command can update a small subset of the volume's fields
in-place, with some restrictions depending on whether the volume is currently in
use. Document these in the `volume register` command docs and the volume
specification docs.

Fixes: #17247
2023-05-22 09:15:25 -04:00
Tim Gross
2275a83cbf docs: describe the default Workload Identity ACL policy (#17245)
Workload Identities have an implicit default policy. This policy can't currently
be described via HCL because it includes task interpolation for Variables and
access to the Services API (which doesn't exist as its own ACL
capbility). Describe this in our WI documentation.

Fixes: #16277
2023-05-19 11:38:05 -04:00
Mike Nomitch
3db97acf8a docs: add documentation on ephemeral disk and logs (#15829) 2023-05-17 16:58:11 -04:00
Roman Zipp
22f5217b85 docs: remove unneeded brackets from job specification template docs (#17219) 2023-05-17 16:45:00 -04:00
dependabot[bot]
eafdbd4826 build(deps-dev): bump prettier from 2.2.1 to 2.8.8 in /website (#16965)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-05-16 12:12:53 -05:00
Tim Gross
bf7b82b52b drivers: make internal DisableLogCollection capability public (#17196)
The `DisableLogCollection` capability was introduced as an experimental
interface for the Docker driver in 0.10.4. The interface has been stable and
allowing third-party task drivers the same capability would be useful for those
drivers that don't need the additional overhead of logmon.

This PR only makes the capability public. It doesn't yet add it to the
configuration options for the other internal drivers.

Fixes: #14636 #15686
2023-05-16 09:16:03 -04:00
dependabot[bot]
a430abedf0 build(deps-dev): bump @hashicorp/platform-content-conformance (#17030)
Bumps @hashicorp/platform-content-conformance from 0.0.10 to 0.0.11.

---
updated-dependencies:
- dependency-name: "@hashicorp/platform-content-conformance"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-05-15 11:28:03 -04:00
dependabot[bot]
d706b8f1c9 build(deps-dev): bump next from 12.3.1 to 13.4.2 in /website (#17177)
Bumps [next](https://github.com/vercel/next.js) from 12.3.1 to 13.4.2.
- [Release notes](https://github.com/vercel/next.js/releases)
- [Changelog](https://github.com/vercel/next.js/blob/canary/release.js)
- [Commits](https://github.com/vercel/next.js/compare/v12.3.1...v13.4.2)

---
updated-dependencies:
- dependency-name: next
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-05-15 10:46:40 -04:00
Mark Lewis
e329a45b98 Update delete.mdx (#17184)
Fix typo
2023-05-15 13:31:52 +01:00
Tim Gross
6155ba3bcf docs: add note to upgrade guide about yanked version (#17115)
Nomad 1.5.4 shipped with a logmon bug that we rolled out a fix for in Nomad
1.5.5. Unfortunately we can't yank the release but we should leave a note in the
upgrade guide telling users to avoid it.
2023-05-08 13:28:45 -04:00
Tim Gross
3ee02ebc97 post release 1.5.5 (#17098)
* changelog entries for 1.5.5 and missing merge of changelog for 1.5.4, 1.4.9,
  and 1.3.14
* note on deprecation of `logs.enabled` field
2023-05-05 11:46:08 -04:00
Tim Gross
2aa3c746c4 logs: fix missing allocation logs after update to Nomad 1.5.4 (#17087)
When the server restarts for the upgrade, it loads the `structs.Job` from the
Raft snapshot/logs. The jobspec has long since been parsed, so none of the
guards around the default value are in play. The empty field value for `Enabled`
is the zero value, which is false.

This doesn't impact any running allocation because we don't replace running
allocations when either the client or server restart. But as soon as any
allocation gets rescheduled (ex. you drain all your clients during upgrades),
it'll be using the `structs.Job` that the server has, which has `Enabled =
false`, and logs will not be collected.

This changeset fixes the bug by adding a new field `Disabled` which defaults to
false (so that the zero value works), and deprecates the old field.

Fixes #17076
2023-05-04 16:01:18 -04:00
Seth Hoenig
a7a2a3ce56 docs: move CNI reference plugins installation to CNI overview page (#17068)
* docs: move CNI reference plugins installation to CNI overview page

This PR moves the instruction steps for install the CNI reference plugins
from the Consul Mesh integration page to the general Networking CNI page.

These plugins are required for bridge networking, not just Consul Mesh,
so it makes sense to have them on the general CNI page.

Closes #17038

* docs: fix a link to post install steps
2023-05-04 11:32:06 -05:00
James Rasell
06e877f26b docs: update artifact jobspec sshkey example path. (#17077) 2023-05-04 14:29:36 +01:00
Seth Hoenig
7744caed48 connect: use explicit docker.io prefix in default envoy image names (#17045)
This PR modifies references to the envoyproxy/envoy docker image to
explicitly include the docker.io prefix. This does not affect existing
users, but makes things easier for Podman users, who otherwise need to
specify the full name because Podman does not default to docker.io
2023-05-02 09:27:48 -05:00
Seth Hoenig
8919997896 docs: add more notes about artifact breaking changes in 1.5.0 (#17005)
* changelog: note artifact breaking changes for 1.5.0

* docs: add note about environment variables to artifact job spec docs

* Update website/content/docs/job-specification/artifact.mdx

Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>

---------

Co-authored-by: Luiz Aoqui <luiz@hashicorp.com>
2023-04-27 11:41:18 -05:00
James Rasell
3fc2054f85 docs: use appropriate file extension for autoscaler agent config. (#16993) 2023-04-27 15:00:28 +01:00
Tim Gross
30bc456f03 logs: allow disabling log collection in jobspec (#16962)
Some Nomad users ship application logs out-of-band via syslog. For these users
having `logmon` (and `docker_logger`) running is unnecessary overhead. Allow
disabling the logmon and pointing the task's stdout/stderr to /dev/null.

This changeset is the first of several incremental improvements to log
collection short of full-on logging plugins. The next step will likely be to
extend the internal-only task driver configuration so that cluster
administrators can turn off log collection for the entire driver.

---

Fixes: #11175

Co-authored-by: Thomas Weber <towe75@googlemail.com>
2023-04-24 10:00:27 -04:00
Tim Gross
47b28a6468 docs: fix keyring path in install docs (#16946) 2023-04-20 16:20:39 -04:00
Luiz Aoqui
e169847a64 docs: add missing field Capabilities to Namespace API (#16931) 2023-04-19 08:14:36 -07:00
Luiz Aoqui
ba364fc400 docs: add missing API field JobACL and fix workload identity headers (#16930) 2023-04-19 08:12:58 -07:00
Chris van Meer
c7c8bd4ce2 Updates to the UI block (#16328)
1. On the Consul address, following the recommendation for the HTTPS
   API on port 8501.
2. Add the hint to use HEX values for the colors.
2023-04-18 18:28:17 -07:00