1339 Commits

Author SHA1 Message Date
Michael Smithhisler
f2b761f17c disconnected: removes deprecated disconnect fields (#25284)
The group level fields stop_after_client_disconnect,
max_client_disconnect, and prevent_reschedule_on_lost were deprecated in
Nomad 1.8 and replaced by field in the disconnect block. This change
removes any logic related to those deprecated fields.

---------

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2025-03-05 14:46:02 -05:00
James Rasell
7268053174 vault: Remove legacy token based authentication workflow. (#25155)
The legacy workflow for Vault whereby servers were configured
using a token to provide authentication to the Vault API has now
been removed. This change also removes the workflow where servers
were responsible for deriving Vault tokens for Nomad clients.

The deprecated Vault config options used byi the Nomad agent have
all been removed except for "token" which is still in use by the
Vault Transit keyring implementation.

Job specification authors can no longer use the "vault.policies"
parameter and should instead use "vault.role" when not using the
default workload identity.

---------

Co-authored-by: Tim Gross <tgross@hashicorp.com>
Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>
2025-02-28 07:40:02 +00:00
Piotr Kazmierczak
58c6387323 stateful deployments: task group host volume claims API (#25114)
This PR introduces API endpoints /v1/volumes/claims/ and /v1/volumes/claim/:id
for listing and deleting task group host volume claims, respectively.
2025-02-25 15:51:59 +01:00
Piotr Kazmierczak
611452e1af stateful deployments: use TaskGroupVolumeClaim table to associate volume requests with volume IDs (#24993)
We introduce an alternative solution to the one presented in #24960 which is
based on the state store and not previous-next allocation tracking in the
reconciler. This new solution reduces cognitive complexity of the scheduler
code at the cost of slightly more boilerplate code, but also opens up new
possibilities in the future, e.g., allowing users to explicitly "un-stick"
volumes with workloads still running.

The diagram below illustrates the new logic:

     SetVolumes()                                               upsertAllocsImpl()          
     sets ns, job                             +-----------------checks if alloc requests    
     tg in the scheduler                      v                 sticky vols and consults    
            |                  +-----------------------+        state. If there is no claim,
            |                  | TaskGroupVolumeClaim: |        it creates one.             
            |                  | - namespace           |                                    
            |                  | - jobID               |                                    
            |                  | - tg name             |                                    
            |                  | - vol ID              |                                    
            v                  | uniquely identify vol |                                    
     hasVolumes()              +----+------------------+                                    
     consults the state             |           ^                                           
     and returns true               |           |               DeleteJobTxn()              
     if there's a match <-----------+           +---------------removes the claim from      
     or if there is no                                          the state                   
     previous claim                                                                         
|                             | |                                                      |    
+-----------------------------+ +------------------------------------------------------+    
                                                                                            
           scheduler                                  state store
2025-02-07 17:41:01 +01:00
Michael Smithhisler
47c14ddf28 remove remote task execution code (#24909) 2025-01-29 08:08:34 -05:00
Tim Gross
09eb473189 dynamic host volumes: set status unavailable on failed restore (#24962)
When a client restarts but can't restore a volume (ex. the plugin is now
missing), it's removed from the node fingerprint. So we won't allow future
scheduling of the volume, but we were not updating the volume state field to
report this reasoning to operators. Make debugging easier and the state field
more meaningful by setting the value to "unavailable".

Also, remove the unused "deleted" field. We did not implement soft deletes and
aren't planning on it for Nomad 1.10.0.

Ref: https://hashicorp.atlassian.net/browse/NET-11551
2025-01-27 16:35:53 -05:00
Michael Smithhisler
d621211108 auth: adds option to enable verbose logging during sso (#24892)
Co-authored-by: James Rasell <jrasell@users.noreply.github.com>
2025-01-23 11:40:01 -05:00
Piotr Kazmierczak
ebffcce378 stateful deployments: remove CSIVolumeIDs (#24908) 2025-01-21 17:00:55 +01:00
Tim Gross
96e539ee87 dynamic host volumes quotas (#24871)
Allow users to configure a host volumes quota in MB. This will be enforced at
the time of provisioning via create/register RPCs. This changeset is the CE
version of ENT/2114.

Ref: https://github.com/hashicorp/nomad-enterprise/pull/2114
Ref: https://hashicorp.atlassian.net/browse/NET-11549
2025-01-17 11:41:56 -05:00
Tim Gross
203a6533bb API: host volume access modes should match list in structs package (#24838)
We changed the list of access modes available for dynamic host volumes in #24705
but neglected to change them in the API package. Update the API package to
match.

Ref: https://github.com/hashicorp/nomad/pull/24705
2025-01-13 15:00:22 -05:00
Tim Gross
3a11a0b1e1 quotas: refactor storage limit specification (#24785)
In anticipation of having quotas for dynamic host volumes, we want the user
experience of the storage limits to feel integrated with the other resource
limits. This is currently prevented by reusing the `Resources` type instead of
having a specific type for `QuotaResources`.

Update the quota limit/usage types to use a `QuotaResources` that includes a new
storage resources quota block. The wire format for the two types are compatible
such that we can migrate the existing variables limit in the FSM.

Also fixes improper parallelism in the quota init test where we change working
directory to avoid file write conflicts but this breaks when multiple tests are
executed in the same process.

Ref: https://github.com/hashicorp/nomad-enterprise/pull/2096
2025-01-13 09:25:00 -05:00
Tim Gross
4a65b21aab dynamic host volumes: send register to client for fingerprint (#24802)
When we register a volume without a plugin, we need to send a client RPC so that
the node fingerprint can be updated. The registered volume also needs to be
written to client state so that we can restore the fingerprint after a restart.

Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
2025-01-08 16:58:58 -05:00
Piotr Kazmierczak
0906f788f0 keyring: warn if removing a key that was used for encrypting variables (#24766)
Adds an additional check in the Keyring.Delete RPC to make sure we're not
trying to delete a key that's been used to encrypt a variable. It also adds a
-force flag for the CLI/API to sidestep that check.
2025-01-07 10:15:02 +01:00
Piotr Kazmierczak
8cbb74786c stateful deployments: find feasible node for sticky host volumes (#24558)
This changeset implements node feasibility checks for sticky host volumes.
2024-12-19 09:25:55 -05:00
Piotr Kazmierczak
967addec48 stateful deployments: add corrections to API structs and methods (#24700)
This changeset includes changes accidentally left out from 24641.
2024-12-19 09:25:54 -05:00
Daniel Bennett
e76f5e0b4c dynamic host volumes: volume fingerprinting (#24613)
and expand the demo a bit
2024-12-19 09:25:54 -05:00
Daniel Bennett
5826e92671 dynamic host volumes: delete by single volume ID (#24606)
string instead of []string
2024-12-19 09:25:54 -05:00
Tim Gross
787fbbe671 sentinel: remove default scope for Sentinel apply command (#24601)
When we add a Sentinel scope for dynamic host volumes, having a default `-scope`
value for `sentinel apply` risks accidentally adding policies for volumes to the
job scope. This would immediately prevent any job from being submitted. Forcing
the administrator to pass a `-scope` will prevent accidental misuse.

Ref: https://github.com/hashicorp/nomad-enterprise/pull/2087
Ref: https://github.com/hashicorp/nomad/pull/24479
2024-12-19 09:25:54 -05:00
Tim Gross
d700538921 dynamic host volumes: Sentinel improvements for CLI (#24592)
The create/register volume RPCs support a policy override flag for
soft-mandatory Sentinel policies, but the CLI and Go API were missing support
for it.

Also add support for Sentinel warnings to the Go API and CLI.

Ref: https://github.com/hashicorp/nomad/pull/24479
2024-12-19 09:25:54 -05:00
Tim Gross
d1352b285d dynamic host volumes: Enterprise stubs and refactor API (#24545)
Most Nomad upsert RPCs accept a single object with the notable exception of
CSI. But in CSI we don't actually expose this to users except through the Go
API. It deeply complicates how we present errors to users, especially once
Sentinel policy enforcement enters the mix.

Refactor the `HostVolume.Create` and `HostVolume.Register` RPCs to take a single
volume instead of a slice of volumes.

Add a stub function for Enterprise policy enforcement. This requires splitting
out placement from the `createVolume` function so that we can ensure we've
completed placement before trying to enforce policy.

Ref: https://github.com/hashicorp/nomad/pull/24479
2024-12-19 09:25:54 -05:00
Tim Gross
926925ba16 dynamic host volumes: search endpoint (#24531)
Add support for dynamic host volumes to the search endpoint. Like many other
objects with UUID identifiers, we're not supporting fuzzy search here, just
prefix search on the fuzzy search endpoint.

Because the search endpoint only returns IDs, we need to seperate CSI volumes
and host volumes for it to be useful. The new context is called `"host_volumes"`
to disambiguate it from `"volumes"`. In future versions of Nomad we should
consider deprecating the `"volumes"` context in lieu of a `"csi_volumes"`
context.

Ref: https://github.com/hashicorp/nomad/pull/24479
2024-12-19 09:25:54 -05:00
Tim Gross
7c85176059 dynamic host volumes: basic CLI CRUD operations (#24382)
This changeset implements a first pass at the CLI for Dynamic Host Volumes.

Ref: https://hashicorp.atlassian.net/browse/NET-11549
2024-12-19 09:25:54 -05:00
Tim Gross
75b0202f7f api: don't copy previously parsed URL when setting new address (#24644)
In #16872 we added support for unix domain sockets, but this required mutating
the `Config` when parsing the address so as to remove the port number. In #23785
we fixed a bug where if the configuration was used across multiple clients that
mutation would happen multiple times and the address would be incorrectly
parsed.

When making `alloc log`, `alloc fs`, or `alloc exec` calls where we have
line-of-sight to the client, we attempt to make a HTTP API call directly to the
client node. So we create a new API client from the same configuration and then
set the address. But in this case we copy the private `url` field and that
causes the URL parsing to be skipped for the new client.

This results in the region always being set to the string literal
`"global"` (because of mTLS handling code introduced all the way back in
4d3b75d867), unless the user has set the region specifically. This fails with
an error "no path to region" when the cluster isn't non-global and requests are
sent to a non-leader.

Arguably the "right" way of fixing this would be for `ClientConfig` not to
change the API client's region to `"global"` in the first place, but as this is
a public API and extremely longstanding behavior, it could potentially be a
breaking change for some downstream consumers. Instead, we'll avoid copying the
private `url` field so that the new address is re-parsed.

Fixes: https://github.com/hashicorp/nomad/issues/24635
Fixes: https://github.com/hashicorp/nomad/issues/24609
Ref: https://github.com/hashicorp/nomad/pull/16872
Ref: https://github.com/hashicorp/nomad/pull/23785
Ref: 4d3b75d867
2024-12-16 11:05:29 -05:00
Juana De La Cuesta
c21dfdb17a [gh-476] Sanitise HCL variables before storing on job submission (#24423)
* func: User url rules to scape non alphanumeric values in hcl variables

* docs: add changelog

* func: unscape flags before returning

* use JSON.stringify instead of bespoke value quoting to handle in-value-multi-line cases

---------

Co-authored-by: Phil Renaud <phil@riotindustries.com>
2024-11-22 11:45:02 +01:00
Piotr Kazmierczak
f7847c6e5b state: remove TimeTable and rely on objects' modify times instead (#24112)
Core scheduler relies on a special table in the state store—the TimeTable—to
figure out which objects can be GC'd. The TimeTable correlates Raft indices
with objects insertion time, a solution we used before most of the objects we
store in the state contained timestamps. This introduced a bit of a memory
overhead and complexity, but most importantly meant that any GC threshold users
set greater than timeTableLimit = 72 * time.Hour was ignored. This PR removes
the TimeTable and relies on object timestamps to determine whether they could
be GCd or not.
2024-11-01 19:38:04 +01:00
Martijn Vegter
6236f354a5 consul: add support for service weight (#24186) 2024-10-25 11:21:38 -04:00
Michael Schurter
cbbe6bb389 docs: explain schedule state values (#24160)
* docs: explain schedule state values

GET /v1/client/allocation/:alloc_id/pause?task=:task_name is a tiny but
critical API for observability of tasks with a schedule. This PR
explains each of the values which might be returned.

* correct docstring

* add missing state and expand PUT docs

---------

Co-authored-by: Aimee Ukasick <aimee.ukasick@hashicorp.com>
2024-10-17 11:42:12 -07:00
Seth Hoenig
f1ce127524 jobspec: add a chown option to artifact block (#24157)
* jobspec: add a chown option to artifact block

This PR adds a boolean 'chown' field to the artifact block.

It indicates whether the Nomad client should chown the downloaded files
and directories to be owned by the task.user. This is useful for drivers
like raw_exec and exec2 which are subject to the host filesystem user
permissions structure. Before, these drivers might not be able to use or
manage the downloaded artifacts since they would be owned by the root
user on a typical Nomad client configuration.

* api: no need for pointer of chown field
2024-10-11 11:30:27 -05:00
Phil Renaud
e206993d49 Feature: Golden Versions (#24055)
* TaggedVersion information in structs, rather than job_endpoint (#23841)

* TaggedVersion information in structs, rather than job_endpoint

* Test for taggedVersion description length

* Some API plumbing

* Tag and Untag job versions (#23863)

* Tag and Untag at API level on down, but am I unblocking the wrong thing?

* Code and comment cleanup

* Unset methods generally now I stare long into the namespace abyss

* Namespace passes through with QueryOptions removed from a write requesting struct

* Comment and PR review cleanup

* Version back to VersionStr

* Generally consolidate unset logic into apply for version tagging

* Addressed some PR comments

* Auth check and RPC forwarding

* uint64 instead of pointer for job version after api layer and renamed copy

* job tag command split into apply and unset

* latest-version convenience handling moved to CLI command level

* CLI tests for tagging/untagging

* UI parts removed

* Add to job table when unsetting job tag on latest version

* Vestigial no more

* Compare versions by name and version number with the nomad history command (#23889)

* First pass at passing a tagname and/or diff version to plan/versions requests

* versions API now takes compare_to flags

* Job history command output can have tag names and descriptions

* compare_to to diff-tag and diff-version, plus adding flags to history command

* 0th version now shows a diff if a specific diff target is requested

* Addressing some PR comments

* Simplify the diff-appending part of jobVersions and hide None-type diffs from CLI

* Remove the diff-tag and diff-version parts of nomad job plan, with an eye toward making them a new top-level CLI command soon

* Version diff tests

* re-implement JobVersionByTagName

* Test mods and simplification

* Documentation for nomad job history additions

* Prevent pruning and reaping of TaggedVersion jobs (#23983)

tagged versions should not count against JobTrackedVersions
i.e. new job versions being inserted should not evict tagged versions

and GC should not delete a job if any of its versions are tagged

Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>

---------

Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>

* [ui] Version Tags on the job versions page (#24013)

* Timeline styles and their buttons modernized, and tags added

* styled but not yet functional version blocks

* Rough pass at edit/unedit UX

* Styles consolidated

* better UX around version tag crud, plus adapter and serializers

* Mirage and acceptance tests

* Modify percy to not show time-based things

---------

Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>

* Job revert command and API endpoint can take a string version tag name (#24059)

* Job revert command and API endpoint can take a string version tag name

* RevertOpts as a signature-modified alternative to Revert()

* job revert CLI test

* Version pointers in endpoint tests

* Dont copy over the tag when a job is reverted to a version with a tag

* Convert tag name to version number at CLI level

* Client method for version lookup by tag

* No longer double-declaring client

* [ui] Add tag filter to the job versions page (#24064)

* Rough pass at the UI for version diff dropdown

* Cleanup and diff fetching via adapter method

* TaggedVersion now VersionTag (#24066)

---------

Co-authored-by: Daniel Bennett <dbennett@hashicorp.com>
2024-09-25 19:59:16 -04:00
Michael Smithhisler
6b6aa7cc26 identity: adds ability to specify custom filepath for saving workload identities (#24038) 2024-09-23 10:27:00 -04:00
Daniel Bennett
ec81e7c57c networking: add ignore_collision for static port{} (#23956)
so more than one copy of a program can run
at a time on the same port with SO_REUSEPORT.

requires host network mode.

some task drivers (like docker) may also need
config {
  network_mode = "host"
}
but this is not validated prior to placement.
2024-09-17 16:01:48 -05:00
Daniel Bennett
5e1fae2856 networking: set alloc NetworkStatus.AddressIPv6 (#23959)
when a CNI result includes an IPv6 address,
set it on the alloc's NetworkStatus for reference.

e.g.:

$ nomad alloc status -json 3dca | jq '.NetworkStatus'
{
  "Address": "172.26.64.14",
  "AddressIPv6": "fd00:a110:c8::b",
  "DNS": null,
  "InterfaceName": "eth0"
}
2024-09-16 10:21:52 -05:00
Tim Gross
a9beef7edd jobspec: remove HCL1 support (#23912)
This changeset removes support for parsing jobspecs via the long-deprecated
HCLv1.

Fixes: https://github.com/hashicorp/nomad/issues/20195
Ref: https://hashicorp.atlassian.net/browse/NET-10220
2024-09-05 09:02:45 -04:00
Piotr Kazmierczak
9265b384b3 quota: parse device block (#23866) 2024-08-28 18:45:12 +02:00
Seth Hoenig
8b093a6a5d scheduler: support for device - aware numa scheduling (#1760) (#23837)
(CE backport of ENT 59433d56c7215c0b8bf33764f41b57d9bd30160f (without ent files))

* scheduler: enhance numa aware scheduling with support for devices

* cr: add comments
2024-08-20 07:53:04 -05:00
Florian Apolloner
d6be784e2d namespaces: add allowed network modes to capabilities. (#23813) 2024-08-16 09:47:19 -04:00
Tim Gross
b7419bc940 api: only set url field in config if previously unset (#23785)
In #16872 we added support for configuring the API client with a unix domain
socket. In order to set the host correctly, we parse the address before mutating
the Address field in the configuration. But this prevents the configuration from
being reused across multiple clients, as the next time we parse the address it
will no longer be pointing to the socket. This breaks consumers like the
autoscaler, which reuse the API config between plugins.

Update the `NewClient` constructor to only override the `url` field if it hasn't
already been parsed. Include a test demonstrating safe reuse with a unix domain
socket.

Ref: https://github.com/hashicorp/nomad-autoscaler/issues/944
Ref: https://github.com/hashicorp/nomad-autoscaler/pull/945
2024-08-09 13:28:04 -04:00
Tim Gross
b25f1b66ce resources: allow job authors to configure size of secrets tmpfs (#23696)
On supported platforms, the secrets directory is a 1MiB tmpfs. But some tasks
need larger space for downloading large secrets. This is especially the case for
tasks using `templates`, which need extra room to write a temporary file to the
secrets directory that gets renamed to the old file atomically.

This changeset allows increasing the size of the tmpfs in the `resources`
block. Because this is a memory resource, we need to include it in the memory we
allocate for scheduling purposes. The task is already prevented from using more
memory in the tmpfs than the `resources.memory` field allows, but can bypass
that limit by writing to the tmpfs via `template` or `artifact` blocks.

Therefore, we need to account for the size of the tmpfs in the allocation
resources. Simply adding it to the memory needed when we create the allocation
allows it to be accounted for in all downstream consumers, and then we'll
subtract that amount from the memory resources just before configuring the task
driver.

For backwards compatibility, the default value of 1MiB is "free" and ignored by
the scheduler. Otherwise we'd be increasing the allocated resources for every
existing alloc, which could cause problems across upgrades. If a user explicitly
sets `resources.secrets = 1` it will no longer be free.

Fixes: https://github.com/hashicorp/nomad/issues/2481
Ref: https://hashicorp.atlassian.net/browse/NET-10070
2024-08-05 16:06:58 -04:00
Tim Gross
2f4353412d keyring: support prepublishing keys (#23577)
When a root key is rotated, the servers immediately start signing Workload
Identities with the new active key. But workloads may be using those WI tokens
to sign into external services, which may not have had time to fetch the new
public key and which might try to fetch new keys as needed.

Add support for prepublishing keys. Prepublished keys will be visible in the
JWKS endpoint but will not be used for signing or encryption until their
`PublishTime`. Update the periodic key rotation to prepublish keys at half the
`root_key_rotation_threshold` window, and promote prepublished keys to active
after the `PublishTime`.

This changeset also fixes two bugs in periodic root key rotation and garbage
collection, both of which can't be safely fixed without implementing
prepublishing:

* Periodic root key rotation would never happen because the default
  `root_key_rotation_threshold` of 720h exceeds the 72h maximum window of the FSM
  time table. We now compare the `CreateTime` against the wall clock time instead
  of the time table. (We expect to remove the time table in future work, ref
  https://github.com/hashicorp/nomad/issues/16359)
* Root key garbage collection could GC keys that were used to sign
  identities. We now wait until `root_key_rotation_threshold` +
  `root_key_gc_threshold` before GC'ing a key.
* When rekeying a root key, the core job did not mark the key as inactive after
  the rekey was complete.

Ref: https://hashicorp.atlassian.net/browse/NET-10398
Ref: https://hashicorp.atlassian.net/browse/NET-10280
Fixes: https://github.com/hashicorp/nomad/issues/19669
Fixes: https://github.com/hashicorp/nomad/issues/23528
Fixes: https://github.com/hashicorp/nomad/issues/19368
2024-07-19 13:29:41 -04:00
Martina Santangelo
661011f5de cni: allow users to set CNI args in job spec (#23538) 2024-07-12 11:47:15 -04:00
Piotr Kazmierczak
fa8ffedd74 api: handle newlines in JobSubmission vars correctly (#23560)
Fixes a bug where variable values in job submissions that contained newlines
weren't encoded correctly, and thus jobs that contained them couldn't be
resumed once stopped via the UI.

Internal ref: https://hashicorp.atlassian.net/browse/NET-9966
2024-07-12 08:04:27 +02:00
Tim Gross
cd3101d624 scale: add -check-index to job scale command (#23457)
The RPC handler for scaling a job passes flags to enforce the job modify index
is unchanged when it makes the write to Raft. But its only checking against the
existing job modify index at the time the RPC handler snapshots the state store,
so it can only enforce consistency for its own validation.

In clusters with automated scaling, it would be useful to expose the enforce
index options to the API, so that cluster admins can enforce that scaling only
happens when the job state is consistent with a state they've previously seen in
other API calls. Add this option to the CLI and API and have the RPC handler
check them if asked.

Fixes: https://github.com/hashicorp/nomad/issues/23444
2024-06-27 16:54:06 -04:00
Daniel Bennett
cfeedd05e8 api: use the task in Allocations.GetPauseState (#23377) 2024-06-18 12:31:12 -05:00
Tim Gross
fa70267787 scheduler: RescheduleTracker dropped if follow-up fails placements (#12319)
When an allocation fails it triggers an evaluation. The evaluation is processed
and the scheduler sees it needs to reschedule, which triggers a follow-up
eval. The follow-up eval creates a plan to `(stop 1) (place 1)`. The replacement
alloc has a `RescheduleTracker` (or gets its `RescheduleTracker` updated).

But in the case where the follow-up eval can't place all allocs (there aren't
enough resources), it can create a partial plan to `(stop 1) (place 0)`. It then
creates a blocked eval. The plan applier stops the failed alloc. Then when the
blocked eval is processed, the job is missing an allocation, so the scheduler
creates a new allocation. This allocation is _not_ a replacement from the
perspective of the scheduler, so it's not handed off a `RescheduleTracker`.

This changeset fixes this by annotating the reschedule tracker whenever the
scheduler can't place a replacement allocation. We check this annotation for
allocations that have the `stop` desired status when filtering out allocations
to pass to the reschedule tracker. I've also included tests that cover this case
and expands coverage of the relevant area of the code.

Fixes: https://github.com/hashicorp/nomad/issues/12147
Fixes: https://github.com/hashicorp/nomad/issues/17072
2024-06-10 11:15:40 -04:00
nicoche
ffcb72bfe3 api: Add Notes field to service checks (#22397)
Co-authored-by: Piotr Kazmierczak <470696+pkazmierczak@users.noreply.github.com>
2024-06-10 16:59:49 +02:00
Daniel Bennett
4415fabe7d jobspec: time based task execution (#22201)
this is the CE side of an Enterprise-only feature.
a job trying to use this in CE will fail to validate.

to enable daily-scheduled execution entirely client-side,
a job may now contain:

task "name" {
  schedule {
    cron {
      start    = "0 12 * * * *" # may not include "," or "/"
      end      = "0 16"         # partial cron, with only {minute} {hour}
      timezone = "EST"          # anything in your tzdb
    }
  }
...

and everything about the allocation will be placed as usual,
but if outside the specified schedule, the taskrunner will block
on the client, waiting on the schedule start, before proceeding
with the task driver execution, etc.

this includes a taksrunner hook, which watches for the end of
the schedule, at which point it will kill the task.

then, restarts-allowing, a new task will start and again block
waiting for start, and so on.

this also includes all the plumbing required to pipe API calls
through from command->api->agent->server->client, so that
tasks can be force-run, force-paused, or resume the schedule
on demand.
2024-05-22 15:40:25 -05:00
Phil Renaud
e8b77fcfa0 [ui] Jobspec UI block: Descriptions and Links (#18292)
* Hacky but shows links and desc

* markdown

* Small pre-test cleanup

* Test for UI description and link rendering

* JSON jobspec docs and variable example job get UI block

* Jobspec documentation for UI block

* Description and links moved into the Title component and made into Helios components

* Marked version upgrade

* Allow links without a description and max description to 1000 chars

* Node 18 for setup-js

* markdown sanitization

* Ui to UI and docs change

* Canonicalize, copy and diff for job.ui

* UI block added to testJob for structs testing

* diff test

* Remove redundant reset

* For readability, changing the receiving pointer of copied job variables

* TestUI endpiont conversion tests

* -require +must

* Nil check on Links

* JobUIConfig.Links as pointer

---------

Co-authored-by: Tim Gross <tgross@hashicorp.com>
2024-05-22 15:00:45 -04:00
Deniz Onur Duzgun
1cc99cc1b4 bug: resolve type conversion alerts (#20553) 2024-05-15 13:22:10 -04:00
Tim Gross
c9fd93c772 connect: support volume_mount blocks for sidecar task overrides (#20575)
Users can override the default sidecar task for Connect workloads. This sidecar
task might need access to certificate stores on the host. Allow adding the
`volume_mount` block to the sidecar task override.

Also fixes a bug where `volume_mount` blocks would not appear in plan diff
outputs.

Fixes: https://github.com/hashicorp/nomad/issues/19786
2024-05-14 12:49:37 -04:00
Daniel Bennett
cf87a556b3 api: new /v1/jobs/statuses endpoint for /ui/jobs page (#20130)
introduce a new API /v1/jobs/statuses, primarily for use in the UI,
which collates info about jobs, their allocations, and latest deployment.

currently the UI gets *all* of /v1/jobs and sorts and paginates them client-side
in the browser, and its "summary" column is based on historical summary data
(which can be visually misleading, and sometimes scary when a job has failed
at some point in the not-yet-garbage-collected past).

this does pagination and filtering and such, and returns jobs sorted by ModifyIndex,
so latest-changed jobs still come first. it pulls allocs and latest deployment
straight out of current state for more a more robust, holistic view of the job status.
it is less efficient per-job, due to the extra state lookups, but should be more efficient
per-page (excepting perhaps for job(s) with very-many allocs).

if a POST body is sent like `{"jobs": [{"namespace": "cool-ns", "id": "cool-job"}]}`,
then the response will be limited to that subset of jobs. the main goal here is to
prevent "jostling" the user in the UI when jobs come into and out of existence.

and if a blocking query is started with `?index=N`, then the query should only
unblock if jobs "on page" change, rather than any change to any of the state
tables being queried ("jobs", "allocs", and "deployment"), to save unnecessary
HTTP round trips.
2024-05-03 15:01:40 -05:00