Files
nomad/.changelog/25850.txt
Allison Larson fd16f80b5a Only error on constraints if no allocs are running (#25850)
* Only error on constraints if no allocs are running

When running `nomad job run <JOB>` multiple times with constraints
defined, there should be no error as a result of filtering out nodes
that do not/have not ever satsified the constraints.

When running a systems job with constraint, any run after an initial
startup returns an exit(2) and a warning about unplaced allocations due
to constraints. An error that is not encountered on the initial run,
though the constraint stays the same.

This is because the node that satisfies the condition is already running
the allocation, and the placement is ignored. Another placement is
attempted, but the only node(s) left are the ones that do not satisfy
the constraint. Nomad views this case (no allocations that were
attempted to placed could be placed successfully) as an error, and
reports it as such. In reality, no allocations should be placed or
updated in this case, but it should not be treated as an error.

This change uses the `ignored` placements from diffSystemAlloc to attempt to
determine if the case encountered is an error (no ignored placements
means that nothing is already running, and is an error), or is not one
(an ignored placement means that the task is already running somewhere
on a node). It does this at the point where `failedTGAlloc` is
populated, so placement functionality isn't changed, just the field that
populates error.

There is functionality that should be preserved which (correctly)
notifies a user if a job is attempted that cannot be run on any node due
to the constraints filtering out all available nodes. This should still
behave as expected.

* Add changelog entry

* Handle in-place updates for constrained system jobs

* Update .changelog/25850.txt

Co-authored-by: Piotr Kazmierczak <470696+pkazmierczak@users.noreply.github.com>

* Remove conditionals

---------

Co-authored-by: Piotr Kazmierczak <470696+pkazmierczak@users.noreply.github.com>
2025-05-15 15:14:03 -07:00

4 lines
175 B
Plaintext

```release-note:bug
scheduler: Fixed a bug where planning or running a system job with constraints & previously running allocations would return a failed allocation error
```