Commit Graph

21933 Commits

Author SHA1 Message Date
Michael Schurter
ba7694855d Merge pull request #11334 from hashicorp/f-chroot-skip-allocdir
client: never embed alloc_dir in chroot
2021-11-03 16:48:09 -07:00
Jamie Finnigan
2df000b5b9 Remove example associated with deprecated nomad-spark (#11441) 2021-11-03 16:44:26 -07:00
Mike Nomitch
0f3938b9ac Merge pull request #11440 from sara-gawlinski/patch-1
Update alert-banner.js
2021-11-03 16:18:09 -07:00
sara-gawlinski
23fce2efda Update alert-banner.js
Add banner for Nomad Packs for Pack contest
2021-11-03 17:15:28 -05:00
Kevin Wang
dd218ff6c2 chore: react-subnav (#11437) 2021-11-03 17:06:38 -04:00
Charlie Voiselle
6efa55e7aa Parse job > group > consul block in HCL1 (#11423) 2021-11-03 13:49:32 -04:00
Luiz Aoqui
0e3cd604d2 docs: update podman driver documentation (#11300) 2021-11-03 11:07:44 -04:00
Luiz Aoqui
89447fb42e Revert "Return SchedulerConfig instead of SchedulerConfigResponse struct (#10799)" (#11433) 2021-11-02 17:42:52 -04:00
James Rasell
394cf3ce46 Merge pull request #11425 from hashicorp/b-add-timeout-consul-docs
docs: document Consul timeout config parameter.
2021-11-02 15:28:34 +01:00
James Rasell
6fc22f423d Merge pull request #11430 from hashicorp/f-update-script-install-versions
scripts: install latest version of Consul and Vault.
2021-11-02 15:21:43 +01:00
James Rasell
bbf71af984 scripts: install latest version of Consul and Vault. 2021-11-02 13:37:03 +01:00
James Rasell
8ba1444828 Merge pull request #11411 from hashicorp/f-gh-11406
cli: add json and template flag opts to acl bootstrap command.
2021-11-02 09:48:25 +01:00
James Rasell
6daf5db3a9 docs: document Consul timeout config parameter. 2021-11-02 08:28:45 +01:00
Charlie Voiselle
4cfc6a0ed6 Making RPC Upgrade mode reloadable. (#11144)
- Making RPC Upgrade mode reloadable.
- Add suggestions from code review
- remove spurious comment
- switch to require(t,...) form for test.
- Add to changelog
2021-11-01 16:30:53 -04:00
Luiz Aoqui
ffa4d07c66 Allow using specific object ID on diff (#11400) 2021-11-01 15:16:31 -04:00
James Rasell
906d94c76f Merge pull request #11408 from VaultVulp/patch-1
Fix typo in documentation
2021-10-29 09:44:58 +02:00
James Rasell
9542b2e126 changelog: add entry for #11411. 2021-10-29 09:08:10 +02:00
James Rasell
35a6d76d4d docs: update acl bootstrap command to show json and template opts. 2021-10-29 09:01:58 +02:00
James Rasell
d02ecbddf1 cli: add json and template flag opts to acl bootstrap command. 2021-10-29 09:00:50 +02:00
Pavel Alimpiev
dac525cc69 Fix typo in documentation 2021-10-29 03:31:53 +03:00
James Rasell
94dec4baf8 Merge pull request #11338 from hashicorp/f-expose-nomad-consul-vagrant-linux
vagrantfile: expose Nomad and Consul APIs to local machine.
2021-10-28 14:13:16 +02:00
Dave May
f46b97b2df debug: update default node-id and docs (#11398)
* debug: default node-id to all
* debug: align cli help and website documentation
2021-10-27 13:43:56 -04:00
Mahmood Ali
3ce89b75f4 logging: Log the cause behind agent startup failure (#11353)
Log the failure error when the agent fails to start. Previously, the
agent startup failure error would be emitted to the command UI but not
logged. So it doesn't get emitted to syslog or `log_file` if they are
set, and it makes debugging much harder. Also, logging the error again
before exit makes the error more visible: previously, the operator
needed to scroll to the top to find the error.

On a sample failure, the output will look like:
```
==> WARNING: Bootstrap mode enabled! Potentially unsafe operation.
==> Loaded configuration from sample-configs/config-bad
==> Starting Nomad agent...
==> Error starting agent: setting up server node ID failed: mkdir /path-without-permission: read-only file system
    2021-10-20T14:38:51.179-0400 [WARN]  agent.plugin_loader: skipping external plugins since plugin_dir doesn't exist: plugin_dir=/path-without-permission/plugins
    2021-10-20T14:38:51.181-0400 [DEBUG] agent.plugin_loader.docker: using client connection initialized from environment: plugin_dir=/path-without-permission/plugins
    2021-10-20T14:38:51.181-0400 [DEBUG] agent.plugin_loader.docker: using client connection initialized from environment: plugin_dir=/path-without-permission/plugins
    2021-10-20T14:38:51.181-0400 [INFO]  agent: detected plugin: name=java type=driver plugin_version=0.1.0
    2021-10-20T14:38:51.181-0400 [INFO]  agent: detected plugin: name=docker type=driver plugin_version=0.1.0
    2021-10-20T14:38:51.181-0400 [INFO]  agent: detected plugin: name=mock_driver type=driver plugin_version=0.1.0
    2021-10-20T14:38:51.181-0400 [INFO]  agent: detected plugin: name=raw_exec type=driver plugin_version=0.1.0
    2021-10-20T14:38:51.181-0400 [INFO]  agent: detected plugin: name=exec type=driver plugin_version=0.1.0
    2021-10-20T14:38:51.181-0400 [INFO]  agent: detected plugin: name=qemu type=driver plugin_version=0.1.0
    2021-10-20T14:38:51.181-0400 [ERROR] agent: error starting agent: error="setting up server node ID failed: mkdir /path-without-permission: read-only file system"
```

This change adds the final `ERROR` message. It's easy to miss the `==>
Error starting agent` above.
2021-10-27 10:41:17 -07:00
Mike Nomitch
3025ae6087 Replaces accidental use of Vault with Nomad (#11355) 2021-10-27 08:35:31 -07:00
Mahmood Ali
54ca97fee8 vault: set JobID in Vault metadata (#11397)
Closes: #11395 .
2021-10-27 07:20:29 -07:00
Mahmood Ali
56a7cc61d0 scheduler: stop allocs in unrelated nodes (#11391)
The system scheduler should leave allocs on draining nodes as-is, but
stop allocs on nodes that are no longer part of the job
datacenters.

Previously, the scheduler did not make the distinction and left system
job allocs intact if they were already running.

I've added a failing test first, which you can see in https://app.circleci.com/jobs/github/hashicorp/nomad/179661 .

Fixes https://github.com/hashicorp/nomad/issues/11373
2021-10-27 07:04:13 -07:00
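The distinction described in 56a7cc61d0 above can be sketched as a small Go filter. The `Node` and `Alloc` types and the `allocsToStop` function are hypothetical simplifications, not the scheduler's real data structures, and checking the drain flag before the datacenter is an assumption of this sketch.

```go
package main

import "fmt"

// Node and Alloc are hypothetical, simplified stand-ins for scheduler types.
type Node struct {
	ID         string
	Datacenter string
	Draining   bool
}

type Alloc struct {
	ID     string
	NodeID string
}

// allocsToStop returns the IDs of allocs running on nodes whose datacenter
// is no longer listed by the job. Allocs on draining nodes are left as-is.
func allocsToStop(allocs []Alloc, nodes map[string]Node, jobDCs []string) []string {
	inJob := make(map[string]bool, len(jobDCs))
	for _, dc := range jobDCs {
		inJob[dc] = true
	}

	var stop []string
	for _, a := range allocs {
		node, ok := nodes[a.NodeID]
		if !ok {
			continue // unknown node: out of scope for this sketch
		}
		if node.Draining {
			continue // leave allocs on draining nodes untouched
		}
		if !inJob[node.Datacenter] {
			stop = append(stop, a.ID)
		}
	}
	return stop
}

func main() {
	nodes := map[string]Node{
		"n1": {ID: "n1", Datacenter: "dc1"},
		"n2": {ID: "n2", Datacenter: "dc1", Draining: true},
		"n3": {ID: "n3", Datacenter: "dc2"},
	}
	allocs := []Alloc{
		{ID: "a1", NodeID: "n1"},
		{ID: "a2", NodeID: "n2"},
		{ID: "a3", NodeID: "n3"},
	}
	// The job now targets only dc1, so only the alloc on n3 is stopped.
	fmt.Println(allocsToStop(allocs, nodes, []string{"dc1"})) // prints [a3]
}
```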
Mahmood Ali
5f6ad87ca8 Fix arm64 panics by updating google/snappy library to latest, 0.0.4 (#11396)
Pick up https://github.com/golang/snappy/pull/56, which fixes panics on arm64 architectures. TL;DR: Golang 1.16 changed the `memmove` implementation for arm64 to require additional CPU registers that snappy wasn't preserving in its assembly implementation.

Other projects have experienced this issue as well; searching for `encode_arm64.s:666` on your favorite search engine will reveal some. Vault updated the dependency earlier this August: https://github.com/hashicorp/vault/pull/12371 .

I believe this issue affects Nomad 1.2.x and 1.1.x. Nomad 1.0.x uses Golang 1.15 and isn't affected. However, backporting the change to 1.0.x should be harmless.

Fixed https://github.com/hashicorp/nomad/issues/11385 .
2021-10-27 06:39:16 -07:00
James Rasell
6c7cedf59b vagrantfile: expose Nomad and Consul APIs to local machine. 2021-10-27 12:15:37 +02:00
Luiz Aoqui
796a91b0be prevent active log from being overwritten when agent starts (#11386) 2021-10-26 20:57:07 -04:00
Luiz Aoqui
1fbe88fbd6 docs: add note and example of storing nomad job plan index to disk (#11377) 2021-10-26 20:25:22 -04:00
Charlie Voiselle
dce23e829f DOCS: Update Consul Connect to Consul service mesh (#11362)
* Update Consul Connect to Consul service mesh
* Apply suggestions from code review
2021-10-26 15:10:21 -04:00
Noel Quiles
8a35232704 website: Add Fathom analytics (#11276)
* Impl Fathom analytics

* Actually install fathom-client

* Use analytics package instead of direct impl

* Remove explicit fathom-client dep

* Upgrade platform analytics package
2021-10-25 15:23:38 -04:00
Luiz Aoqui
4c10d4740d ui: update task group alloc summary chart to use new SummaryLegendItem component (#11375) 2021-10-25 11:14:01 -04:00
Luiz Aoqui
10d3056d43 fix test names (#11374) 2021-10-22 15:43:55 -04:00
Luiz Aoqui
8c799b3980 add dispatch idempotency token support in the CLI (#10930) 2021-10-22 12:39:05 -04:00
Luiz Aoqui
f729ba5df4 ui: persist node drain settings (#11368) 2021-10-22 10:51:31 -04:00
Luiz Aoqui
f88dc37fc2 ui: display Nomad version in the Clients and Servers table (#11366) 2021-10-22 10:33:06 -04:00
Luiz Aoqui
5653d28c23 ui: use get to access job meta value (#11370) 2021-10-22 10:05:48 -04:00
Luiz Aoqui
ae3d059f96 ui: update favicon (#11371) 2021-10-22 09:40:38 -04:00
Luiz Aoqui
82a3ae7b40 cli: allow setting namespace and region in the nomad ui command (#11364) 2021-10-21 16:24:39 -04:00
Luiz Aoqui
d599c63c9c ui: create tooltip component (#11363) 2021-10-21 13:12:33 -04:00
Luiz Aoqui
f757f619f3 ui: set * as the default namespace selector (#11357) 2021-10-21 10:24:07 -04:00
Luiz Aoqui
6412e77cbc ui: add client name tooltip when displaying client ID in tables (#11358) 2021-10-21 10:23:06 -04:00
James Rasell
dc49869c29 Merge pull request #11339 from hashicorp/b-website-fixup-interpolation-formatting
website: fixup link formatting within interpolation doc.
2021-10-21 09:15:36 +02:00
Mahmood Ali
d43bf779b7 document GH-11346 fix (#11350) 2021-10-20 22:03:19 -04:00
Brandon Romano
7604465994 Merge pull request #11356 from hashicorp/update-alert-banner
Update HashiConf alert-banner expiration
2021-10-20 16:28:30 -07:00
Brandon Romano
af3eb43538 Update HashiConf alert-banner expiration
Updates the HashiConf Alert Banner expiration to 10/20 @ 11pm (PT)
2021-10-20 16:02:45 -07:00
Michael Schurter
13cc8b3c4a Merge pull request #11331 from shishir-a412ed/init
Add support for --init to docker driver.
2021-10-20 10:49:51 -07:00
Michael Schurter
fceb6cea2f Merge pull request #11347 from shishir-a412ed/cleanup
Code cleanup: Remove extra if clause.
2021-10-20 09:37:10 -07:00
Mahmood Ali
6d35e2fb58 Fix preemption panic (#11346)
Fix a bug where the scheduler may panic when preemption is enabled. The conditions are a bit complicated:
a higher-priority job schedules multiple allocations that preempt multiple other allocations on the same node, due to port/network/device assignments.

The cause of the bug is incidental mutation of internal cached data. `RankedNode` computes and caches proposed allocations in https://github.com/hashicorp/nomad/blob/v1.1.6/scheduler/rank.go#L42-L53 . But the scheduler then mutates the list to remove preemptible allocs in https://github.com/hashicorp/nomad/blob/v1.1.6/scheduler/rank.go#L293-L294, and `RemoveAllocs` mutates the cached slice and sets its tail to `nil`s, triggering a nil-pointer dereference.

I fixed the issue by avoiding the mutation in `RemoveAllocs` - the micro-optimization there doesn't seem necessary.

Fixes https://github.com/hashicorp/nomad/issues/11342
2021-10-19 20:22:03 -04:00
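The class of bug described in 6d35e2fb58 above, an in-place filter that writes `nil` into the tail of a slice another caller still holds, can be reproduced with a short Go sketch. The `Alloc` type and the `removeInPlace`, `removeCopy`, and `countNil` functions are hypothetical and written only for illustration; they are not the code in `scheduler/rank.go`.

```go
package main

import "fmt"

// Alloc is a hypothetical stand-in for a scheduler allocation.
type Alloc struct{ ID string }

// removeInPlace filters allocs by reusing the input's backing array and
// nil-ing out the tail. It avoids an allocation, but it corrupts any other
// slice that shares the same backing array, such as a cached copy.
func removeInPlace(allocs []*Alloc, remove map[string]bool) []*Alloc {
	n := 0
	for _, a := range allocs {
		if !remove[a.ID] {
			allocs[n] = a
			n++
		}
	}
	for i := n; i < len(allocs); i++ {
		allocs[i] = nil
	}
	return allocs[:n]
}

// removeCopy builds a fresh slice and leaves the input untouched.
func removeCopy(allocs []*Alloc, remove map[string]bool) []*Alloc {
	out := make([]*Alloc, 0, len(allocs))
	for _, a := range allocs {
		if !remove[a.ID] {
			out = append(out, a)
		}
	}
	return out
}

// countNil reports how many entries in allocs are nil.
func countNil(allocs []*Alloc) int {
	n := 0
	for _, a := range allocs {
		if a == nil {
			n++
		}
	}
	return n
}

func main() {
	remove := map[string]bool{"a": true}

	cached := []*Alloc{{ID: "a"}, {ID: "b"}, {ID: "c"}}
	removeInPlace(cached, remove)
	fmt.Println("nil entries after in-place removal:", countNil(cached))

	cached = []*Alloc{{ID: "a"}, {ID: "b"}, {ID: "c"}}
	removeCopy(cached, remove)
	fmt.Println("nil entries after copying removal:", countNil(cached))
}
```

A later pass that dereferences every element of `cached` would panic after the in-place variant, which is why dropping the micro-optimization in favor of a copy is the safer choice here.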