Commit Graph

21698 Commits

Author SHA1 Message Date
James Rasell
ecb6e63383 Merge pull request #11042 from hashicorp/docs-remove-ingress-host-port-callout
docs: Remove note on ingress gateway hosts field needing a port number
2021-08-25 12:29:59 +02:00
James Rasell
cee12859c6 Merge pull request #11072 from kushsharma/patch-1
docs: fix typo in structs/event.go
2021-08-25 11:33:27 +02:00
Luiz Aoqui
d74ab11d25 Don't timestamp active log file (#11070)
* don't timestamp active log file

* website: update log_file default value

* changelog: add entry for #11070

* website: add upgrade instructions for log_file in v1.1.4 and v1.2.0
2021-08-23 11:27:34 -04:00
Kush
b1913b8e92 docs: fix typo in structs/event.go 2021-08-21 17:02:07 +05:30
Zachary Shilton
26e381f196 Upgrade global styles (#10936)
* website: upgrade global-styles packages

* website: upgrade community page

* website: hide alert-banner on mobile

* website: upgrade g-container to g-grid-container

* website: update /security to use markdown-page

* website: fix unsupported prop

* website: fix incorrect github link in security page

* website: bump to latest patched dependencies
2021-08-20 11:53:12 -04:00
Mahmood Ali
1403a06b99 Merge pull request #11064 from hashicorp/deflake-tests-20210818
Deflake tests attempts
2021-08-19 09:05:12 -04:00
Mahmood Ali
c66d2a4167 tests: attempt deflaking TestAutopilot_CleanupDeadServer
Attempt to deflake the test by avoiding shutting down the leaders. Leadership
recovery takes more time, so raft configuration changes take longer to process,
potentially failing the test.
2021-08-18 15:37:25 -04:00
Mahmood Ali
794a08cc26 tests: deflake TestLeader_LeftLeader
Wait for leadership to be established before killing leader.
2021-08-18 14:19:00 -04:00
Mahmood Ali
327d461b12 Consider all system jobs for a new node (#11054)
When a node becomes ready, create an eval for all system jobs across
namespaces.

The previous code uses `job.ID` to deduplicate evals, but that ignores
the job namespace. Thus if there are multiple jobs in different
namespaces sharing the same ID/Name, only one will be considered for
running in the new node. Thus, Nomad may skip running some system jobs
in that node.
2021-08-18 09:50:37 -04:00
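The deduplication issue described in this commit can be illustrated with a minimal sketch. The `Job` and `jobKey` types and both helper functions below are hypothetical stand-ins, not Nomad's actual code; they only show why keying on the ID alone drops jobs that share an ID across namespaces, while a composite (namespace, ID) key keeps them.

```go
package main

import "fmt"

// Job is a hypothetical, pared-down stand-in for a scheduler job record.
type Job struct {
	Namespace string
	ID        string
}

// jobKey identifies a job uniquely across namespaces (hypothetical type).
type jobKey struct {
	Namespace string
	ID        string
}

// dedupeByID mimics the behaviour described as buggy: jobs that share an ID
// collapse into one entry, regardless of namespace.
func dedupeByID(jobs []Job) []Job {
	seen := make(map[string]struct{})
	var out []Job
	for _, j := range jobs {
		if _, ok := seen[j.ID]; ok {
			continue
		}
		seen[j.ID] = struct{}{}
		out = append(out, j)
	}
	return out
}

// dedupeByNamespacedID keys on (namespace, ID), so same-named jobs in
// different namespaces are both kept.
func dedupeByNamespacedID(jobs []Job) []Job {
	seen := make(map[jobKey]struct{})
	var out []Job
	for _, j := range jobs {
		k := jobKey{Namespace: j.Namespace, ID: j.ID}
		if _, ok := seen[k]; ok {
			continue
		}
		seen[k] = struct{}{}
		out = append(out, j)
	}
	return out
}

func main() {
	jobs := []Job{
		{Namespace: "default", ID: "logger"},
		{Namespace: "ops", ID: "logger"},
		{Namespace: "default", ID: "logger"}, // true duplicate
	}
	fmt.Println(len(dedupeByID(jobs)))           // 1: the "ops" job is lost
	fmt.Println(len(dedupeByNamespacedID(jobs))) // 2: one job per namespace
}
```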
Mahmood Ali
2bb03170b6 e2e: Run system jobs on all datacenters (#11060)
Target all e2e datacenters for system and sysbatch e2e tests.  They
require that the system jobs run on all linux clients.

However, the jobs currently only target `dc1` datacenter, but the nightly
e2e cluster has 4 clients spread in `dc1` and `dc2` datacenters, causing
the tests to fail.

I missed this problem in e2e dev cluster because it only used a single
dc1 datacenter.
2021-08-17 11:01:47 -04:00
Mahmood Ali
fdb8684004 Merge pull request #9160 from hashicorp/f-sysbatch
core: implement system batch scheduler
2021-08-16 09:30:24 -04:00
James Rasell
91fb72fefa Merge pull request #11051 from hashicorp/b-gh-11047
tlsutil: update testing certificates close to expiry.
2021-08-16 09:42:01 +02:00
James Rasell
530c0f8448 tlsutil: update testing certificates close to expiry. 2021-08-13 11:09:40 +02:00
Blake Covarrubias
291bbd7b9e docs: Remove note on ingress gateway hosts field needing a port number
Update the ingress gateway documentation to remove the note stating
that a port must be specified for values in the `hosts` field when
the ingress gateway is listening on a non-standard HTTP port.

Specifying a port was required in Consul 1.8.0, but that requirement
was removed in 1.8.1 with hashicorp/consul#8190 which made Consul
include the port number when constructing the Envoy configuration.

Related Consul docs PR: hashicorp/consul#10827
2021-08-11 14:55:05 -07:00
Mahmood Ali
499fcebc42 docs: Consul Connect tweaks (#11040)
Tweaks to the commands in Consul Connect page.

For multi-command scripts, having the leading `$` is a bit annoying, as it makes copying the text harder. Also, the `copy` button would only copy the first command and ignore the rest.

Also, the `echo 1 > ...` commands are required to run as root, unlike the rest! I made them use `| sudo tee` pattern to ease copy & paste as well.

Lastly, update the CNI plugin links to 1.0.0. It's fresh out of the oven, released less than an hour ago: https://github.com/containernetworking/plugins/releases/tag/v1.0.0 .
2021-08-11 17:14:26 -04:00
Mahmood Ali
eedfa0b381 Merge pull request #11034 from tgross/docs-cni-install
docs: note CNI requirement for bridge networking
2021-08-11 11:07:25 -04:00
Tim Gross
d6a37a68ff docs: note CNI requirement for bridge networking
Using `bridge` networking requires that you have CNI plugins installed
on the client, but this isn't in the jobspec `network` docs which are
the first place someone will look when trying to configure task
networking.
2021-08-11 10:18:35 -04:00
Michael Schurter
734f1d7eb6 Merge pull request #10848 from ggriffiths/listsnapshot_secrets
CSI Listsnapshot secrets support
2021-08-10 15:59:33 -07:00
Mahmood Ali
aad6d401cf system: re-evaluate node on feasibility changes (#11007)
Fix a bug where system jobs may fail to be placed on a node that
initially was not eligible for system job placement.

This change causes the scheduler to re-evaluate the node if any
attribute used in feasibility checks changes.

Fixes https://github.com/hashicorp/nomad/issues/8448
2021-08-10 17:17:44 -04:00
Mahmood Ali
dfb313a6da deployments: canary=0 is implicitly autopromote (#11013)
In a multi-task-group job, treat 0 canary groups as auto-promote.

This change fixes an edge case where Nomad required a manual promotion
if the job had any group with canary=0 and the rest of the groups had
auto_promote set.

Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
2021-08-10 17:06:40 -04:00
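The rule described in this commit can be sketched as follows. The `Group` type and `allAutoPromote` function are hypothetical simplifications, not Nomad's deployment code; the sketch assumes that a group with zero canaries has nothing to promote and therefore should not block automatic promotion.

```go
package main

import "fmt"

// Group is a hypothetical, simplified view of a task group's update settings.
type Group struct {
	Name        string
	Canary      int
	AutoPromote bool
}

// allAutoPromote reports whether a deployment can be promoted without manual
// action. Groups with zero canaries are skipped, which treats them as
// implicitly auto-promoting.
func allAutoPromote(groups []Group) bool {
	for _, g := range groups {
		if g.Canary == 0 {
			continue // nothing to promote for this group
		}
		if !g.AutoPromote {
			return false
		}
	}
	return true
}

func main() {
	groups := []Group{
		{Name: "web", Canary: 1, AutoPromote: true},
		{Name: "cache", Canary: 0, AutoPromote: false},
	}
	fmt.Println(allAutoPromote(groups)) // true: "cache" has no canaries

	groups[0].AutoPromote = false
	fmt.Println(allAutoPromote(groups)) // false: "web" needs manual promotion
}
```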
Mahmood Ali
b1d10ff69d Speed up client startup and registration (#11005)
Speed up client startup, by retrying more until the servers are known.

Currently, if client fingerprinting is fast and finishes before the
client connects to a server, node registration may be delayed by 15
seconds or so!

Ideally, we'd wait until the client discovers the servers and then retry
immediately, but that requires significant code changes.

Here, we simply retry the node registration request every second. That's
basically the equivalent of checking every second whether the client has
discovered servers. It should be a cheap operation.

When testing this change on my local computer, where both servers and
clients are co-located, the time from startup till node registration
dropped from 34 seconds to 8 seconds!
2021-08-10 17:06:18 -04:00
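The fixed-interval retry described in this commit can be sketched roughly as below. `retryUntil`, `errNoServers`, and the fake `register` closure are hypothetical names for illustration; the real client code is not shown here, and the attempt cap is an assumption added so the sketch terminates.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNoServers is a hypothetical error for "no servers discovered yet".
var errNoServers = errors.New("no servers known yet")

// retryUntil calls fn, sleeping for interval between failed attempts, until
// fn succeeds or maxAttempts is reached. It returns the number of attempts
// made and the last error (nil on success).
func retryUntil(maxAttempts int, interval time.Duration, fn func() error) (int, error) {
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = fn(); err == nil {
			return attempt, nil
		}
		if attempt < maxAttempts {
			time.Sleep(interval)
		}
	}
	return maxAttempts, err
}

func main() {
	// Fake registration call that only succeeds on the third try, standing in
	// for a client that discovers its servers shortly after startup.
	calls := 0
	register := func() error {
		calls++
		if calls < 3 {
			return errNoServers
		}
		return nil
	}

	// A short interval keeps the demo fast; the commit describes one second.
	attempts, err := retryUntil(10, 10*time.Millisecond, register)
	fmt.Println(attempts, err) // 3 <nil>
}
```

A short fixed interval trades a few cheap extra requests for a much smaller worst-case delay than one long backoff.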
Luiz Aoqui
f59fa9850a ui: add missing pipe separator in parameterized and periodic jobs (#11020) 2021-08-10 13:48:20 -04:00
Michael Schurter
c6d7ec0530 Merge pull request #10995 from miao1007/patch-1
docs: Add replication_token link with authoritative_region
2021-08-10 10:48:02 -07:00
Jai
c86f30d7f1 Merge pull request #10666 from hashicorp/b-ui/search-namespaces
ui: Fix fuzzy search namespace-handling
2021-08-10 13:13:20 -04:00
Mike Wickett
46737ea2d8 chore: update alert banner (#11022) 2021-08-10 12:56:13 -04:00
Jai Bhagat
00df085e71 edit hierarchy to lead with namespace before job 2021-08-10 10:35:36 -04:00
Luiz Aoqui
8e33fc5503 ui: only display "Dispatch Job" button on parameterized jobs (#11019) 2021-08-09 17:49:08 -04:00
Luiz Aoqui
ff53af3f13 make: embed the Nomad UI data by default (#11018) 2021-08-09 16:53:44 -04:00
Lir (Rookout)
de8c69dad3 Some Rookout docs tweaks (#10989) 2021-08-09 11:19:36 +02:00
Michael Schurter
1c1a23305f Merge pull request #10951 from hashicorp/b-cn-proxy
consul/connect: avoid warn messages on connect proxy errors
2021-08-06 15:25:40 -07:00
Michael Schurter
930b8b6d85 Merge pull request #11010 from hashicorp/docs-10875
docs: add backward incompat note about #10875
2021-08-06 08:28:48 -07:00
Michael Schurter
2ffb7e1397 docs: add backward incompat note about #10875
Fixes #11002
2021-08-05 15:08:55 -07:00
James Rasell
892a476ff2 Merge pull request #11006 from hashicorp/f-gh-10929-changelog
changelog: add entry for #10929
2021-08-05 17:32:19 +02:00
James Rasell
a946419adc consul/connect: avoid warn messages on connect proxy errors
When creating a TCP proxy bridge for Connect tasks, we are at the
mercy of either end for managing the connection state. For long
lived gRPC connections the proxy could reasonably expect to stay
open until the context was cancelled. For the HTTP connections used
by connect native tasks, we experience connection disconnects.
The proxy gets recreated as needed on follow up requests, however
we also emit a WARN log when the connection is broken. This PR
lowers the WARN to a TRACE, because these disconnects are to be
expected.

Ideally we would be able to proxy at the HTTP layer, however Consul
or the connect native task could be configured to expect mTLS, preventing
Nomad from MiTM-ing the requests.

We also can't manage the proxy lifecycle more intelligently, because
we have no control over the HTTP client or server and how they wish
to manage connection state.

What we have now works, it's just noisy.

Fixes #10933
2021-08-05 11:27:35 +02:00
James Rasell
345c9737a9 changelog: add entry for #10929 2021-08-05 10:48:36 +02:00
James Rasell
fa36aac653 Merge pull request #10929 from AchilleAsh/fix-token-docker-auth-config
fix: load token in docker auth config
2021-08-05 10:44:39 +02:00
Luiz Aoqui
c67b69bd0c changelog: add entry for #10934 (#11001) 2021-08-04 11:33:18 -04:00
James Rasell
36337af295 Merge pull request #10996 from hashicorp/b-fix-doublespace-general-cli-opts
cli: fix minor format error within `-ca-cert` help text.
2021-08-04 09:21:19 +02:00
Luiz Aoqui
332dc88101 ui: fix job dispatch page when job doesn't have any meta fields (#10934) 2021-08-03 13:50:43 -04:00
Mahmood Ali
141ea605f7 e2e: fix tests
Use basic sleeps in busybox images. busybox images are very light, whereas
ping has permission complications and may fail due to network-related
issues.
2021-08-03 11:38:35 -04:00
Seth Hoenig
61ee443ee6 core: implement system batch scheduler
This PR implements a new "System Batch" scheduler type. Jobs can
make use of this new scheduler by setting their type to 'sysbatch'.

Like the name implies, sysbatch can be thought of as a hybrid between
system and batch jobs - it is for running short lived jobs intended to
run on every compatible node in the cluster.

As with batch jobs, sysbatch jobs can also be periodic and/or parameterized
dispatch jobs. A sysbatch job is considered complete when it has been run
on all compatible nodes until reaching a terminal state (success or failed
on retries).

Feasibility and preemption are governed the same as with system jobs. In
this PR, the update stanza is not yet supported. The update stanza is still
limited in functionality for the underlying system scheduler, and is
not useful yet for sysbatch jobs. Further work in #4740 will improve
support for the update stanza and deployments.

Closes #2527
2021-08-03 10:30:47 -04:00
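The completion rule stated in this commit ("run on all compatible nodes until reaching a terminal state") can be sketched as a simple check. The status strings, `isTerminal`, and `sysbatchComplete` are hypothetical and chosen for illustration only; they are not the scheduler's actual types or names.

```go
package main

import "fmt"

// Hypothetical status values for the allocation of a job on a node.
const (
	statusRunning  = "running"
	statusComplete = "complete"
	statusFailed   = "failed"
)

// isTerminal reports whether a status is final (succeeded, or failed after
// retries were exhausted).
func isTerminal(status string) bool {
	return status == statusComplete || status == statusFailed
}

// sysbatchComplete reports whether every compatible node has a terminal
// status. A node with no recorded status has not run the job yet.
func sysbatchComplete(compatibleNodes []string, statusByNode map[string]string) bool {
	for _, node := range compatibleNodes {
		status, ok := statusByNode[node]
		if !ok || !isTerminal(status) {
			return false
		}
	}
	return true
}

func main() {
	nodes := []string{"node-a", "node-b"}

	inProgress := map[string]string{"node-a": statusComplete, "node-b": statusRunning}
	fmt.Println(sysbatchComplete(nodes, inProgress)) // false

	done := map[string]string{"node-a": statusComplete, "node-b": statusFailed}
	fmt.Println(sysbatchComplete(nodes, done)) // true
}
```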
James Rasell
0ba2086782 cli: fix minor format error within -ca-cert help text. 2021-08-03 16:05:06 +02:00
みゃお
501444c5da [doc]Add replication_token link with authoritative_region
replication_token always works together with authoritative_region, add a link for better doc.
2021-08-03 18:56:00 +08:00
Mahmood Ali
52c37e16aa Only initialize task.VolumeMounts when not-nil (#10990)
1.1.3 had a bug where task.VolumeMounts will be an empty slice instead of nil. Eventually, it gets canonicalized and is set to `nil`, but it seems to confuse dry-run planning.

The regression was introduced in https://github.com/hashicorp/nomad/pull/10855/files#diff-56b3c82fcbc857f8fb93a903f1610f6e6859b3610a4eddf92bad9ea27fdc85ecL1028-R1037 . Curiously, it's the only place where `len(apiTask.VolumeMounts)` check was dropped. I assume it was dropped accidentally.

Fixes #10981
2021-08-02 13:08:10 -04:00
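The nil versus empty slice distinction behind this fix can be shown with a small sketch. The `Task` and `VolumeMount` types and both conversion functions are hypothetical simplifications, not the code touched by the commit; the sketch assumes a comparison (such as `reflect.DeepEqual`) that distinguishes a nil slice from an empty one.

```go
package main

import (
	"fmt"
	"reflect"
)

// VolumeMount and Task are hypothetical, pared-down types for illustration.
type VolumeMount struct {
	Volume      string
	Destination string
}

type Task struct {
	VolumeMounts []VolumeMount
}

// convertAlways always allocates the slice, so a task without mounts ends up
// with an empty, non-nil slice.
func convertAlways(in []VolumeMount) Task {
	t := Task{VolumeMounts: make([]VolumeMount, 0, len(in))}
	t.VolumeMounts = append(t.VolumeMounts, in...)
	return t
}

// convertGuarded only initializes the slice when there is something to copy,
// leaving it nil otherwise.
func convertGuarded(in []VolumeMount) Task {
	var t Task
	if len(in) > 0 {
		t.VolumeMounts = make([]VolumeMount, 0, len(in))
		t.VolumeMounts = append(t.VolumeMounts, in...)
	}
	return t
}

func main() {
	canonical := Task{} // VolumeMounts is nil in the canonical form

	// An empty slice and a nil slice have the same length but are not
	// deeply equal, which can make two otherwise identical tasks look different.
	fmt.Println(reflect.DeepEqual(convertAlways(nil), canonical))  // false
	fmt.Println(reflect.DeepEqual(convertGuarded(nil), canonical)) // true
}
```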
Mike Wickett
3a346f9a13 website: update consent manager (#10977) 2021-08-02 12:56:20 -04:00
Derek Strickland
7f1748b37d Merge pull request #10976 from itorres/api-docs-allocation-restart-sample
API docs: Fix allocation restart example
2021-08-02 08:48:45 -04:00
James Rasell
694dda28a1 Merge pull request #10987 from hashicorp/f-docs-order-external-drivers-alphabetically
docs: order external driver overview alphabetically.
2021-08-02 12:50:16 +02:00
James Rasell
8adb00bfad docs: order external driver overview alphabetically. 2021-08-02 10:51:37 +02:00
Lir (Rookout)
9b65172d7b Rookout driver docs (#10950)
Co-authored-by: James Rasell <jrasell@users.noreply.github.com>
2021-08-02 10:09:45 +02:00
Mahmood Ali
6d00c688b3 Merge pull request #10969 from hashicorp/merge-release-1.1.3
Merge release 1.1.3
2021-07-30 11:23:25 -04:00