nomad

mirror of https://github.com/kemko/nomad.git synced 2026-01-02 16:35:44 +03:00

Author	SHA1	Message	Date
Drew Bailey	12819975ee	remove log_writer prefix output with proper spacing update gzip handler, adjust first byte flow to allow gzip handler bypass wip, first stab at wiring up rpc endpoint	2019-11-05 09:51:48 -05:00
Drew Bailey	a828c92403	Display error when remote side ended monitor multisink logger remove usage of logwriter	2019-11-05 09:51:48 -05:00
Drew Bailey	74cfdf55bb	Adds nomad monitor command Adds nomad monitor command. Like consul monitor, this command allows you to stream logs from a nomad agent in real time with a specified log level add endpoint tests Upgrade go-hclog to latest version The current version of go-hclog pads log prefixes to equal lengths so info becomes [INFO ] and debug becomes [DEBUG]. This breaks hashicorp/logutils/level.go Check function. Upgrading to the latest version removes this padding and fixes log filtering that uses logutils Check	2019-11-05 09:51:47 -05:00
Drew Bailey	dc3286481a	Add Agent Monitor to receive streaming logs Queries /v1/agent/monitor and receives streaming logs from client	2019-11-05 09:51:47 -05:00
Drew Bailey	91c0184773	Adds AgentMonitor Endpoint AgentMonitor is an endpoint to stream logs for a given agent. It allows callers to pass in a supplied log level, which may be different than the agent's config allowing for temporary debugging with lower log levels. Pass in logWriter when setting up Agent	2019-11-05 09:51:46 -05:00
Drew Bailey	b20fb9e7bb	Merge pull request #6609 from hashicorp/b-alloc-status-consistency Prevent nomad alloc status output inconsistency	2019-11-04 10:12:04 -05:00
Drew Bailey	6980ab0a81	Prevent nomad alloc status output inconsistency Prevent random map ordering and sort alphabetically better variable name	2019-11-01 14:01:32 -04:00
Michael Schurter	0fcb0d4016	client: fix panic from 0.8 -> 0.10 upgrade makeAllocTaskServices did not do a nil check on AllocatedResources which causes a panic when upgrading directly from 0.8 to 0.10. While skipping 0.9 is not supported we intend to fix serious crashers caused by such upgrades to prevent cluster outages. I did a quick audit of the client package and everywhere else that accesses AllocatedResources appears to be properly guarded by a nil check.	2019-11-01 07:47:03 -07:00
Mahmood Ali	f010fe22fa	Merge pull request #6047 from hashicorp/b-ignore-server-if-disabled Only warn against BootstrapExpect set in CLI flag	2019-10-29 10:55:44 -04:00
Lang Martin	f042b5e296	quota: parse network stanza in quotas (#6511)	2019-10-24 10:41:54 -04:00
Michael Schurter	d42ac815b4	Merge branch 'master' into release-0100	2019-10-22 08:17:57 -07:00
Nomad Release bot	25ee121d95	Generate files for 0.10.0 release	2019-10-22 12:34:56 +00:00
Seth Hoenig	8c7a7b6def	Merge pull request #6448 from hashicorp/f-set-connect-sidecar-tags connect: enable setting tags on consul connect sidecar service in job…	2019-10-17 15:14:09 -05:00
Seth Hoenig	b7e83591b4	connect: enable setting tags on consul connect sidecar service in jobspec (#6415)	2019-10-17 19:25:20 +00:00
Mahmood Ali	31da091b57	Merge pull request #6427 from hashicorp/b-fs-endpoint-errors agent: report fs log errors as http errors	2019-10-15 20:12:59 -04:00
Mahmood Ali	5282353e22	tests: avoid using unnecessary pipe	2019-10-15 17:22:03 -04:00
Mahmood Ali	1064b9f71f	Merge pull request #6425 from hashicorp/f-cli-show-full-ids cli: show full id for single node or alloc status	2019-10-15 10:54:25 -04:00
Danielle	71ea45c205	Merge pull request #6331 from hashicorp/dani/f-volume-mount-propagation volumes: Add support for mount propagation	2019-10-14 14:29:40 +02:00
Danielle Lancashire	afb59bedf5	volumes: Add support for mount propagation This commit introduces support for configuring mount propagation when mounting volumes with the `volume_mount` stanza on Linux targets. Similar to Kubernetes, we expose 3 options for configuring mount propagation: - private, which is equivalent to `rprivate` on Linux, which does not allow the container to see any new nested mounts after the chroot was created. - host-to-task, which is equivalent to `rslave` on Linux, which allows new mounts that have been created _outside of the container_ to be visible inside the container after the chroot is created. - bidirectional, which is equivalent to `rshared` on Linux, which allows the container to see new mounts created on the host, but importantly also _allows the container to create mounts that are visible in other containers and on the host_ private and host-to-task are safe, but bidirectional mounts can be dangerous, as if the code inside a container creates a mount, and does not clean it up before tearing down the container, it can cause bad things to happen inside the kernel. To add a layer of safety here, we require that the user has ReadWrite permissions on the volume before allowing bidirectional mounts, as a defense in depth / validation case, although creating mounts should also require a privileged execution environment inside the container.	2019-10-14 14:09:58 +02:00
Danielle	15335be39e	Merge pull request #6429 from hashicorp/f-log-to-file Add support for logging to a file	2019-10-11 13:35:39 +02:00
Nomad Release bot	c49bf41779	Generate files for 0.10.0-rc1 release	2019-10-10 19:08:23 +00:00
Danielle Lancashire	567ad88165	logging: Correctly track number of written bytes Currently this assumes that a short write will never happen. While these are improbable in a case where rotation being off a few bytes would matter, this now correctly tracks the number of written bytes.	2019-10-10 14:02:14 +02:00
Danielle Lancashire	277a252ea4	logging: Sort files when pruning old logs Currently this logging implementation is dependent on the order of files as returned by filepath.Glob, which although internal methods are documented to be lexicographical, does not publicly document this. Here we defensively re-sort.	2019-10-10 13:51:16 +02:00
Mahmood Ali	7a38784244	acl: check ACL against object namespace Fix a bug where a malicious user can access or manipulate an alloc in a namespace they don't have access to. The allocation endpoints perform ACL checks against the request namespace, not the allocation namespace, and perform the allocation lookup independently from namespaces. Here, we check that the requester can access the alloc namespace regardless of the declared request namespace. Ideally, we'd enforce that the declared request namespace matches the actual allocation namespace. Unfortunately, we haven't documented alloc endpoints as namespaced functions; we suspect starting to enforce this will be very disruptive and inappropriate for a nomad point release. As such, we maintain current behavior that doesn't require passing the proper namespace in request. A future major release may start enforcing checking declared namespace.	2019-10-08 12:59:22 -04:00
Mahmood Ali	e59cc7ce90	Merge pull request #6441 from hashicorp/b-agent-token Redact replication tokens in /agent/self	2019-10-08 12:55:45 -04:00
Danielle Lancashire	f72febd0b5	agent: Refactor log setup to support log-to-file	2019-10-07 14:42:32 +02:00
Danielle Lancashire	fff69a50e3	agent: Introduce File Logger This commit introduces a rotating file logger for Nomad Agent Logs. The logger implementation itself is a lift and shift from Consul, with tests updated to fit with the Nomad pattern of using require, and not having a testutil for creating tempdirs cleanly.	2019-10-07 14:37:31 +02:00
Danielle Lancashire	234d113a81	config: Add required configuration for logging to a file	2019-10-07 14:16:59 +02:00
Mahmood Ali	81422d410a	cli: show full id for single node or alloc status Show full ID on individual alloc or node status views. Shortening the ID isn't very helpful in these cases, and makes looking up the full id slightly more complicated when user needs to interact with API. List views are unmodified and show short id unless `-verbose` flag is passed. Before ``` $ nomad node status -self \| head -n2 ID = 21fc51f9 Name = mars-2.local $ nomad alloc status 15ae54cd \| head -n3 ID = 15ae54cd-08dd-3681-03cf-4c23ace7e7c3 Eval ID = a6b15f86 Name = example.cache[0] ``` After: ``` $ nomad node status -self \| head -n2 ID = 21fc51f9-fd39-0fa0-fb41-f34c7aa36101 Name = mars-2.local $ nomad alloc status 15ae54cd \| head -n3 ID = 15ae54cd-08dd-3681-03cf-4c23ace7e7c3 Eval ID = a6b15f86-ca8e-e536-b544-4bfb43137ff3 Name = example.cache[0] ```	2019-10-04 16:36:18 -04:00
Mahmood Ali	09ce0e5791	agent: report fs log errors as http errors This fixes two bugs: First, FS Logs API endpoint only propagated error back to user if it was encoded with code, which isn't common. Other errors get suppressed and callers get an empty response with 200 error code. Now, these endpoints return a 500 status code along with the error message. Before ``` $ curl -v "http://127.0.0.1:4646/v1/client/fs/logs/qwerqwera?follow=false&offset=0&origin=start&region=global&task=redis&type=stdout"; echo * Trying 127.0.0.1... * TCP_NODELAY set * Connected to 127.0.0.1 (127.0.0.1) port 4646 (#0) > GET /v1/client/fs/logs/qwerqwera?follow=false&offset=0&origin=start&region=global&task=redis&type=stdout HTTP/1.1 > Host: 127.0.0.1:4646 > User-Agent: curl/7.54.0 > Accept: / > < HTTP/1.1 200 OK < Vary: Accept-Encoding < Vary: Origin < Date: Fri, 04 Oct 2019 19:47:21 GMT < Content-Length: 0 < * Connection #0 to host 127.0.0.1 left intact ``` After ``` $ curl -v "http://127.0.0.1:4646/v1/client/fs/logs/qwerqwera?follow=false&offset=0&origin=start&region=global&task=redis&type=stdout"; echo * Trying 127.0.0.1... * TCP_NODELAY set * Connected to 127.0.0.1 (127.0.0.1) port 4646 (#0) > GET /v1/client/fs/logs/qwerqwera?follow=false&offset=0&origin=start&region=global&task=redis&type=stdout HTTP/1.1 > Host: 127.0.0.1:4646 > User-Agent: curl/7.54.0 > Accept: / > < HTTP/1.1 500 Internal Server Error < Vary: Accept-Encoding < Vary: Origin < Date: Fri, 04 Oct 2019 19:48:12 GMT < Content-Length: 60 < Content-Type: text/plain; charset=utf-8 < * Connection #0 to host 127.0.0.1 left intact alloc lookup failed: index error: UUID must be 36 characters ``` Second, we return 400 status code for request validation errors. Before ``` $ curl -v "http://127.0.0.1:4646/v1/client/fs/logs/qwerqwera"; echo * Trying 127.0.0.1... 
* TCP_NODELAY set * Connected to 127.0.0.1 (127.0.0.1) port 4646 (#0) > GET /v1/client/fs/logs/qwerqwera HTTP/1.1 > Host: 127.0.0.1:4646 > User-Agent: curl/7.54.0 > Accept: / > < HTTP/1.1 500 Internal Server Error < Vary: Accept-Encoding < Vary: Origin < Date: Fri, 04 Oct 2019 19:47:29 GMT < Content-Length: 22 < Content-Type: text/plain; charset=utf-8 < * Connection #0 to host 127.0.0.1 left intact must provide task name ``` After ``` $ curl -v "http://127.0.0.1:4646/v1/client/fs/logs/qwerqwera"; echo * Trying 127.0.0.1... * TCP_NODELAY set * Connected to 127.0.0.1 (127.0.0.1) port 4646 (#0) > GET /v1/client/fs/logs/qwerqwera HTTP/1.1 > Host: 127.0.0.1:4646 > User-Agent: curl/7.54.0 > Accept: / > < HTTP/1.1 400 Bad Request < Vary: Accept-Encoding < Vary: Origin < Date: Fri, 04 Oct 2019 19:49:18 GMT < Content-Length: 22 < Content-Type: text/plain; charset=utf-8 < * Connection #0 to host 127.0.0.1 left intact must provide task name ```	2019-10-04 16:33:58 -04:00
Lang Martin	c65c3fb50d	default raft protocol v2	2019-09-24 14:37:55 -04:00
Tim Gross	4f687cfc49	client/connect: ConsulProxy LocalServicePort/Address (#6358) Without a `LocalServicePort`, Connect services will try to use the mapped port even when delivering traffic locally. A user can override this behavior by pinning the port value in the `service` stanza but this prevents us from using the Consul service name to reach the service. This commit configures the Consul proxy with its `LocalServicePort` and `LocalServiceAddress` fields.	2019-09-23 14:30:48 -04:00
Danielle Lancashire	068c859237	api: Redact tokens in /agent/self	2019-09-23 19:07:27 +02:00
Danielle Lancashire	5851b2611d	api: Redact ACL Replication Token Currently when hitting the /v1/agent/self API with ACL Replication enabled results in the token being returned in the API. This commit redacts that information, as it should be treated as a shared secret.	2019-09-22 14:35:53 +02:00
Chris Baker	4b67fd89d4	fixed incorrect CLI documentation in `job deployments` listed `-all-allocs` instead of `-all`	2019-09-20 12:24:53 -05:00
Mahmood Ali	57850dd003	Merge pull request #6328 from hashicorp/b-gh-6269 cli: emit job version number proper	2019-09-17 19:06:44 -04:00
Tim Gross	6a9911d9aa	remove resolved TODO from UpdateTTL docstring (#6336)	2019-09-16 16:26:06 -04:00
Mahmood Ali	48034051d4	cli: emit job version number proper We must emit alloc job number rather than the field's address.	2019-09-13 19:04:32 -04:00
Danielle Lancashire	ab5ba7aa9b	config: Hoist volume.config.source into volume Currently, using a Volume in a job uses the following configuration: ``` volume "alias-name" { type = "volume-type" read_only = true config { source = "host_volume_name" } } ``` This commit migrates to the following: ``` volume "alias-name" { type = "volume-type" source = "host_volume_name" read_only = true } ``` The original design stemmed from uncertainty about the future of storage plugins and a wish to allow maximum flexibility. However, this causes a few issues, namely: - We frequently need to parse this configuration during submission, scheduling, and mounting - It complicates the configuration from an end user's perspective - It complicates the ability to do validation As we understand the problem space of CSI a little more, it has become clear that we won't need the `source` to be in config, as it will be used in the majority of cases: - Host Volumes: Always need a source - Preallocated CSI Volumes: Always needs a source from a volume or claim name - Dynamic Persistent CSI Volumes: Always needs a source to attach the volumes to for managing upgrades and to avoid dangling. - Dynamic Ephemeral CSI Volumes: Less thought out, but `source` will probably point to the plugin name, and a `config` block will allow you to pass meta to the plugin. Or will point to a pre-configured ephemeral config. *If implemented The new design simplifies this by merging the source into the volume stanza to solve the above issues with usability, performance, and error handling.	2019-09-13 04:37:59 +02:00
Mahmood Ali	483b10ab0e	fix 'nomad namespace apply' help Named arguments need to precede positional arguments.	2019-09-09 10:04:41 -07:00
Nomad Release bot	7df4da75f7	Generate files for 0.10.0-beta1 release	2019-09-06 18:47:09 +00:00
Michael Schurter	115700155e	Merge pull request #6282 from hashicorp/f-connect-dev-path connect: check if consul is on PATH	2019-09-05 12:25:23 -07:00
Michael Schurter	590e805588	connect: check if consul is on PATH Only in -dev-connect mode for now since it's valid to install Consul after Nomad has started in production.	2019-09-05 12:05:42 -07:00
Jasmine Dahilig	50c515ab6f	add validation for job_gc_interval (#6277)	2019-09-05 11:20:46 -07:00
Mahmood Ali	e66239d353	Merge pull request #6250 from hashicorp/f-raft-protocol-v3 Update default raft protocol to version 3	2019-09-04 09:34:41 -04:00
Tim Gross	40368d2c63	support script checks for task group services (#6197) In Nomad prior to Consul Connect, all Consul checks work the same except for Script checks. Because the Task being checked is running in its own container namespaces, the check is executed by Nomad in the Task's context. If the Script check passes, Nomad uses the TTL check feature of Consul to update the check status. This means in order to run a Script check, we need to know what Task to execute it in. To support Consul Connect, we need Group Services, and these need to be registered in Consul along with their checks. We could push the Service down into the Task, but this doesn't work if someone wants to associate a service with a task's ports, but do script checks in another task in the allocation. Because Nomad is handling the Script check and not Consul anyways, this moves the script check handling into the task runner so that the task runner can own the script check's configuration and lifecycle. This will allow us to pass the group service check configuration down into a task without associating the service itself with the task. When tasks are checked for script checks, we walk back through their task group to see if there are script checks associated with the task. If so, we'll spin off script check tasklets for them. The group-level service and any restart behaviors it needs are entirely encapsulated within the group service hook.	2019-09-03 15:09:04 -04:00
Jasmine Dahilig	c346a47b5b	add default update stanza and max_parallel=0 disables deployments (#6191)	2019-09-02 10:30:09 -07:00
Evan Ercolano	859861817d	Remove unused canary param from MakeTaskServiceID	2019-08-31 16:53:23 -04:00
Michael Schurter	c783505582	Merge pull request #6236 from hashicorp/b-ignore-connect-services consul: ignore connect services when syncing	2019-08-30 13:11:09 -07:00
Michael Schurter	c5023d2cdb	consul: ignore connect services when syncing Consul registers Connect services automatically, however Nomad thinks it owns them due to the _nomad prefix. Since the services are managed by Consul, Nomad needs to explicitly ignore them or otherwise they will be removed.	2019-08-30 11:53:41 -07:00
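
Commit afb59bedf5 in the table above lists three mount propagation options and the Linux modes they correspond to. A minimal sketch of that mapping follows, assuming a hypothetical helper `resolvePropagation` and a hypothetical `hasReadWrite` flag standing in for the ReadWrite permission check the message mentions; none of these names are taken from the Nomad source.

```go
package main

import "fmt"

// propagationModes mirrors the option-to-Linux-mode pairs listed in the
// commit message. The variable name is an assumption for illustration.
var propagationModes = map[string]string{
	"private":       "rprivate",
	"host-to-task":  "rslave",
	"bidirectional": "rshared",
}

// resolvePropagation is a hypothetical helper: it looks up the Linux mode and
// refuses bidirectional mounts unless the caller has ReadWrite permission.
func resolvePropagation(mode string, hasReadWrite bool) (string, error) {
	linuxMode, ok := propagationModes[mode]
	if !ok {
		return "", fmt.Errorf("unknown propagation mode %q", mode)
	}
	if mode == "bidirectional" && !hasReadWrite {
		return "", fmt.Errorf("bidirectional propagation requires ReadWrite permission")
	}
	return linuxMode, nil
}

func main() {
	for _, mode := range []string{"private", "host-to-task", "bidirectional"} {
		linuxMode, err := resolvePropagation(mode, false)
		fmt.Println(mode, linuxMode, err)
	}
}
```

Keeping the permission check next to the lookup means an unsafe mode is rejected before any mount is attempted, which is the defense-in-depth idea the commit describes.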


2357 Commits
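
Commit 6980ab0a81 fixes inconsistent `nomad alloc status` output by avoiding random map ordering and sorting alphabetically. Go deliberately randomizes map iteration order, so the usual remedy is to collect the keys, sort them, and iterate over the sorted slice. The sketch below shows that pattern with a hypothetical `sortedKeys` helper and made-up sample data; it is not the code from the commit.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys returns the keys of m in alphabetical order so that output built
// from the map is stable between runs.
func sortedKeys(m map[string]string) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	// Sample data invented for illustration only.
	status := map[string]string{"web": "running", "cache": "pending", "db": "running"}
	for _, name := range sortedKeys(status) {
		fmt.Printf("%s = %s\n", name, status[name])
	}
}
```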