Commit Graph

57 Commits

Author SHA1 Message Date
Mahmood Ali
2f4b9da61a drivers/exec: test all cgroups are destroyed 2019-12-11 11:12:29 -05:00
Danielle Lancashire
afb59bedf5 volumes: Add support for mount propagation
This commit introduces support for configuring mount propagation when
mounting volumes with the `volume_mount` stanza on Linux targets.

Similar to Kubernetes, we expose 3 options for configuring mount
propagation:

- private, which is equivalent to `rprivate` on Linux, which does not allow the
           container to see any new nested mounts after the chroot was created.

- host-to-task, which is equivalent to `rslave` on Linux, which allows new mounts
                that have been created _outside of the container_ to be visible
                inside the container after the chroot is created.

- bidirectional, which is equivalent to `rshared` on Linux, which allows both
                 the container to see new mounts created on the host, but
                 importantly _allows the container to create mounts that are
                 visible in other containers an don the host_

private and host-to-task are safe, but bidirectional mounts can be
dangerous, as if the code inside a container creates a mount, and does
not clean it up before tearing down the container, it can cause bad
things to happen inside the kernel.

To add a layer of safety here, we require that the user has ReadWrite
permissions on the volume before allowing bidirectional mounts, as a
defense in depth / validation case, although creating mounts should also require
a priviliged execution environment inside the container.
2019-10-14 14:09:58 +02:00
Nick Ethier
2f16eb9640 executor: run exec commands in netns if set 2019-09-30 11:50:22 -04:00
Nick Ethier
1ff85f09f3 executor: cleanup netns handling in executor 2019-07-31 01:04:05 -04:00
Nick Ethier
e26192ad49 Driver networking support
Adds support for passing network isolation config into drivers and
implements support in the rawexec driver as a proof of concept
2019-07-31 01:03:20 -04:00
Mahmood Ali
c72bf13f8a exec: use an independent name=systemd cgroup path
We aim for containers to be part of a new cgroups hierarchy independent
from nomad agent.  However, we've been setting a relative path as
libcontainer `cfg.Cgroups.Path`, which makes libcontainer concatinate
the executor process cgroup with passed cgroup, as set in [1].

By setting an absolute path, we ensure that all cgroups subsystem
(including `name=systemd` get a dedicated one).  This matches behavior
in Nomad 0.8, and behavior of how Docker and OCI sets CgroupsPath[2]

Fixes #5736

[1] d7edf9b2e4/vendor/github.com/opencontainers/runc/libcontainer/cgroups/fs/apply_raw.go (L326-L340)
[2] 238f8eaa31/vendor/github.com/containerd/containerd/oci/spec.go (L229)
2019-06-10 22:00:12 -04:00
Mahmood Ali
1a6454d242 special case root capabilities 2019-05-24 14:10:10 -04:00
Mahmood Ali
82611af925 drivers/exec: Restore 0.8 capabilities
Nomad 0.9 incidentally set effective capabilities that is higher than
what's expected of a `nobody` process, and what's set in 0.8.

This change restores the capabilities to ones used in Nomad 0.9.
2019-05-20 13:11:29 -04:00
Lang Martin
568a120e7b Merge pull request #5649 from hashicorp/b-lookup-exe-chroot
lookup executables inside chroot
2019-05-17 15:07:41 -04:00
Mahmood Ali
7f76aedfae use pty/tty terminology similar to github.com/kr/pty 2019-05-10 19:17:14 -04:00
Mahmood Ali
976bfbc41a executors: implement streaming exec
Implements streamign exec handling in both executors (i.e. universal and
libcontainer).

For creation of TTY, some incidental complexity leaked in.  The universal
executor uses github.com/kr/pty for creation of TTYs.

On the other hand, libcontainer expects a console socket and for libcontainer to
create the underlying console object on process start.  The caller can then use
`libcontainer.utils.RecvFd()` to get tty master end.

I chose github.com/kr/pty for managing TTYs here.  I tried
`github.com/containerd/console` package (which is already imported), but the
package did not work as expected on macOS.
2019-05-10 19:17:14 -04:00
Lang Martin
a9da4bac11 executor_linux only do path resolution in the taskDir, not local
split out lookPathIn to show it's similarity to exec.LookPath
2019-05-10 11:33:35 -04:00
Lang Martin
0c4a45fa16 executor_linux pass the command to lookupTaskBin to get path 2019-05-08 10:01:20 -04:00
Lang Martin
9688710c10 executor/* Launch log at top of Launch is more explicit, trace 2019-05-07 17:01:05 -04:00
Lang Martin
538f387d8d move lookupTaskBin to executor_linux, for os dependency clarity 2019-05-07 16:58:27 -04:00
Lang Martin
7a2fdf7a2d executor and executor_linux debug launch prep and process start 2019-05-03 14:42:57 -04:00
Lang Martin
83930dcb9c executor_linux call new lookupTaskBin 2019-05-03 11:55:19 -04:00
Mahmood Ali
9bf54eae97 comment what refer to 2019-04-19 09:49:04 -04:00
Mahmood Ali
b6af5c9dca Move libcontainer helper to executor package 2019-04-19 09:49:04 -04:00
Michael Schurter
56048bda0a executor/linux: comment this bizarre code 2019-04-02 11:25:45 -07:00
Michael Schurter
21e895e2e7 Revert "executor/linux: add defensive checks to binary path"
This reverts commit cb36f4537e.
2019-04-02 11:17:12 -07:00
Michael Schurter
cb36f4537e executor/linux: add defensive checks to binary path 2019-04-02 09:40:53 -07:00
Michael Schurter
254901a51e executor/linux: make chroot binary paths absolute
Avoid libcontainer.Process trying to lookup the binary via $PATH as the
executor has already found where the binary is located.
2019-04-01 15:45:31 -07:00
Mahmood Ali
a0d025e90d Revert "executor: synchronize exitState accesses" (#5449)
Reverts hashicorp/nomad#5433

Apparently, channel communications can constitute Happens-Before even for proximate variables, so this syncing isn't necessary.

> _The closing of a channel happens before a receive that returns a zero value because the channel is closed._
https://golang.org/ref/mem#tmp_7
2019-03-20 07:33:05 -04:00
Nick Ethier
d9d90fa5f0 Merge pull request #5429 from hashicorp/b-blocking-executor-shutdown
executor: block shutdown on process exiting
2019-03-19 15:18:01 -04:00
Mahmood Ali
989175fc59 executor: synchronize exitState accesses
exitState is set in `wait()` goroutine but accessed in a different
`Wait()` goroutine, so accesses must be synchronized by a lock.
2019-03-17 11:56:58 -04:00
Nick Ethier
c2c984ea50 executor: block shutdown on process exiting 2019-03-15 23:50:17 -04:00
Iskander (Alex) Sharipov
7cf58d08c1 drivers/shared/executor: fix strings.Replace call
strings.Replace call with n=0 argument makes no sense
as it will do nothing. Probably -1 is intended.

Signed-off-by: Iskander Sharipov <quasilyte@gmail.com>
2019-03-02 00:33:17 +03:00
Mahmood Ali
bd90b8db77 CVE-2019-5736: Update libcontainer depedencies (#5334)
* CVE-2019-5736: Update libcontainer depedencies

Libcontainer is vulnerable to a runc container breakout, that was
reported as CVE-2019-5736[1].  Upgrading vendored libcontainer with the fix.

The runc changes are captured in 369b920277 .

[1] https://seclists.org/oss-sec/2019/q1/119
2019-02-19 20:21:18 -05:00
Mahmood Ali
9f7619344e Merge pull request #5190 from hashicorp/f-memory-usage
Track Basic Memory Usage as reported by cgroups
2019-01-18 16:46:02 -05:00
Alex Dadgar
109c5ef650 Merge pull request #5173 from hashicorp/b-log-levels
Plugins use parent loggers
2019-01-14 16:14:30 -08:00
Mahmood Ali
b5c20aa50b Track Basic Memory Usage as reported by cgroups
Track current memory usage, `memory.usage_in_bytes`, in addition to
`memory.max_memory_usage_in_bytes` and friends.  This number is closer
what Docker reports.

Related to https://github.com/hashicorp/nomad/issues/5165 .
2019-01-14 18:47:52 -05:00
Nick Ethier
fbf9a4c772 executor: implement streaming stats API
plugins/driver: update driver interface to support streaming stats

client/tr: use streaming stats api

TODO:
 * how to handle errors and closed channel during stats streaming
 * prevent tight loop if Stats(ctx) returns an error

drivers: update drivers TaskStats RPC to handle streaming results

executor: better error handling in stats rpc

docker: better control and error handling of stats rpc

driver: allow stats to return a recoverable error
2019-01-12 12:18:22 -05:00
Alex Dadgar
270ae48b82 Plugins use parent loggers
This PR fixes various instances of plugins being launched without using
the parent loggers. This meant that logs would not all go to the same
output, break formatting etc.
2019-01-11 11:36:37 -08:00
Mahmood Ali
d1fbd735f3 Merge pull request #5157 from hashicorp/r-drivers-no-cstructs
drivers: avoid referencing client/structs package
2019-01-09 13:06:46 -05:00
Mahmood Ali
060588da31 executor: add a comment detailing isolation 2019-01-08 12:10:26 -05:00
Mahmood Ali
800a3522e3 drivers: re-export ResourceUsage structs
Re-export the ResourceUsage structs in drivers package to avoid drivers
directly depending on the internal client/structs package directly.

I attempted moving the structs to drivers, but that caused some import
cycles that was a bit hard to disentagle.  Alternatively, I added an
alias here that's sufficient for our purposes of avoiding external
drivers depend on internal packages, while allowing us to restructure
packages in future without breaking source compatibility.
2019-01-08 09:11:47 -05:00
Mahmood Ali
c26dfb000c drivers/exec: restrict devices exposed to tasks
We ultimately decided to provide a limited set of devices in exec/java
drivers instead of all of host ones.  Pre-0.9, we made all host devices
available to exec tasks accidentally, yet most applications only use a
small subset, and this choice limits our ability to restrict/isolate GPU
and other devices.

Starting with 0.9, by default, we only provide the same subset of
devices Docker provides, and allow users to provide more devices as
needed on case-by-case basis.

This reverts commit 5805c64a9f.
This reverts commit ff9a4a17e5.
2019-01-06 17:03:19 -05:00
Mahmood Ali
5805c64a9f driver/exec: use dedicated /dev mount (#5147)
Use a dedicated /dev mount so we can inject more devices if necessary,
and avoid allowing a container to contaminate host /dev.

Follow up to https://github.com/hashicorp/nomad/pull/5143 - and fixes master.
2019-01-04 10:36:05 -05:00
Mahmood Ali
ff9a4a17e5 drivers/exec: bind mount /dev into rootfs
Restores pre-0.9 behavior, where Nomad makes /dev available to exec
task.  Switching to libcontainer, we accidentally made only a small
subset available.

Here, we err on the side of preserving behavior of 0.8, instead of going
for the sensible route, where only a reasonable subset of devices is
mounted by default and user can opt to request more.
2019-01-03 14:29:18 -05:00
Alex Dadgar
d5512c39f0 Lint 2018-12-18 15:50:44 -08:00
Alex Dadgar
cd6879409c Drivers 2018-12-18 15:50:11 -08:00
Nick Ethier
8a344412e8 Merge branch 'master' into f-grpc-executor
* master: (71 commits)
  Fix output of 'nomad deployment fail' with no arg
  Always create a running allocation when testing task state
  tests: ensure exec tests pass valid task resources (#4992)
  some changes for more idiomatic code
  fix iops related tests
  fixed bug in loop delay
  gofmt
  improved code for readability
  client: updateAlloc release lock after read
  fixup! device attributes in `nomad node status -verbose`
  drivers/exec: support device binds and mounts
  fix iops bug and increase test matrix coverage
  tests: tag image explicitly
  changelog
  ci: install lxc-templates explicitly
  tests: skip checking rdma cgroup
  ci: use Ubuntu 16.04 (Xenial) in TravisCI
  client: update driver info on new fingerprint
  drivers/docker: enforce volumes.enabled (#4983)
  client: Style: use fluent style for building loggers
  ...
2018-12-13 14:41:09 -05:00
Mahmood Ali
97f33bb153 drivers/exec: support device binds and mounts 2018-12-11 18:35:21 -05:00
Alex Dadgar
f42c060d35 Merge pull request #4970 from hashicorp/f-no-iops
Deprecate IOPS
2018-12-11 12:51:22 -08:00
Mahmood Ali
2fb5e35012 Merge pull request #4950 from hashicorp/b-exc-libcontainer-kill
executor: kill all container processes
2018-12-08 09:52:42 -05:00
Nick Ethier
a6cb63a964 executor: don't drop errors when configuring libcontainer cfg, add nil check on resources 2018-12-07 14:03:42 -05:00
Nick Ethier
224c68860d executor: use drivers.Resources as resource model 2018-12-06 21:22:02 -05:00
Nick Ethier
0087a51a7a executor: remove structs package 2018-12-06 20:54:14 -05:00
Alex Dadgar
0953d913ed Deprecate IOPS
IOPS have been modelled as a resource since Nomad 0.1 but has never
actually been detected and there is no plan in the short term to add
detection. This is because IOPS is a bit simplistic of a unit to define
the performance requirements from the underlying storage system. In its
current state it adds unnecessary confusion and can be removed without
impacting any users. This PR leaves IOPS defined at the jobspec parsing
level and in the api/ resources since these are the two public uses of
the field. These should be considered deprecated and only exist to allow
users to stop using them during the Nomad 0.9.x release. In the future,
there should be no expectation that the field will exist.
2018-12-06 15:09:26 -08:00