diff --git a/website/data/docs-navigation.js b/website/data/docs-navigation.js index d0b3e5480..024801488 100644 --- a/website/data/docs-navigation.js +++ b/website/data/docs-navigation.js @@ -49,8 +49,9 @@ export default [ content: ['scheduling', 'preemption'], }, 'consensus', + 'filesystem', 'gossip', - 'security', + 'security' ], }, { diff --git a/website/pages/docs/internals/filesystem.mdx b/website/pages/docs/internals/filesystem.mdx new file mode 100644 index 000000000..d9975802b --- /dev/null +++ b/website/pages/docs/internals/filesystem.mdx @@ -0,0 +1,465 @@ +--- +layout: docs +page_title: Filesystem +sidebar_title: Filesystem +description: |- + Nomad creates an allocation working directory for every allocation. Learn what + goes into the working directory and how it interacts with Nomad task drivers. +--- + +# Filesystem + +Nomad creates a working directory for each allocation on a client. This +directory can be found in the Nomad [`data_dir`] at +`./allocs/«alloc_id»`. The allocation working directory is where Nomad +creates task directories and directories shared between tasks, writes logs for +tasks, and downloads artifacts or templates. + +An allocation with two tasks (named `task1` and `task2`) will have an +allocation directory like the one below. + +```shell-session +. +├── alloc +│ ├── data +│ ├── logs +│ │ ├── task1.stderr.0 +│ │ ├── task1.stdout.0 +│ │ ├── task2.stderr.0 +│ │ └── task2.stdout.0 +│ └── tmp +├── task1 +│ ├── local +│ ├── secrets +│ └── tmp +└── task2 + ├── local + ├── secrets + └── tmp +``` + +- **alloc/**: This directory is shared across all tasks in an allocation and + can be used to store data that needs to be used by multiple tasks, such as a + log shipper. This is the directory that's provided to the task as the + `NOMAD_ALLOC_DIR`. Note that this `alloc/` directory is not the same as the + "allocation working directory", which is the top-level directory. All tasks + in a task group can read and write to the `alloc/` directory. 
Within the + `alloc/` directory are three standard directories: + + - **alloc/data/**: This directory is the location used by the + [`ephemeral_disk`] stanza for shared data. + + - **alloc/logs/**: This directory is the location of the log files for every + task within an allocation. The `nomad alloc logs` command streams these + files to your terminal. + + - **alloc/tmp/**: A temporary directory used as scratch space by task drivers. + +- **«taskname»**: Each task has a **task working directory** with the same name as + the task. Tasks in a task group can't read each other's task working + directory. Depending on the task driver's [filesystem isolation mode], a + task may not be able to access the task working directory. Within the + `«taskname»/` directory are three standard directories: + + - **«taskname»/local/**: This directory is the location provided to the task as the + `NOMAD_TASK_DIR`. Note this is not the same as the "task working + directory". This directory is private to the task. + + - **«taskname»/secrets/**: This directory is the location provided to the task as + `NOMAD_SECRETS_DIR`. The contents of files in this directory cannot be read + by the `nomad alloc fs` command. It can be used to store secret data that + should not be visible outside the task. + + - **«taskname»/tmp/**: A temporary directory used as scratch space by task drivers. + +The allocation working directory is the directory you see when using the +`nomad alloc fs` command. 
If you were to run `nomad alloc fs` against the +allocation that made the working directory shown above, you'd see the +following: + +```shell-session +$ nomad alloc fs c0b2245f +Mode Size Modified Time Name +drwxrwxrwx 4.0 KiB 2020-10-27T18:00:39Z alloc/ +drwxrwxrwx 4.0 KiB 2020-10-27T18:00:32Z task1/ +drwxrwxrwx 4.0 KiB 2020-10-27T18:00:39Z task2/ + +$ nomad alloc fs c0b2245f alloc/ +Mode Size Modified Time Name +drwxrwxrwx 4.0 KiB 2020-10-27T18:00:32Z data/ +drwxrwxrwx 4.0 KiB 2020-10-27T18:00:39Z logs/ +drwxrwxrwx 4.0 KiB 2020-10-27T18:00:32Z tmp/ + +$ nomad alloc fs c0b2245f task1/ +Mode Size Modified Time Name +drwxrwxrwx 4.0 KiB 2020-10-27T18:00:33Z local/ +drwxrwxrwx 60 B 2020-10-27T18:00:32Z secrets/ +dtrwxrwxrwx 4.0 KiB 2020-10-27T18:00:32Z tmp/ +``` + +## Task Drivers and Filesystem Isolation Modes + +Depending on the task driver, the task's working directory may also be the +root directory for the running task. This is determined by the task driver's +[filesystem isolation capability]. + +### `image` isolation + +Task drivers like `docker` or `qemu` use `image` isolation, where the task +driver isolates task filesystems as machine images. These filesystems are +owned by the task driver's external process and not by Nomad itself. These +filesystems will not typically be found anywhere in the allocation working +directory. For example, Docker containers will have their overlay filesystem +unpacked to `/var/run/docker/containerd/«container_id»` by default. + +Nomad will provide the `NOMAD_ALLOC_DIR`, `NOMAD_TASK_DIR`, and +`NOMAD_SECRETS_DIR` to tasks with `image` isolation, typically by +bind-mounting them to the task driver's filesystem. 
+ +You can see an example of `image` isolation by running the following minimal +job: + +```hcl +job "example" { + datacenters = ["dc1"] + + task "task1" { + driver = "docker" + + config { + image = "redis:6.0" + } + } +} +``` + +If you look at the allocation working directory from the host, you'll see a +minimal filesystem tree: + +```shell-session +. +├── alloc +│ ├── data +│ ├── logs +│ │ ├── task1.stderr.0 +│ │ └── task1.stdout.0 +│ └── tmp +└── task1 + ├── local + ├── secrets + └── tmp +``` + +The `nomad alloc fs` command shows the same bare directory tree: + +```shell-session +$ nomad alloc fs b0686b27 +Mode Size Modified Time Name +drwxrwxrwx 4.0 KiB 2020-10-27T18:51:54Z alloc/ +drwxrwxrwx 4.0 KiB 2020-10-27T18:51:54Z task1/ + +$ nomad alloc fs b0686b27 task1 +Mode Size Modified Time Name +drwxrwxrwx 4.0 KiB 2020-10-27T18:51:54Z local/ +drwxrwxrwx 60 B 2020-10-27T18:51:54Z secrets/ +dtrwxrwxrwx 4.0 KiB 2020-10-27T18:51:54Z tmp/ + +$ nomad alloc fs b0686b27 task1/local +Mode Size Modified Time Name +``` + +If you inspect the Docker container that's created, you'll see three +directories bind-mounted into the container: + +```shell-session +$ docker inspect 32e | jq '.[0].HostConfig.Binds' +[ + "/var/nomad/alloc/b0686b27-8af3-8252-028f-af485c81a8b3/alloc:/alloc", + "/var/nomad/alloc/b0686b27-8af3-8252-028f-af485c81a8b3/task1/local:/local", + "/var/nomad/alloc/b0686b27-8af3-8252-028f-af485c81a8b3/task1/secrets:/secrets" +] +``` + +The root filesystem inside the container can see these three mounts, along +with the rest of the container filesystem: + +```shell-session +$ docker exec -it 32e /bin/sh +# ls / +alloc boot dev home lib64 media opt root sbin srv tmp var +bin data etc lib local mnt proc run secrets sys usr +``` + +Note that because the three directories are bind-mounted into the container +filesystem, nothing written outside those three directories elsewhere in the +allocation working directory will be accessible inside the container. 
This +means templates, artifacts, and dispatch payloads for tasks with `image` +isolation must be written into the `NOMAD_ALLOC_DIR`, `NOMAD_TASK_DIR`, or +`NOMAD_SECRETS_DIR`. + +To work around this limitation, you can use the task driver's mounting +capabilities to mount one of the three directories to another location in the +task. For example, with the Docker driver you can use the driver's `mounts` +block to bind a secret written by a `template` block to the +`NOMAD_SECRETS_DIR` into a configuration directory elsewhere in the task: + +```hcl +job "example" { + datacenters = ["dc1"] + + task "task1" { + driver = "docker" + + config { + image = "redis:6.0" + mounts = [{ + type = "bind" + source = "secrets" + target = "/etc/redis.d" + readonly = true + }] + + template { + destination = "${NOMAD_SECRETS_DIR}/redis.conf" + data = <)` - Specifies the location where the resulting template should be rendered, relative to the [task working directory]. Only drivers without filesystem isolation (ex. `raw_exec`) or - that buiold a chroot in the task working directory (ex. `exec`) can render + that build a chroot in the task working directory (ex. `exec`) can render templates outside of the `NOMAD_ALLOC_DIR`, `NOMAD_TASK_DIR`, or - `NOMAD_SECRETS_DIR`. + `NOMAD_SECRETS_DIR`. For more details on how `destination` interacts with + task drivers, see the [Filesystem internals] documentation. - `env` `(bool: false)` - Specifies the template should be read back in as environment variables for the task. 
([See below](#environment-variables)) @@ -385,3 +386,4 @@ options](/docs/configuration/client#options): [nodevars]: /docs/runtime/interpolation#interpreted_node_vars 'Nomad Node Variables' [go-envparse]: https://github.com/hashicorp/go-envparse#readme 'The go-envparse Readme' [task working directory]: /docs/runtime/environment#task-directories 'Task Directories' +[Filesystem internals]: /docs/internals/filesystem#templates-artifacts-and-dispatch-payloads diff --git a/website/pages/docs/runtime/environment.mdx b/website/pages/docs/runtime/environment.mdx index a328d67f9..e74269dba 100644 --- a/website/pages/docs/runtime/environment.mdx +++ b/website/pages/docs/runtime/environment.mdx @@ -25,9 +25,9 @@ environment variable names such as `NOMAD_ADDR__