---
layout: docs
page_title: group Block - Job Specification
description: |-
  The "group" block defines a series of tasks that should be co-located on the
  same Nomad client. Any task within a group will be placed on the same client.
---

# `group` Block

<Placement groups={['job', 'group']} />

The `group` block defines a series of tasks that should be co-located on the
same Nomad client. Any [task][] within a group will be placed on the same
client.

```hcl
job "docs" {
  group "example" {
    # ...
  }
}
```

## `group` Parameters

- `constraint` <code>([Constraint][]: nil)</code> -
  This can be provided multiple times to define additional constraints.

- `affinity` <code>([Affinity][]: nil)</code> - This can be provided
  multiple times to define preferred placement criteria.

- `spread` <code>([Spread][spread]: nil)</code> - This can be provided
  multiple times to define criteria for spreading allocations across a
  node attribute or metadata. See the
  [Nomad spread reference](/nomad/docs/job-specification/spread) for more
  details, and refer to the Spread example below.

- `count` `(int)` - Specifies the number of instances that should be running
  under this group. This value must be non-negative. This defaults to the
  `min` value specified in the [`scaling`](/nomad/docs/job-specification/scaling)
  block, if present; otherwise, this defaults to `1`.

- `consul` <code>([Consul][consul]: nil)</code> - Specifies Consul configuration
  options specific to the group. These options will be applied to all tasks and
  services in the group unless a task has its own `consul` block.

- `ephemeral_disk` <code>([EphemeralDisk][]: nil)</code> - Specifies the
  ephemeral disk requirements of the group. Ephemeral disks can be marked as
  sticky and support live data migrations. Refer to the Ephemeral Disk example
  below.

- `disconnect` <code>([disconnect][]: nil)</code> - Specifies the disconnect
  strategy for the server and client for all tasks in this group in case of a
  network partition. The tasks can be left unconnected, stopped, or replaced
  when the client disconnects. The policy for reconciliation in case the client
  regains connectivity is also specified here.

- `meta` <code>([Meta][]: nil)</code> - Specifies a key-value map that annotates
  the group with user-defined metadata.

- `migrate` <code>([Migrate][]: nil)</code> - Specifies the group strategy for
  migrating off of draining nodes. Only service jobs with a count greater than
  1 support migrate blocks.

- `network` <code>([Network][]: <optional>)</code> - Specifies the network
  requirements and configuration, including static and dynamic port allocations,
  for the group.

- `prevent_reschedule_on_lost` `(bool: false)` - Defines the replacement
  behavior of an allocation when the node it is running on misses heartbeats.
  When enabled, if the node disconnects or goes down, Nomad does not replace
  this allocation and shows it as `unknown` until the node reconnects or you
  manually restart the node.

  This behavior only modifies the replacement process on the server. To
  modify the allocation behavior on the client, refer to
  [`stop_after_client_disconnect`](#stop_after_client_disconnect).

  You must manually stop the `unknown` allocation to run it again:

  ```plaintext
  nomad alloc stop <alloc ID>
  ```

  Setting `max_client_disconnect` and `prevent_reschedule_on_lost = true` at the
  same time requires that [rescheduling is disabled entirely][`disable_rescheduling`].

  This field is deprecated in favor of `replace` on the [`disconnect`] block.
  Refer to the [example below][disconect_migration] for details about migrating.

- `reschedule` <code>([Reschedule][]: nil)</code> - Specifies a rescheduling
  strategy. Nomad will attempt to schedule the task on another node if any of
  the group's allocation statuses become "failed".

- `restart` <code>([Restart][]: nil)</code> - Specifies the restart policy for
  all tasks in this group. If omitted, a default policy exists for each job
  type, which can be found in the [restart block documentation][restart].

- `service` <code>([Service][]: nil)</code> - Specifies integrations with Nomad
  or [Consul](/nomad/docs/configuration/consul) for service discovery. Nomad
  automatically registers each service when an allocation is started and
  de-registers them when the allocation is destroyed.

- `shutdown_delay` `(string: "0s")` - Specifies the duration to wait when
  stopping a group's tasks. The delay occurs between Consul or Nomad service
  deregistration and sending each task a shutdown signal. Ideally, services
  would fail health checks once they receive a shutdown signal. Alternatively,
  `shutdown_delay` may be set to give in-flight requests time to complete
  before shutting down. A group level `shutdown_delay` will run regardless of
  whether there are any defined group [services](/nomad/docs/job-specification/group#service)
  and only applies to these services. In addition, tasks may have their own
  [`shutdown_delay`](/nomad/docs/job-specification/task#shutdown_delay) which waits
  between de-registering task services and stopping the task. Refer to the
  Shutdown Delay example below.

- `stop_after_client_disconnect` `(string: "")` - Specifies a duration after
  which a Nomad client will stop allocations, if it cannot communicate with the
  servers. By default, a client will not stop an allocation until explicitly
  told to by a server. A client that fails to heartbeat to a server within the
  [`heartbeat_grace`] window will be marked "lost", along with any allocations
  running on it, and Nomad will schedule replacement allocations. The replaced
  allocations will normally continue to run on the non-responsive client. But
  you may want them to stop instead, for example when allocations require
  exclusive access to an external resource. When specified, the Nomad client
  will stop them after this duration.
  The Nomad client process must be running for this to occur. This setting
  cannot be used with [`max_client_disconnect`].

  This field is deprecated in favor of `stop_after` on the [`disconnect`] block.

- `max_client_disconnect` `(string: "")` - Specifies a duration during which a
  Nomad client will attempt to reconnect allocations after it fails to heartbeat
  in the [`heartbeat_grace`] window. See [the example code
  below][max-client-disconnect] for more details. This setting cannot be used
  with [`stop_after_client_disconnect`].

  This field is deprecated in favor of `lost_after` on the [`disconnect`] block.

- `task` <code>([Task][]: <required>)</code> - Specifies one or more tasks to run
  within this group. This can be specified multiple times to add a task as part
  of the group.

- `update` <code>([Update][update]: nil)</code> - Specifies the task's update
  strategy. When omitted, a default update strategy is applied.

- `vault` <code>([Vault][]: nil)</code> - Specifies the set of Vault policies
  required by all tasks in this group. Overrides a `vault` block set at the
  `job` level.

- `volume` <code>([Volume][]: nil)</code> - Specifies the volumes that are
  required by tasks within the group.

## `group` Examples

The following examples only show the `group` blocks. Remember that the
`group` block is only valid in the placements listed above.

### Specifying Count

This example specifies that 5 instances of the tasks within this group should be
running:

```hcl
group "example" {
  count = 5
}
```

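As described for the `count` parameter above, when the group defines a
[`scaling`](/nomad/docs/job-specification/scaling) block and omits `count`, the
count defaults to the scaling block's `min` value. A minimal sketch; the values
are illustrative:

```hcl
group "example" {
  # count is omitted, so it defaults to the scaling block's min value (3).
  scaling {
    min = 3
    max = 10
  }
}
```
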
### Tasks with Constraint

This example shows two abbreviated tasks with a constraint on the group. This
restricts the tasks to nodes with the amd64 CPU architecture.

```hcl
group "example" {
  constraint {
    attribute = "${attr.cpu.arch}"
    value     = "amd64"
  }

  task "cache" {
    # ...
  }

  task "server" {
    # ...
  }
}
```

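### Spread

Where a constraint is a hard placement requirement, the `spread` parameter
described above expresses a soft preference for distributing allocations across
the values of a node attribute or metadata. A minimal sketch, assuming the
standard `${node.datacenter}` attribute; the count and weight are illustrative:

```hcl
group "example" {
  count = 10

  spread {
    # Prefer spreading the group's allocations evenly across datacenters.
    attribute = "${node.datacenter}"
    weight    = 100
  }

  task "server" {
    # ...
  }
}
```
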
### Metadata

This example shows arbitrary user-defined metadata on the group:

```hcl
group "example" {
  meta {
    my-key = "my-value"
  }
}
```

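### Ephemeral Disk

As noted for the `ephemeral_disk` parameter above, the group's ephemeral disk
can be marked as sticky and can request data migration between allocations. A
minimal sketch; the size is illustrative:

```hcl
group "example" {
  ephemeral_disk {
    # Make a best-effort attempt to migrate the ephemeral disk data if the
    # replacement allocation is placed on a different node.
    migrate = true

    # Size of the ephemeral disk in MB.
    size = 500

    # Make a best-effort attempt to place replacement allocations on the
    # same node, so the ephemeral disk data can be reused.
    sticky = true
  }

  task "cache" {
    # ...
  }
}
```
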
### Network

This example shows network constraints as specified in the [network][] block
which uses the `bridge` networking mode, dynamically allocates two ports, and
statically allocates one port:

```hcl
group "example" {
  network {
    mode = "bridge"

    port "http" {}
    port "https" {}

    port "lb" {
      static = "8889"
    }
  }
}
```

### Service Discovery

This example creates a service in Consul. To read more about service discovery
in Nomad, please see the [Nomad service discovery documentation][service_discovery].

```hcl
group "example" {
  network {
    port "api" {}
  }

  service {
    name = "example"
    port = "api"
    tags = ["default"]

    check {
      type     = "tcp"
      interval = "10s"
      timeout  = "2s"
    }
  }

  task "api" { ... }
}
```

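### Shutdown Delay

A minimal sketch of the group and task `shutdown_delay` parameters described
above; the service, durations, and task name are illustrative. The group-level
delay runs between deregistering the group's services and signalling its tasks,
while the task-level delay runs between deregistering that task's services and
stopping the task:

```hcl
group "example" {
  # Wait 10 seconds between deregistering the group's services and sending
  # each task a shutdown signal. Only applies because the group defines a
  # service.
  shutdown_delay = "10s"

  network {
    port "api" {}
  }

  service {
    name = "example"
    port = "api"
  }

  task "api" {
    # Wait 5 seconds between deregistering this task's services and
    # stopping the task.
    shutdown_delay = "5s"

    # ...
  }
}
```
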
### Stop After Client Disconnect

This example shows how `stop_after_client_disconnect` interacts with
other blocks. For the `first` group, after the default 10 second
[`heartbeat_grace`] window expires and 90 more seconds pass, the
server will replace the allocation. The client will wait 90 seconds
before sending a stop signal (`SIGTERM`) to the `first-task`
task. After 15 more seconds, because of the task's `kill_timeout`, the
client will send `SIGKILL`. The `second` group does not have
`stop_after_client_disconnect`, so the server will replace the
allocation after the 10 second [`heartbeat_grace`] expires. It will
not be stopped on the client, regardless of how long the client is out
of touch.

Note that if the servers' clocks are not closely synchronized with
each other, the server may replace the group before the client has
stopped the allocation. Operators should ensure that clock drift
between servers is as small as possible.

Note also that a group using this feature will be stopped on the
client if the Nomad server cluster fails, since the client will be
unable to contact any server in that case. Groups opting in to this
feature are therefore exposed to an additional runtime dependency and
potential point of failure.

```hcl
group "first" {
  stop_after_client_disconnect = "90s"

  task "first-task" {
    kill_timeout = "15s"
  }
}

group "second" {

  task "second-task" {
    kill_timeout = "5s"
  }
}
```

### Max Client Disconnect

`max_client_disconnect` specifies a duration during which a Nomad client will
attempt to reconnect allocations after it fails to heartbeat in the
[`heartbeat_grace`] window.

By default, allocations running on a client that fails to heartbeat will be
marked "lost". When a client reconnects, its allocations, which may still be
healthy, will restart because they have been marked "lost". This can cause
issues with stateful tasks or tasks with long restart times.

Instead, an operator may desire that these allocations reconnect without a
restart. When `max_client_disconnect` or `disconnect.lost_after` is specified,
the Nomad server marks clients that fail to heartbeat as "disconnected"
rather than "down", and will mark allocations on a disconnected client as
"unknown" rather than "lost". These allocations may continue to run on the
disconnected client. Replacement allocations will be scheduled according to the
allocations' `disconnect.replace` settings until the disconnected client
reconnects. Once a disconnected client reconnects, Nomad compares the "unknown"
allocations with their replacements and decides which ones to keep according
to the `disconnect.replace` setting. If the `max_client_disconnect` or
`disconnect.lost_after` duration expires before the client reconnects,
the allocations will be marked "lost".

Clients that contain "unknown" allocations remain "disconnected" rather than
"down" until the last `max_client_disconnect` or `disconnect.lost_after`
duration has expired.

In the example code below, if both of these task groups were placed on the same
client and that client experienced a network outage, both groups' allocations
would be marked as "unknown" after two minutes because of the server's
`heartbeat_grace` value of "2m". If the network outage continued for
eight hours, and the client continued to fail to heartbeat, the client would
remain in a "disconnected" state, as the first group's `max_client_disconnect`
is twelve hours. Once all groups' `max_client_disconnect` durations are
exceeded, in this case in twelve hours, the client node will be marked as "down"
and the allocations will be marked as "lost". If the client had reconnected
before twelve hours had passed, the allocations would gracefully reconnect
without a restart.

Max Client Disconnect is useful for edge deployments, or for scenarios in which
operators want zero on-client downtime due to node connectivity issues. This
setting cannot be used with [`stop_after_client_disconnect`].

```hcl
# server_config.hcl

server {
  enabled         = true
  heartbeat_grace = "2m"
}
```

```hcl
# jobspec.nomad

group "first" {
  max_client_disconnect = "12h"

  task "first-task" {
    ...
  }
}

group "second" {
  max_client_disconnect = "6h"

  task "second-task" {
    ...
  }
}
```

#### Max Client Disconnect and Prevent Reschedule On Lost

Setting `max_client_disconnect` and `prevent_reschedule_on_lost = true` at the
same time requires that [rescheduling is disabled entirely][`disable_rescheduling`].

```hcl
# jobspec.nomad

group "first" {
  max_client_disconnect      = "12h"
  prevent_reschedule_on_lost = true

  reschedule {
    attempts  = 0
    unlimited = false
  }

  task "first-task" {
    ...
  }
}
```

If [`max_client_disconnect`](#max_client_disconnect) is set and
`prevent_reschedule_on_lost = true`, allocations on disconnected nodes will be
`unknown` until the `max_client_disconnect` window expires, at which point
the node will transition from `disconnected` to `down`. The allocation
will remain `unknown` and won't be replaced.

#### Migration to `disconnect` block

The new configuration fields in the `disconnect` block work the same as the
ones they are replacing:

* `stop_after_client_disconnect` is replaced by `stop_after`
* `max_client_disconnect` is replaced by `lost_after`
* `prevent_reschedule_on_lost` is replaced by `replace`, with the meaning
  inverted: `prevent_reschedule_on_lost = true` corresponds to `replace = false`

To keep the same behavior as the old configuration upon reconnection, the
`reconcile` option should be set to `best_score`.

The following example shows how to migrate from the old configuration to the new one:

```hcl
job "docs" {
  group "example" {
    max_client_disconnect        = "6h"
    stop_after_client_disconnect = "2h"
    prevent_reschedule_on_lost   = true
  }
}
```

This configuration translates directly to:

```hcl
job "docs" {
  group "example" {
    disconnect {
      lost_after = "6h"
      stop_after = "2h"
      replace    = false
      reconcile  = "best_score"
    }
  }
}
```

All usage constraints still apply to the `disconnect` block as they did before:

- `stop_after` and `lost_after` can't be used together.

[task]: /nomad/docs/job-specification/task 'Nomad task Job Specification'
[job]: /nomad/docs/job-specification/job 'Nomad job Job Specification'
[constraint]: /nomad/docs/job-specification/constraint 'Nomad constraint Job Specification'
[consul]: /nomad/docs/job-specification/consul
[consul_namespace]: /nomad/docs/commands/job/run#consul-namespace
[spread]: /nomad/docs/job-specification/spread 'Nomad spread Job Specification'
[affinity]: /nomad/docs/job-specification/affinity 'Nomad affinity Job Specification'
[ephemeraldisk]: /nomad/docs/job-specification/ephemeral_disk 'Nomad ephemeral_disk Job Specification'
[`heartbeat_grace`]: /nomad/docs/configuration/server#heartbeat_grace
[`max_client_disconnect`]: /nomad/docs/job-specification/group#max_client_disconnect
[`disable_rescheduling`]: /nomad/docs/job-specification/reschedule#disabling-rescheduling
[max-client-disconnect]: /nomad/docs/job-specification/group#max-client-disconnect 'the example code below'
[`stop_after_client_disconnect`]: /nomad/docs/job-specification/group#stop_after_client_disconnect
[meta]: /nomad/docs/job-specification/meta 'Nomad meta Job Specification'
[migrate]: /nomad/docs/job-specification/migrate 'Nomad migrate Job Specification'
[network]: /nomad/docs/job-specification/network 'Nomad network Job Specification'
[reschedule]: /nomad/docs/job-specification/reschedule 'Nomad reschedule Job Specification'
[disconnect]: /nomad/docs/job-specification/disconnect 'Nomad disconnect Job Specification'
[`disconnect`]: /nomad/docs/job-specification/disconnect 'Nomad disconnect Job Specification'
[restart]: /nomad/docs/job-specification/restart 'Nomad restart Job Specification'
[service]: /nomad/docs/job-specification/service 'Nomad service Job Specification'
[service_discovery]: /nomad/docs/integrations/consul-integration#service-discovery 'Nomad Service Discovery'
[update]: /nomad/docs/job-specification/update 'Nomad update Job Specification'
[vault]: /nomad/docs/job-specification/vault 'Nomad vault Job Specification'
[volume]: /nomad/docs/job-specification/volume 'Nomad volume Job Specification'
[`consul.name`]: /nomad/docs/configuration/consul#name
[disconect_migration]: /nomad/docs/job-specification/group#migration_to_disconnect_block