diff --git a/website/content/tools/autoscaling/concepts/index.mdx b/website/content/tools/autoscaling/concepts/index.mdx new file mode 100644 index 000000000..c5575ff57 --- /dev/null +++ b/website/content/tools/autoscaling/concepts/index.mdx @@ -0,0 +1,52 @@ +--- +layout: docs +page_title: Autoscaling Concepts +description: > + This section covers concepts of the Nomad Autoscaler and explains + technical details of its operation. +--- + +# Nomad Autoscaler Concepts + +This section covers concepts of the Nomad Autoscaler and explains the technical +details of how it functions, its architecture, and sub-systems. + +The Nomad Autoscaler is modeled around the concept of a closed-loop control +system. These types of systems are often at the core of self-regulating +mechanisms because they are able to adjust some value based on the current +state of the system and some user provided configuration. An example of a +closed-loop control system is a thermostat, where you set the desired +temperature and the appliance will regulate the output of cold and hot air to +make sure the room stays at the value set. + +In closed-loop systems there are a few key components: + +* **Setpoint** is the desired output as defined by the user. +* **Comparator** computes the difference between the setpoint and current + state of the system. +* **Controller** connects all the components together and defines what + needs to be done to bring the system closer to the desired state. +* **Actuator** applies the changes defined by the controller. +* **System** is the entity being controlled. +* **Output** is the current value of the system. +* **Sensor** reads the system output and translates it to a value that can be + used by the controller. 
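The components listed above can be sketched as a tiny thermostat loop. This is an illustrative sketch only: the `Room` class, the proportional `gain`, and every function name are assumptions made for the example and are not part of the Nomad Autoscaler.

```python
class Room:
    """System: the entity being controlled (hypothetical)."""

    def __init__(self, temperature: float) -> None:
        self.temperature = temperature  # Output: current value of the system


def sensor(room: Room) -> float:
    """Sensor: reads the system output as a value the controller can use."""
    return room.temperature


def comparator(setpoint: float, measured: float) -> float:
    """Comparator: difference between the setpoint and the current state."""
    return setpoint - measured


def controller(error: float, gain: float = 0.5) -> float:
    """Controller: decides the change to apply (assumed proportional rule)."""
    return gain * error


def actuator(room: Room, change: float) -> None:
    """Actuator: applies the change defined by the controller."""
    room.temperature += change


def run_loop(room: Room, setpoint: float, steps: int) -> None:
    """Run the closed loop for a fixed number of iterations."""
    for _ in range(steps):
        error = comparator(setpoint, sensor(room))
        actuator(room, controller(error))


room = Room(temperature=15.0)
run_loop(room, setpoint=21.0, steps=10)
```

With a gain of 0.5 the remaining error halves on every iteration, so the room temperature converges toward the setpoint.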
+ +[![Closed-loop controller](/img/autoscaling/control-loop.png)](/img/autoscaling/control-loop.png) + +The Nomad Autoscaler follows this same base architecture and offloads some of +the components to [different types of plugins](/tools/autoscaling/concepts/plugins). + +* The autoscaling **policy** is how users define their desired outcome and + control the Nomad Autoscaler. +* **Target** is what users want to scale. It can be a job group, where the + number of allocations is scaled, or a set of Nomad clients, where the number + of nodes is what changes. +* **Strategy plugins** receive the current status of the scaling target (such + as the number of allocations of a group) and metrics of the system to compute + what actions need to be taken. +* **Target plugins** communicate with targets both to read their status and to + apply changes defined by the Autoscaler. +* **APM plugins** read application performance metrics from external sources. + +[![Nomad Autoscaler architecture](/img/autoscaling/autoscaler-arch.png)](/img/autoscaling/autoscaler-arch.png) diff --git a/website/content/tools/autoscaling/internals/plugins/apm.mdx b/website/content/tools/autoscaling/concepts/plugins/apm.mdx similarity index 100% rename from website/content/tools/autoscaling/internals/plugins/apm.mdx rename to website/content/tools/autoscaling/concepts/plugins/apm.mdx diff --git a/website/content/tools/autoscaling/internals/plugins/base.mdx b/website/content/tools/autoscaling/concepts/plugins/base.mdx similarity index 100% rename from website/content/tools/autoscaling/internals/plugins/base.mdx rename to website/content/tools/autoscaling/concepts/plugins/base.mdx diff --git a/website/content/tools/autoscaling/internals/plugins/index.mdx b/website/content/tools/autoscaling/concepts/plugins/index.mdx similarity index 100% rename from website/content/tools/autoscaling/internals/plugins/index.mdx rename to website/content/tools/autoscaling/concepts/plugins/index.mdx diff --git 
a/website/content/tools/autoscaling/internals/plugins/strategy.mdx b/website/content/tools/autoscaling/concepts/plugins/strategy.mdx similarity index 100% rename from website/content/tools/autoscaling/internals/plugins/strategy.mdx rename to website/content/tools/autoscaling/concepts/plugins/strategy.mdx diff --git a/website/content/tools/autoscaling/internals/plugins/target.mdx b/website/content/tools/autoscaling/concepts/plugins/target.mdx similarity index 100% rename from website/content/tools/autoscaling/internals/plugins/target.mdx rename to website/content/tools/autoscaling/concepts/plugins/target.mdx diff --git a/website/content/tools/autoscaling/concepts/policy-eval/checks.mdx b/website/content/tools/autoscaling/concepts/policy-eval/checks.mdx new file mode 100644 index 000000000..122baf22a --- /dev/null +++ b/website/content/tools/autoscaling/concepts/policy-eval/checks.mdx @@ -0,0 +1,104 @@ +--- +layout: docs +page_title: Checks +description: Learn about how the Autoscaler deals with policy checks. +--- + +# Scaling Policy Checks + +A scaling policy can include several [checks][policy_check] all of which +produce a scaling suggestion. Each check can specify its own source of metrics +data and apply different strategies based on the desired outcome. + +```hcl +policy { + # ... + check "cpu_allocated_percentage" { + source = "prometheus" + query = "..." + + strategy "target-value" { + target = 70 + } + } + + check "high-memory-usage" { + source = "prometheus" + query = "..." + group = "memory-usage" + + strategy "threshold" { + upper_bound = 100 + lower_bound = 70 + delta = 1 + } + } + + check "low-memory-usage" { + source = "prometheus" + query = "..." + group = "memory-usage" + + strategy "threshold" { + upper_bound = 30 + lower_bound = 0 + delta = -1 + } + } +} +``` + +## Resolving Conflicts + +The checks are all executed at the same time during a policy evaluation and +they can generate conflicting scaling actions. 
In a scenario like this, the +Autoscaler iterates over the results and chooses the safest option, which is +defined as the action that results in retaining the most capacity of the +resource. + +In a scenario where two checks return different desired scaling directions, the +following logic is applied. + +- `ScaleOut and ScaleIn => ScaleOut` +- `ScaleOut and ScaleNone => ScaleOut` +- `ScaleIn and ScaleNone => ScaleNone` + +In situations where the same action is suggested but with different counts, +the following logic is applied, where the count is the final desired value. + +- `ScaleOut(10) and ScaleOut(9) => ScaleOut(10)` +- `ScaleIn(3) and ScaleIn(4) => ScaleIn(4)` + +## Check Grouping + +The above logic for resolving conflicts only works when the checks are +independent of each other. If you use the same `query` in multiple `check` +blocks, or if the underlying data being queried is somehow correlated, only +one check will result in a scaling action. + +In the example above, the `high-memory-usage` and `low-memory-usage` checks use +the same query to retrieve memory usage information. We expect that memory +usage is either low or high (or neither), but never both at the same time. + +Without grouping, the target is never able to reduce its count, since the +possible resulting actions and the final scaling outcome can only be one of the +following: + +- `ScaleOut and ScaleNone => ScaleOut` +- `ScaleIn and ScaleNone => ScaleNone` +- `ScaleNone and ScaleNone => ScaleNone` + +To fix this problem, the correlated checks need to be set to the same `group`. 
+The Nomad Autoscaler then computes a single scaling action for the entire group +by applying slightly different logic: + +- `ScaleOut and ScaleIn => ScaleOut` +- `ScaleOut and ScaleNone => ScaleOut` +- `ScaleIn and ScaleNone => ScaleIn` +- `ScaleNone and ScaleNone => ScaleNone` + +`ScaleNone` results are ignored unless all checks in the group return it, so +a group is able to `ScaleIn` a target even when all other checks result in no +action. + +[policy_check]: /tools/autoscaling/policy#check-options diff --git a/website/content/tools/autoscaling/concepts/policy-eval/index.mdx b/website/content/tools/autoscaling/concepts/policy-eval/index.mdx new file mode 100644 index 000000000..4e0deef45 --- /dev/null +++ b/website/content/tools/autoscaling/concepts/policy-eval/index.mdx @@ -0,0 +1,35 @@ +--- +layout: docs +page_title: Autoscaling Policy Evaluation +description: > + This section covers how scaling policies are evaluated to generate scaling + actions. +--- + +# Policy Evaluation + +When the Nomad Autoscaler [agent] starts, it loads all the policies defined in +the configured [sources][agent_source] and monitors them for changes. Each +policy is assigned a handler that periodically sends the policy to a broker +where it is evaluated by a worker. The frequency at which the policy is enqueued is set +by its [`evaluation_interval`][policy_eval_interval]. + +The worker executes a series of steps by calling the different plugins used in +the policy to determine if a scaling action is needed and then to apply the +necessary actions. The worker then loops back to evaluate the next policy. + +If a scaling action is performed and the policy defines a +[`cooldown`][policy_cooldown] value, the policy handler waits for that +duration before enqueuing it again. + +If the policy target is a set of Nomad clients, the target plugin will usually execute +more steps, such as [selecting nodes to be removed][concepts_node_selector] and +draining them. 
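The conflict-resolution and check-grouping rules described on the Checks page can be sketched as follows. This is a minimal sketch under stated assumptions, not the Autoscaler's actual implementation: `Direction`, `Action`, `safest`, `resolve_independent`, and `resolve_group` are hypothetical names, and the sketch does not cover how a group result is combined with checks outside the group.

```python
from enum import IntEnum
from functools import reduce
from typing import NamedTuple, Optional, Sequence


class Direction(IntEnum):
    # Ordered so that a larger value retains more capacity.
    SCALE_IN = -1
    SCALE_NONE = 0
    SCALE_OUT = 1


class Action(NamedTuple):
    direction: Direction
    count: Optional[int] = None  # final desired count, when one is suggested


def safest(a: Action, b: Action) -> Action:
    """Pick the action that retains the most capacity."""
    if a.direction != b.direction:
        return a if a.direction > b.direction else b
    if a.count is None or b.count is None:
        return a
    # Same direction: keep the higher final count.
    return a if a.count >= b.count else b


def resolve_independent(actions: Sequence[Action]) -> Action:
    """Resolve results of checks that are independent of each other."""
    if not actions:
        return Action(Direction.SCALE_NONE)
    return reduce(safest, actions)


def resolve_group(actions: Sequence[Action]) -> Action:
    """Resolve results of checks sharing a group: ScaleNone results are
    ignored unless every check in the group returns ScaleNone."""
    active = [a for a in actions if a.direction != Direction.SCALE_NONE]
    return resolve_independent(active)


# The memory-usage example: the "high" check sees nothing to do,
# while the "low" check suggests scaling in.
high = Action(Direction.SCALE_NONE)
low = Action(Direction.SCALE_IN, 4)
print(resolve_independent([high, low]).direction.name)  # SCALE_NONE
print(resolve_group([high, low]).direction.name)        # SCALE_IN
```

Ordering the directions numerically keeps the "retain the most capacity" rule to a single comparison, and grouping only changes which results are allowed to take part in that comparison.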
+ +[![Scaling policy evaluation pipeline](/img/autoscaling/policy-eval.png)](/img/autoscaling/policy-eval.png) + +[agent]: /tools/autoscaling/agent +[agent_source]: /tools/autoscaling/agent/source +[concepts_node_selector]: /tools/autoscaling/concepts/policy-eval/node-selector-strategy +[policy_cooldown]: /tools/autoscaling/policy#cooldown +[policy_eval_interval]: /tools/autoscaling/policy#evaluation_interval diff --git a/website/content/tools/autoscaling/internals/node-selector-strategy.mdx b/website/content/tools/autoscaling/concepts/policy-eval/node-selector-strategy.mdx similarity index 100% rename from website/content/tools/autoscaling/internals/node-selector-strategy.mdx rename to website/content/tools/autoscaling/concepts/policy-eval/node-selector-strategy.mdx diff --git a/website/content/tools/autoscaling/internals/checks.mdx b/website/content/tools/autoscaling/internals/checks.mdx deleted file mode 100644 index 716ce8633..000000000 --- a/website/content/tools/autoscaling/internals/checks.mdx +++ /dev/null @@ -1,26 +0,0 @@ ---- -layout: docs -page_title: Checks -description: Learn about how the Autoscaler deals with policy checks. ---- - -# Nomad Autoscaler Check Calculations - -A scaling policy can include several checks all of which produce a scaling -suggesting. The checks are executed at the same time during a policy evaluation -and the results can conflict with each other. In a scenario like this, the -autoscaler iterates the results the chooses the safest result which results in -retaining the most capacity of the resource. - -In a scenario where two checks return different desired directions, the following -logic is applied. - -- `ScaleOut and ScaleIn => ScaleOut` -- `ScaleOut and ScaleNone => ScaleOut` -- `ScaleIn and ScaleNone => ScaleNone` - -In situations where the two same actions are suggested, but with different counts the -following logic is applied, where the count is the absolute desired value. 
- -- `ScaleOut(10) and ScaleOut(9) => ScaleOut(10)` -- `ScaleIn(3) and ScaleIn(4) => ScaleIn(4)` diff --git a/website/content/tools/autoscaling/internals/index.mdx b/website/content/tools/autoscaling/internals/index.mdx deleted file mode 100644 index b7844e129..000000000 --- a/website/content/tools/autoscaling/internals/index.mdx +++ /dev/null @@ -1,15 +0,0 @@ ---- -layout: docs -page_title: Internals -description: > - This section covers the internals of the Nomad Autoscaler and explains - technical details of its operation. ---- - -# Nomad Autoscaler Internals - -This section covers the internals of the Nomad Autoscaler and explains the -technical details of how it functions, its architecture, and sub-systems. - -- [Autoscaler plugins](/tools/autoscaling/internals/plugins) -- [Check calculations](/tools/autoscaling/internals/checks) diff --git a/website/content/tools/autoscaling/plugins/strategy/threshold.mdx b/website/content/tools/autoscaling/plugins/strategy/threshold.mdx index e5692ead0..f15b4e081 100644 --- a/website/content/tools/autoscaling/plugins/strategy/threshold.mdx +++ b/website/content/tools/autoscaling/plugins/strategy/threshold.mdx @@ -14,6 +14,10 @@ Multiple tiers can be defined by declaring more than one `check` in the same scaling policy. If there is any overlap between the bounds, the [safest `check`][internals_check] will be used. +~> **Note:** When using the `threshold` strategy with multiple checks make sure + they all have the same [`group`][policy_group] value, otherwise your target + may not be able to scale down. + ## Agent Configuration Options ```hcl @@ -29,6 +33,8 @@ policy { # ... check "high-memory-usage" { # ... + group = "memory-usage" + strategy "threshold" { upper_bound = 100 lower_bound = 70 @@ -36,8 +42,10 @@ policy { } } - check "low-memory-traffic" { + check "low-memory-usage" { # ... + group = "memory-usage" + strategy "threshold" { upper_bound = 30 lower_bound = 0 @@ -66,7 +74,8 @@ policy { as the new target count. 
Conflicts with `delta` and `percentage`. - `within_bounds_trigger` `(int: 5)` - The number of data points in the query - result time series that must be within the bound valus to trigger the action. + result time series that must be within the bound values to trigger the + action. At least one of `lower_bound` or `upper_bound` must be defined. If `lower_bound` is not defined, any value below `upper_bound` is considered @@ -76,3 +85,4 @@ within bounds. Similarly, if `upper_bound` is not defined, any value above One, and only one, of `delta`, `percentage`, or `value` must be defined. [internals_check]: /tools/autoscaling/internals/checks +[policy_group]: /tools/autoscaling/policy#group diff --git a/website/content/tools/autoscaling/policy.mdx b/website/content/tools/autoscaling/policy.mdx index 35394c8d4..cc64723dc 100644 --- a/website/content/tools/autoscaling/policy.mdx +++ b/website/content/tools/autoscaling/policy.mdx @@ -67,6 +67,10 @@ horizontal application scaling or horizontal cluster scaling. - `query_window` - Defines how far back to query the APM for metrics. It should be provided as a duration (e.g.: `"5s"`, `"1m"`). Defaults to `1m`. +- `group` - Specifies which checks should be treated as correlated when the policy + is evaluated. Refer to [Check Grouping][concepts_grouping] for more + information. + - `on_error` - Defines how to handle errors during the `check` evaluation. Possible values are `"fail"` or `"ignore"`. 
If set to `"fail"` the policy evaluation will stop in case an error occurs and not scaling action will take @@ -236,6 +240,7 @@ scaling "mem" { } ``` +[concepts_grouping]: /tools/autoscaling/concepts/policy-eval/checks#check-grouping [das]: /tools/autoscaling#dynamic-application-sizing [policy_default_cooldown_agent]: /tools/autoscaling/agent#default_cooldown [eval_interval_agent]: /tools/autoscaling/agent#default_evaluation_interval diff --git a/website/data/tools-nav-data.json b/website/data/tools-nav-data.json index 34ad0ecce..51a278df1 100644 --- a/website/data/tools-nav-data.json +++ b/website/data/tools-nav-data.json @@ -1,4 +1,8 @@ [ + { + "title": "Overview", + "path": "index" + }, { "title": "Autoscaling", "routes": [ @@ -6,6 +10,57 @@ "title": "Overview", "path": "autoscaling" }, + { + "title": "Concepts", + "routes": [ + { + "title": "Overview", + "path": "autoscaling/concepts" + }, + { + "title": "Policy Evaluation", + "routes": [ + { + "title": "Overview", + "path": "autoscaling/concepts/policy-eval" + }, + { + "title": "Checks", + "path": "autoscaling/concepts/policy-eval/checks" + }, + { + "title": "Node Selector Strategy", + "path": "autoscaling/concepts/policy-eval/node-selector-strategy" + } + ] + }, + { + "title": "Plugins", + "routes": [ + { + "title": "Overview", + "path": "autoscaling/concepts/plugins" + }, + { + "title": "Base", + "path": "autoscaling/concepts/plugins/base" + }, + { + "title": "APM", + "path": "autoscaling/concepts/plugins/apm" + }, + { + "title": "Strategy", + "path": "autoscaling/concepts/plugins/strategy" + }, + { + "title": "Target", + "path": "autoscaling/concepts/plugins/target" + } + ] + } + ] + }, { "title": "Agent", "routes": [ @@ -166,48 +221,6 @@ "path": "autoscaling/plugins/external" } ] - }, - { - "title": "Internals", - "routes": [ - { - "title": "Overview", - "path": "autoscaling/internals" - }, - { - "title": "Checks", - "path": "autoscaling/internals/checks" - }, - { - "title": "Node Selector Strategy", - 
"path": "autoscaling/internals/node-selector-strategy" - }, - { - "title": "Plugins", - "routes": [ - { - "title": "Overview", - "path": "autoscaling/internals/plugins" - }, - { - "title": "Base", - "path": "autoscaling/internals/plugins/base" - }, - { - "title": "APM", - "path": "autoscaling/internals/plugins/apm" - }, - { - "title": "Strategy", - "path": "autoscaling/internals/plugins/strategy" - }, - { - "title": "Target", - "path": "autoscaling/internals/plugins/target" - } - ] - } - ] } ] } diff --git a/website/public/img/autoscaling/autoscaler-arch.png b/website/public/img/autoscaling/autoscaler-arch.png new file mode 100644 index 000000000..141533e34 Binary files /dev/null and b/website/public/img/autoscaling/autoscaler-arch.png differ diff --git a/website/public/img/autoscaling/control-loop.png b/website/public/img/autoscaling/control-loop.png new file mode 100644 index 000000000..5f8d07aeb Binary files /dev/null and b/website/public/img/autoscaling/control-loop.png differ diff --git a/website/public/img/autoscaling/policy-eval.png b/website/public/img/autoscaling/policy-eval.png new file mode 100644 index 000000000..636ecc14d Binary files /dev/null and b/website/public/img/autoscaling/policy-eval.png differ diff --git a/website/redirects.js b/website/redirects.js index d7faf17f9..bb9a2cf78 100644 --- a/website/redirects.js +++ b/website/redirects.js @@ -1 +1,18 @@ -module.exports = [] +module.exports = [ + // Rename and re-arrange Autoscaling Internals section + { + source: '/nomad/tools/autoscaling/internals/:path*', + destination: '/nomad/tools/autoscaling/concepts/:path*', + permanent: true, + }, + { + source: '/nomad/tools/autoscaling/concepts/checks', + destination: '/nomad/tools/autoscaling/concepts/policy-eval/checks', + permanent: true, + }, + { + source: '/nomad/tools/autoscaling/concepts/node-selector-strategy', + destination: '/nomad/tools/autoscaling/concepts/policy-eval/node-selector-strategy', + permanent: true, + }, +]