From 08f386e8e569ae414b48c0eb9abee6013af4dcd9 Mon Sep 17 00:00:00 2001 From: Juanadelacuesta <8647634+Juanadelacuesta@users.noreply.github.com> Date: Tue, 11 Mar 2025 17:44:17 +0100 Subject: [PATCH 1/5] docs: Add section of readme to add workloads --- enos/README.md | 75 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 75 insertions(+) diff --git a/enos/README.md b/enos/README.md index 50bd4f1a3..0d1dc1d2e 100644 --- a/enos/README.md +++ b/enos/README.md @@ -124,3 +124,78 @@ $ $(./debug-environment .enos/c545bbc25c5eec0ca86c99595a9034b5451a91aa10b586da2b Note that this won't be fully populated until the Enos scenario is far enough along to bootstrap the Nomad cluster. + +## Adding New Workloads + +All workloads executed as part of the test suite are stored under +`enos/modules/run_workloads/jobs` and must be defined with the following +attributes: + +### Required Attributes + +- **`job_spec`** *(string)*: Path to the job specification for your workload. + The path should be relative to the `run_workloads` module. + For example: `jobs/raw-exec-service.nomad.hcl`. + +- **`alloc_count`** *(number)*: This variable serves two purposes: + 1. Every workload must define the `alloc_count` variable, regardless of + whether it is actively used. + This is because jobs are executed using [this command](https://github.com/hashicorp/nomad/blob/1ffb7ab3fb0dffb0e530fd3a8a411c7ad8c72a6a/enos/modules/run_workloads/main.tf#L66): + + ```hcl + variable "alloc_count" { + type = number + } + ``` + + 2. It is used to calculate the expected number of allocations in the cluster + once all jobs are running. + + If the variable is missing or left undefined, the job will fail to run, + which will impact the upgrade scenario. + + For `system` jobs, the number of allocations is determined by the number + of nodes. In such cases, `alloc_count` is conventionally set to `0`, + as it is not directly used. + +- **`type`** *(string)*: Specifies the type of workload—`service`, `batch`, or +`system`. Setting the correct type is important, as it affects the calculation +of the total number of expected allocations in the cluster. + +### Optional Attributes + +The following attributes are only required if your workload has prerequisites +or final configurations before it is fully operational. For example, a job using +`tproxy` may require a new intention to be configured in Consul. + +- **`pre_script`** *(optional, string)*: Path to a script that should be +executed before the job runs. +- **`post_script`** *(optional, string)*: Path to a script that should be + executed after the job runs. + +All scripts are located in `enos/modules/run_workloads/scripts`. +Similar to `job_spec`, the path should be relative to the `run_workloads` +module. Example: `scripts/wait_for_nfs_volume.sh`. + +### Adding a New Workload + +If you want to add a new workload to test a specific feature, follow these steps: + +1. Modify the `run_initial_workloads` [step](https://github.com/hashicorp/nomad/blob/main/enos/enos-scenario-upgrade.hcl) +in `enos-scenario-upgrade.hcl` and include your workload in the `workloads` +variable. + +2. Add the job specification and any necessary pre/post scripts to the +appropriate directories: + - [`enos/modules/run_workloads/jobs`](https://github.com/hashicorp/nomad/tree/main/enos/modules/run_workloads/jobs) + - [`enos/modules/run_workloads/scripts`](https://github.com/hashicorp/nomad/tree/main/enos/modules/run_workloads/scripts) + +**Important:** Ensure that the `alloc_count` variable is included in the job +specification. If it is missing or undefined, the job will fail to run, +potentially disrupting the upgrade scenario. + +If you want to verify your workload without having to run all the scenario, +you can manually pass values to variables with flags or a `.tfvars` +files and run the module from the `run_workloads` directory like you would any +other terraform module. + From 859f257d32e9d74ddcd7dc13a688d498210f1623 Mon Sep 17 00:00:00 2001 From: Juana De La Cuesta Date: Tue, 11 Mar 2025 17:48:45 +0100 Subject: [PATCH 2/5] Update README.md --- enos/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/enos/README.md b/enos/README.md index 0d1dc1d2e..3a9785fa9 100644 --- a/enos/README.md +++ b/enos/README.md @@ -196,6 +196,6 @@ potentially disrupting the upgrade scenario. If you want to verify your workload without having to run all the scenario, you can manually pass values to variables with flags or a `.tfvars` -files and run the module from the `run_workloads` directory like you would any +file and run the module from the `run_workloads` directory like you would any other terraform module. From b1ea04a4d1aaf7350f15be0840e50b50be96e55c Mon Sep 17 00:00:00 2001 From: Juana De La Cuesta Date: Tue, 11 Mar 2025 17:50:26 +0100 Subject: [PATCH 3/5] Update README.md --- enos/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/enos/README.md b/enos/README.md index 3a9785fa9..57ae09254 100644 --- a/enos/README.md +++ b/enos/README.md @@ -147,7 +147,7 @@ attributes: type = number } ``` - + This is done to force the job spec author to add a value to the `alloc_count`. 2. It is used to calculate the expected number of allocations in the cluster once all jobs are running. From 3de2a6b1d6b616c1c01209815454e96bb5e7ff34 Mon Sep 17 00:00:00 2001 From: Juana De La Cuesta Date: Tue, 11 Mar 2025 17:51:56 +0100 Subject: [PATCH 4/5] Update README.md --- enos/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/enos/README.md b/enos/README.md index 57ae09254..4115b07df 100644 --- a/enos/README.md +++ b/enos/README.md @@ -181,7 +181,7 @@ module. Example: `scripts/wait_for_nfs_volume.sh`. If you want to add a new workload to test a specific feature, follow these steps: -1. Modify the `run_initial_workloads` [step](https://github.com/hashicorp/nomad/blob/main/enos/enos-scenario-upgrade.hcl) +1. Modify the `run_initial_workloads` [step](https://github.com/hashicorp/nomad/blob/04db81951fd0f6b7cc543410585a4da0d70a354a/enos/enos-scenario-upgrade.hcl#L139) in `enos-scenario-upgrade.hcl` and include your workload in the `workloads` variable. From ebeb3047c89bb168b556572a33065bc5012efeec Mon Sep 17 00:00:00 2001 From: Juanadelacuesta <8647634+Juanadelacuesta@users.noreply.github.com> Date: Wed, 12 Mar 2025 16:51:03 +0100 Subject: [PATCH 5/5] docs: add note about workloads life expectancy --- enos/README.md | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/enos/README.md b/enos/README.md index 4115b07df..804745550 100644 --- a/enos/README.md +++ b/enos/README.md @@ -127,7 +127,8 @@ along to bootstrap the Nomad cluster. ## Adding New Workloads -All workloads executed as part of the test suite are stored under +As part of the testing process some test workloads are dispatched and are +expected to run during all the update process, they are stored under `enos/modules/run_workloads/jobs` and must be defined with the following attributes: @@ -190,10 +191,17 @@ appropriate directories: - [`enos/modules/run_workloads/jobs`](https://github.com/hashicorp/nomad/tree/main/enos/modules/run_workloads/jobs) - [`enos/modules/run_workloads/scripts`](https://github.com/hashicorp/nomad/tree/main/enos/modules/run_workloads/scripts) -**Important:** Ensure that the `alloc_count` variable is included in the job +**Important:** +* Ensure that the `alloc_count` variable is included in the job specification. If it is missing or undefined, the job will fail to run, potentially disrupting the upgrade scenario. +* During normal execution of the test and to verify the health of the cluster, +the number of jobs and allocs running is verified multiple times at different +stages of the process. Make sure your job has a health check, to ensure it will +be restarted in case of unexpected failures and if it is a batch job, +it will not exit before the test has concluded. + If you want to verify your workload without having to run all the scenario, you can manually pass values to variables with flags or a `.tfvars` file and run the module from the `run_workloads` directory like you would any