Commit Graph

263 Commits

Author SHA1 Message Date
Seth Hoenig
a869394a03 env_aws: combine 3 log lines into 1 2020-04-29 10:47:36 -06:00
Seth Hoenig
0d5d1781d3 env_aws: downgrade log line
Co-Authored-By: Mahmood Ali <mahmood@hashicorp.com>
2020-04-29 10:34:26 -06:00
Seth Hoenig
f47c57fa2d env_aws: fixup log line
Co-Authored-By: Mahmood Ali <mahmood@hashicorp.com>
2020-04-29 10:33:53 -06:00
Seth Hoenig
9230fa9eff env_aws: use best-effort lookup table for CPU performance in EC2
Fixes #7681

The current behavior of the CPU fingerprinter in AWS is that it
reads the **current** speed from `/proc/cpuinfo` (`CPU MHz` field).

This is because the max CPU frequency is not available by reading
anything on the EC2 instance itself. Normally on Linux one would
look at e.g. `/sys/devices/system/cpu/cpuN/cpufreq/cpuinfo_max_freq`
or perhaps parse the values from the `CPU max MHz` field in
`/proc/cpuinfo`, but those values are not available.

Furthermore, no metadata about the CPU is made available in the
EC2 metadata service.
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-categories.html

Since `go-psutil` cannot determine the max CPU speed, it defaults to
the current CPU speed, which can be any value between 0 and the true
max. This is particularly bad on large, powerful reserved instances,
which often idle at ~800 MHz while Nomad does its fingerprinting
(typically IO bound). Nomad then treats that idle speed as the max,
which results in a severe loss of available resources.

Since the CPU specification is unavailable programmatically (at least
not without sudo), use a best-effort lookup table. This table was
generated by going through every instance type in AWS documentation
and copy-pasting the numbers.
https://aws.amazon.com/ec2/instance-types/

This approach obviously is not ideal as future instance types will
need to be added as they are introduced to AWS. However, using the
table should only be an improvement over the status quo since right
now Nomad miscalculates available CPU resources on all instance types.
2020-04-28 19:01:33 -06:00
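The lookup-table approach described in commit 9230fa9eff could look roughly like the sketch below. It is not the code from that commit: the type and function names are hypothetical, and the instance types and numbers are made-up placeholders rather than real AWS specifications. The idea shown is only that a known instance type takes its capacity from the table, while an unknown one falls back to whatever was detected at runtime.

```go
package main

import "fmt"

// cpuSpec is a hypothetical record describing one instance type.
type cpuSpec struct {
	cores int
	mhz   int // advertised per-core speed in MHz
}

// ticks returns the total compute capacity in MHz (cores * per-core MHz).
func (s cpuSpec) ticks() int { return s.cores * s.mhz }

// cpuTable is a best-effort lookup table. These entries are placeholders
// for illustration only, not real EC2 instance types or speeds.
var cpuTable = map[string]cpuSpec{
	"example.large":  {cores: 2, mhz: 2500},
	"example.xlarge": {cores: 4, mhz: 3000},
}

// totalCompute prefers the table entry for instanceType. When the type is
// not in the table it falls back to the value detected at runtime. The
// boolean reports whether the table was used.
func totalCompute(instanceType string, detectedTicks int) (int, bool) {
	if spec, ok := cpuTable[instanceType]; ok {
		return spec.ticks(), true
	}
	return detectedTicks, false
}

func main() {
	// An idle 2-core host might report about 800 MHz per core.
	idleDetected := 2 * 800

	ticks, fromTable := totalCompute("example.large", idleDetected)
	fmt.Println(ticks, fromTable)

	ticks, fromTable = totalCompute("unknown.type", idleDetected)
	fmt.Println(ticks, fromTable)
}
```

The fallback keeps the previous behavior for instance types that have not been added to the table yet, which matches the commit's point that the table should only improve on the status quo.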
Mahmood Ali
6199e96972 fixup! tests: Add tests for EC2 Metadata immitation cases 2020-03-26 11:37:54 -04:00
Mahmood Ali
0c1dd0e75b fixup! tests: Add tests for EC2 Metadata immitation cases 2020-03-26 11:33:44 -04:00
Mahmood Ali
e37f7af811 fingerprint: handle incomplete AWS immitation APIs
Fix a regression where we accidentally started treating non-AWS
environments as AWS environments, resulting in bad networking settings.

Two factors are at play:

First, in [1], we accidentally switched the ultimate AWS test from
checking `ami-id` to `instance-id`.  This means that nomad started
treating more environments as AWS; e.g. Hetzner implements `instance-id`
but not `ami-id`.

Second, some of these environments return empty values instead of
errors!  Hetzner returns an empty 200 response for `local-ipv4`, resulting
in a bad networking configuration.

This change fixes the situation by restoring the check to `ami-id` and
ensuring that we only set network configuration when the IP address is
not empty.  Also, be more defensive around whitespace in responses.

[1] https://github.com/hashicorp/nomad/pull/6779
2020-03-26 11:23:15 -04:00
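The two checks described in commit e37f7af811 can be sketched as follows. This is a minimal illustration, not the commit's code: `metadataFunc`, `isAWS`, and `localIP` are hypothetical names, and the metadata service is replaced by a plain function so the example runs without a network.

```go
package main

import (
	"fmt"
	"strings"
)

// metadataFunc is a hypothetical stand-in for a metadata service lookup.
type metadataFunc func(key string) (string, error)

// isAWS treats the environment as AWS only when `ami-id` resolves to a
// non-empty value after trimming whitespace.
func isAWS(get metadataFunc) bool {
	v, err := get("ami-id")
	return err == nil && strings.TrimSpace(v) != ""
}

// localIP returns the trimmed `local-ipv4` value. The boolean is false when
// the lookup fails or the response is empty, in which case the caller
// should leave the network configuration untouched.
func localIP(get metadataFunc) (string, bool) {
	v, err := get("local-ipv4")
	if err != nil {
		return "", false
	}
	ip := strings.TrimSpace(v)
	return ip, ip != ""
}

func main() {
	// An imitation service: it answers `instance-id`, has no `ami-id`,
	// and returns a blank body for `local-ipv4`.
	imitation := func(key string) (string, error) {
		switch key {
		case "instance-id":
			return "12345", nil
		case "local-ipv4":
			return "  \n", nil
		}
		return "", fmt.Errorf("metadata key %q not found", key)
	}

	fmt.Println(isAWS(imitation))
	_, ok := localIP(imitation)
	fmt.Println(ok)
}
```

Both calls report false for the imitation service, so neither the AWS detection nor the network configuration would be applied in that environment.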
Mahmood Ali
500c3c2d87 tests: Add tests for EC2 Metadata immitation cases
Test that nomad doesn't set empty/bad network configuration when in an
environment that incompletely imitates the EC2 Metadata API.
2020-03-26 11:13:21 -04:00
Danielle
59eb882197 Update client/fingerprint/env_aws.go
Co-Authored-By: Mahmood Ali <mahmood@hashicorp.com>
2019-12-16 14:48:52 +01:00
Danielle Lancashire
544807ba79 env_aws: Disable Retries and set Session cfg
Previously, Nomad used hand-rolled HTTP requests to interact with the
EC2 metadata API. Recently, however, we switched to using the AWS SDK for
this fingerprinting.

The default behaviour of the AWS SDK is to perform retries with
exponential backoff when a request fails. This is problematic for Nomad,
because interacting with the EC2 API is in our client start path.

Here we revert to our pre-existing behaviour of not performing retries
in the fast path: if the metadata service is unavailable, it's likely
that nomad is not running in AWS.
2019-12-16 10:56:32 +01:00
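To illustrate why retries matter in the client start path (commit 544807ba79), here is a generic sketch of retry with exponential backoff. It does not use the AWS SDK and is not the SDK's retry implementation; `callWithRetries` and `errUnavailable` are hypothetical names. With `maxRetries` set to 0 the operation runs exactly once, which corresponds to the no-retry behaviour the commit restores.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errUnavailable simulates a metadata service that cannot be reached.
var errUnavailable = errors.New("metadata service unavailable")

// callWithRetries runs op and, on failure, retries up to maxRetries times,
// doubling the delay after each failed attempt. It returns the number of
// attempts made and the last error (nil on success).
func callWithRetries(maxRetries int, baseDelay time.Duration, op func() error) (int, error) {
	attempts := 0
	for {
		attempts++
		err := op()
		if err == nil {
			return attempts, nil
		}
		if attempts > maxRetries {
			return attempts, err
		}
		// Delay grows as 1x, 2x, 4x, ... of baseDelay.
		time.Sleep(baseDelay << (attempts - 1))
	}
}

func main() {
	unavailable := func() error { return errUnavailable }

	// No retries: a single attempt, so startup is not delayed.
	attempts, err := callWithRetries(0, time.Millisecond, unavailable)
	fmt.Println(attempts, err)

	// Three retries: four attempts plus the accumulated backoff delay.
	attempts, _ = callWithRetries(3, time.Millisecond, unavailable)
	fmt.Println(attempts)
}
```

Outside AWS every attempt fails, so each retry only adds delay before the fingerprinter gives up.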
Mahmood Ali
b55bc6443e fingerprint code refactor
Some code cleanup:

* Use a field for setting EC2 metadata instead of env-vars in testing;
but keep environment variables for backward compatibility reasons

* Update tests to use testify
2019-11-26 10:51:28 -05:00
Mahmood Ali
f24dd5bec9 fingerprint: avoid api query if config overrides it 2019-11-26 10:51:28 -05:00
Mahmood Ali
34b9ee5433 fingerprint: use ec2metadata package 2019-11-26 10:51:27 -05:00
Mahmood Ali
816f8fbb20 Revert "lint: ignore generated windows syscall wrappers"
This reverts commit 482862e6ab.
2019-10-22 08:23:44 -04:00
Michael Schurter
aed2691ca7 Remove 0.10.0-rc1 generated files 2019-10-10 13:31:42 -07:00
Mahmood Ali
97448cf33d Merge pull request #6260 from hashicorp/c-circleci-tweak-20190903
ci: ignore nested pkgs in GOTEST_PKGS_EXCLUDE
2019-09-11 11:17:10 -07:00
Danielle
b56b9c60a2 fingerprint: Add backwards compatibility comment
Co-Authored-By: Michael Schurter <mschurter@hashicorp.com>
2019-09-04 17:38:35 +02:00
Danielle Lancashire
e9a8e9ea0a fingerprint: Restore support for legacy memory fingerprint 2019-09-04 17:10:28 +02:00
Mahmood Ali
482862e6ab lint: ignore generated windows syscall wrappers 2019-09-03 10:59:58 -04:00
Danielle Lancashire
c20e604d2d client: Fix memory fingerprinting on 32bit
Also introduce regression CI for 32-bit fingerprinting
2019-08-31 18:33:59 +02:00
Lang Martin
33f550fb52 Merge pull request #5553 from hashicorp/b-fingerprinter-manual-config
client fingerprinter doesn't overwrite manual configuration
2019-04-26 12:55:34 -04:00
Lang Martin
583ae3722c client fingerprinter doesn't overwrite manual configuration
Revert "Revert accidental merge of pr #5482"
This reverts commit c45652ab8c.
2019-04-19 15:23:48 -04:00
Yorick Gersie
77a8fda87c fix nil pointer in fingerprinting AWS env leading to crash
The HTTP client returns a nil response if an error has occurred, so we
must check for an error before checking the HTTP response code.
2019-04-19 11:07:13 +02:00
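The ordering issue fixed in commit 77a8fda87c can be shown with a small sketch using Go's standard `net/http` package. The names `failingTransport`, `newFailingClient`, and `metadataAvailable` are hypothetical, and the URL is a placeholder that is never actually contacted because the transport fails every request.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// failingTransport is a hypothetical RoundTripper that fails every request,
// simulating an unreachable metadata service without using the network.
type failingTransport struct{}

func (failingTransport) RoundTrip(*http.Request) (*http.Response, error) {
	return nil, errors.New("simulated network failure")
}

// newFailingClient returns an HTTP client whose requests always error.
func newFailingClient() *http.Client {
	return &http.Client{Transport: failingTransport{}}
}

// metadataAvailable checks the error before touching the response.
func metadataAvailable(client *http.Client, url string) bool {
	resp, err := client.Get(url)
	if err != nil {
		// resp is nil here; reading resp.StatusCode would panic.
		return false
	}
	defer resp.Body.Close()
	return resp.StatusCode == http.StatusOK
}

func main() {
	ok := metadataAvailable(newFailingClient(), "http://metadata.invalid/latest/meta-data/")
	fmt.Println(ok)
}
```

Reading `resp.StatusCode` before the `err` check would dereference a nil pointer in this scenario, which is the crash the commit describes.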
Lang Martin
c45652ab8c Revert accidental merge of pr #5482
Revert "fingerprint Constraints and Affinities have Equals, as set"
This reverts commit 596f16fb5f.

Revert "client tests assert the independent handling of interface and speed"
This reverts commit 7857ac5993.

Revert "structs missed applying a style change from the review"
This reverts commit 658916e327.

Revert "client, structs comments"
This reverts commit be2838d6ba.

Revert "client fingerprint updateNetworks preserves the network configuration"
This reverts commit fc309cb430.

Revert "client_test cleanup comments from review"
This reverts commit bc0bf4efb9.

Revert "client Networks Equals is set equality"
This reverts commit f8d432345b.

Revert "struct cleanup indentation in RequestedDevice Equals"
This reverts commit f4746411ca.

Revert "struct Equals checks for identity before value checking"
This reverts commit 0767a4665e.

Revert "fix client-test, avoid hardwired platform dependecy on lo0"
This reverts commit e89dbb2ab1.

Revert "refactor error in client fingerprint to include the offending data"
This reverts commit a7fed726c6.

Revert "add client updateNodeResources to merge but preserve manual config"
This reverts commit 84bd433c7e.

Revert "refactor struts.RequestedDevice to have its own Equals"
This reverts commit 6897825240.

Revert "refactor structs.Resource.Networks to have its own Equals"
This reverts commit 49e2e6c77b.

Revert "refactor structs.Resource.Devices to have its own Equals"
This reverts commit 4ede9226bb.

Revert "add COMPAT(0.10): Remove in 0.10 notes to impl for structs.Resources"
This reverts commit 49fbaace52.

Revert "add structs.Resources Equals"
This reverts commit 8528a2a2a6.

Revert "test that fingerprint resources are updated, net not clobbered"
This reverts commit 8ee02ddd23.
2019-04-11 10:29:40 -04:00
Lang Martin
a7fed726c6 refactor error in client fingerprint to include the offending data 2019-04-11 09:56:22 -04:00
Alex Dadgar
95297c608c goimports 2019-01-22 15:44:31 -08:00
Danielle Tomlinson
b05d362eb1 fingerprinter: Use HCLogger for windows 2019-01-17 18:43:13 +01:00
Danielle Tomlinson
da48a7eab3 client: Move fingerprint structs to pkg
This removes a cyclical dependency when importing client/structs from
dependencies of the plugin_loader, specifically drivers. The cycle arises
because client/config also depends on the plugin_loader.

It also better reflects the ownership of fingerprint structs, as they
are fairly internal to the fingerprint manager.
2018-12-01 17:10:39 +01:00
Alex Dadgar
627e20801d Fix lints 2018-10-16 16:56:56 -07:00
Alex Dadgar
3c0b073513 compile on windows 2018-10-16 16:56:56 -07:00
Michael Schurter
796f0ca063 fix build errors post merges 2018-10-16 16:53:31 -07:00
Alex Dadgar
9a2c2a4f68 client uses passed logger and fix fingerprinters 2018-10-16 16:53:30 -07:00
Alex Dadgar
5e67b37aad use int64 2018-10-16 15:34:32 -07:00
Preetha Appan
3ca71ae935 Change CPU/Disk/MemoryMB to int everywhere in new resource structs 2018-10-16 16:21:42 -05:00
Alex Dadgar
e30b20e65e renames 2018-10-04 14:57:25 -07:00
Alex Dadgar
0f2f4797cb fixing tests 2018-10-04 14:26:19 -07:00
Alex Dadgar
b310a54aa6 Node resources on client 2018-09-29 17:23:41 -07:00
James Rasell
46037d70ee Merge branch 'master' into f_gh_4381 2018-06-19 17:51:57 +02:00
Alex Dadgar
ab661d2af5 lint 2018-06-13 16:06:39 -07:00
Alex Dadgar
98c7abe541 Tests only use testlog package logger 2018-06-13 15:40:56 -07:00
James Rasell
40a66756cd Add 'nomad.advertise.address' to client meta via NomadFingerPrint
This change removes the addition of the advertise address to the
exported task env vars and instead moves this work into the
NomadFingerprint.Fingerprint which adds this value to the client
attrs. This can then be used within a Nomad job like
${attr.nomad.advertise.address}.
2018-06-08 09:44:10 +02:00
Alex Dadgar
97ad9dfc92 bump version/lint/generated files 2018-06-01 15:23:10 -07:00
Alex Dadgar
ec95677a4d Add test and docs 2018-05-31 18:05:03 -07:00
Seth Vargo
ba6111e2a4 Set user-agent when talking to GCE metadata 2018-04-10 10:36:46 -04:00
Preetha Appan
30d104251b Code review feedback and unit test 2018-03-28 10:07:15 -05:00
Mildred Ki'Lya
d31105c69e Allow to specify total memory on agent configuration
Allow setting the total memory of an agent in its configuration file. This
can be used in case the automatic detection doesn't work or in specific
environments where memory overcommit (using swap for example) can be
desirable.
2018-03-27 15:46:18 -05:00
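The override behaviour described in commit d31105c69e amounts to preferring an operator-supplied value over the detected one. The sketch below assumes that a configured value of zero means "not set"; that convention and the name `effectiveMemoryMB` are assumptions for illustration, not taken from the Nomad source.

```go
package main

import "fmt"

// effectiveMemoryMB returns the operator-configured total memory when one
// is set (assumed here to mean a value greater than zero), and otherwise
// the automatically detected amount.
func effectiveMemoryMB(configuredMB, detectedMB int64) int64 {
	if configuredMB > 0 {
		return configuredMB
	}
	return detectedMB
}

func main() {
	// No override configured: use what was detected.
	fmt.Println(effectiveMemoryMB(0, 16000))

	// Override configured, e.g. to allow overcommit backed by swap.
	fmt.Println(effectiveMemoryMB(32000, 16000))
}
```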
Chelsea Holland Komlo
1570972cb3 add concept of health checks to fingerprinters and nodes
fix up feedback from code review

add driver info for all drivers to node
2018-03-21 15:15:25 -04:00
Josh Soref
a851a79407 spelling: verify 2018-03-11 19:13:32 +00:00
Josh Soref
3124128554 spelling: mount 2018-03-11 18:27:18 +00:00
Josh Soref
fb7a5d6699 spelling: interface 2018-03-11 18:15:37 +00:00