mirror of
https://github.com/kemko/nomad.git
synced 2026-01-01 16:05:42 +03:00
server: fix panic if heartbeat reset happens for GC'd node (#23383)
When setting up the timer for heartbeat invalidation, there's no control that allows us to remove that timer when the node is GC'd. If the GC window is narrow enough, it's possible to GC a node that has a waiting heartbeat timer. In this case, we hit a bug where querying for the node returns `nil` and this is incorrectly handled when checking for disconnect/reconnect state.

Fix this bug by correctly handling a `nil` node and allowing the `Node.Update` RPC to fire normally (which then errors correctly).

Fixes: https://github.com/hashicorp/nomad/issues/23376
Ref: https://hashicorp.atlassian.net/browse/NET-10109
3
.changelog/23383.txt
Normal file
@@ -0,0 +1,3 @@
```release-note:bug
server: Fixed a bug where expiring heartbeats for garbage collected nodes could panic the server
```
@@ -183,6 +183,10 @@ func (h *nodeHeartbeater) disconnectState(id string) (bool, bool) {
		h.logger.Error("error retrieving node by id", "error", err)
		return false, false
	}
	if node == nil {
		h.logger.Error("node not found", "node_id", id)
		return false, false
	}

	// Exit if the node is already down or just initializing.
	if node.Status == structs.NodeStatusDown || node.Status == structs.NodeStatusInit {
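The failure mode described in the commit message can be reproduced in isolation. The following is a minimal sketch, not Nomad's code: `node`, `nodeStore`, `nodeByID`, `gc`, `nodeIsLive` and the status constants are hypothetical stand-ins, and the only assumption carried over from the diff is that a lookup for a garbage-collected node yields a `nil` node with no error. Without the `nil` guard, reading `n.Status` would panic with a nil pointer dereference.

```go
package main

// Hypothetical stand-ins for the node status values used in the diff.
const (
	statusInit  = "initializing"
	statusReady = "ready"
	statusDown  = "down"
)

// node is a simplified stand-in for a cluster node record.
type node struct {
	ID     string
	Status string
}

// nodeStore is a toy state store. As assumed from the diff, a lookup for
// a missing node returns (nil, nil) rather than an error.
type nodeStore struct {
	nodes map[string]*node
}

// nodeByID returns the node, or nil if it is absent (for example, GC'd).
func (s *nodeStore) nodeByID(id string) (*node, error) {
	return s.nodes[id], nil
}

// gc removes a node, simulating garbage collection that races with a
// still-pending heartbeat timer.
func (s *nodeStore) gc(id string) {
	delete(s.nodes, id)
}

// nodeIsLive is what a heartbeat-timer callback might ask before acting.
// It reports false when the lookup fails, the node is gone, or the node
// is down or still initializing.
func nodeIsLive(s *nodeStore, id string) bool {
	n, err := s.nodeByID(id)
	if err != nil {
		return false
	}
	if n == nil {
		// The guard: without it, n.Status below would dereference nil.
		return false
	}
	if n.Status == statusDown || n.Status == statusInit {
		return false
	}
	return true
}
```

In this sketch, calling `nodeIsLive` for a node ID after `gc` has removed it returns `false` instead of panicking, which matches the behaviour the added `node == nil` check aims for.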