fix data race in node upsert (#24127)

While testing with agents built with the race-detection option enabled, I
encountered a data race while draining a node.

When we upsert a node, we copy the `NodeResources` struct and then perform a
fixup for backwards compatibility of the topology struct. This fixup was being
executed on the original struct and not the copy, so the copy never received
the fixup and we were writing to the original instead, corrupting the state
store in the process (albeit harmlessly, I suspect).

Fix the data race by calling the method on the correct pointer.
Tim Gross
2024-10-04 08:41:14 -04:00
committed by GitHub
parent 1c76dd9c1c
commit 7531b7a62f
2 changed files with 7 additions and 4 deletions

.changelog/24127.txt

@@ -0,0 +1,3 @@
```release-note:bug
state: Fixed a bug where compatibility updates for node topology for nodes older than 1.7.0 were not being correctly applied
```


@@ -2251,7 +2251,7 @@ func (n *Node) Canonicalize() {
n.SchedulingEligibility = NodeSchedulingEligible
}
-// COMPAT remove in 1.9+
+// COMPAT remove in 1.10+
// In v1.7 we introduce Topology into the NodeResources struct which the client
// will fingerprint. Since the upgrade path must cover servers that get upgraded
// before clients which will send the old struct, we synthesize a pseudo topology
@@ -3262,9 +3262,9 @@ func (n *NodeResources) Copy() *NodeResources {
}
}
-// COMPAT remove in 1.9+
+// COMPAT remove in 1.10+
// apply compatibility fixups covering node topology
-n.Compatibility()
+newN.Compatibility()
return newN
return newN
}
@@ -3326,7 +3326,7 @@ func (n *NodeResources) Merge(o *NodeResources) {
}
}
-// COMPAT remove in 1.9+
+// COMPAT remove in 1.10+
// apply compatibility fixups covering node topology
n.Compatibility()
}
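
The wrong-pointer bug described in the commit message can be reproduced in miniature. The sketch below is illustrative only: `Topology`, the `CPUCores` field, and `CopyBuggy` are simplified, hypothetical stand-ins and not the real Nomad definitions. It assumes only what the message states, namely that `Compatibility()` mutates its receiver and that `Copy()` is expected to leave the original untouched.

```go
package main

import "fmt"

// Topology is a hypothetical, simplified stand-in for the real topology struct.
type Topology struct {
	Cores int
}

// NodeResources is a hypothetical, simplified stand-in for the real struct.
type NodeResources struct {
	CPUCores int       // assumed legacy field sent by older clients
	Topology *Topology // nil when the client predates topology fingerprinting
}

// Compatibility synthesizes a Topology if one is missing. It mutates its receiver.
func (n *NodeResources) Compatibility() {
	if n.Topology == nil {
		n.Topology = &Topology{Cores: n.CPUCores}
	}
}

// CopyBuggy mirrors the pre-fix behavior: the fixup runs on the receiver,
// so the original is written to and the copy is returned without the fixup.
func (n *NodeResources) CopyBuggy() *NodeResources {
	newN := new(NodeResources)
	*newN = *n
	if n.Topology != nil {
		t := *n.Topology
		newN.Topology = &t
	}
	n.Compatibility() // wrong pointer: writes to the shared original
	return newN
}

// Copy mirrors the post-fix behavior: the fixup runs on the copy only.
func (n *NodeResources) Copy() *NodeResources {
	newN := new(NodeResources)
	*newN = *n
	if n.Topology != nil {
		t := *n.Topology
		newN.Topology = &t
	}
	newN.Compatibility() // correct pointer: only the private copy is written
	return newN
}

func main() {
	orig := &NodeResources{CPUCores: 4}
	c := orig.CopyBuggy()
	fmt.Println(orig.Topology != nil, c.Topology != nil) // true false

	orig2 := &NodeResources{CPUCores: 4}
	c2 := orig2.Copy()
	fmt.Println(orig2.Topology != nil, c2.Topology != nil) // false true
}
```

In the buggy variant, any goroutine reading the original while another calls `CopyBuggy` can observe the write, which is the kind of access the race detector reports. `Merge` still calls `n.Compatibility()` on its receiver because, unlike `Copy`, it is meant to modify `n`.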