CA: refactor ClusterSnapshot methods #7466
base: master
Conversation
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: towca. The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files. Approvers can indicate their approval by writing `/approve` in a comment.
Force-pushed from c7d18df to e0d1e60

/cc
```diff
 			pods = append(pods, podInfo.Pod)
 		}
-		err := a.ClusterSnapshot.AddNodeWithPods(upcomingNode.Node(), pods)
+		err := a.ClusterSnapshot.AddNodeInfo(upcomingNode)
```
this would be a good opportunity to rename vars as follows: `upcomingNodes` --> `upcomingNodeInfos`, `upcomingNode` --> `upcomingNodeInfo`
Done
```go
	knownNodes := make(map[string]bool)
	for _, node := range nodes {
		if err := snapshot.AddNode(node); err != nil {
```
The only error condition for adding a node is if your `[]*apiv1.Node` set has a duplicate. I wonder if there's a more efficient way of doing that targeted error handling earlier in the execution flow, so we don't have to do so much error handling at this point. It would also have the side benefit of allowing us to ditch the `knownNodes` accounting overhead in this function.
IMO this is the perfect place to validate this:

- The alternative is for `Initialize()` to assume that some validation happened earlier and that its input is correct. This doesn't seem safe, as it relies on every `Initialize()` user properly validating the data first.
- There are multiple places that call `Initialize()`, so ideally we'd want to extract the validation logic to remove redundancy anyway. If we're extracting it outside of `Initialize()`, we essentially have 2 functions that always need to be called in sequence.

Keep in mind that this should be called once per snapshot per loop, so the `knownNodes` overhead should be trivial compared to the rest of the loop.
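The validation pattern under discussion can be sketched with simplified stand-in types (the real code iterates over `[]*apiv1.Node` and calls `snapshot.AddNode`; this is just an illustration of the duplicate check itself):

```go
package main

import "fmt"

// Node is a simplified stand-in for *apiv1.Node.
type Node struct{ Name string }

// initialize mimics the check discussed above: the only error condition
// when adding nodes is a duplicate name, so a single pass over the input
// with a set of known names is enough to validate it.
func initialize(nodes []Node) (map[string]bool, error) {
	knownNodes := make(map[string]bool, len(nodes))
	for _, node := range nodes {
		if knownNodes[node.Name] {
			return nil, fmt.Errorf("node %s already in snapshot", node.Name)
		}
		knownNodes[node.Name] = true
	}
	return knownNodes, nil
}

func main() {
	_, err := initialize([]Node{{"a"}, {"b"}})
	fmt.Println(err) // no duplicates -> <nil>
	_, err = initialize([]Node{{"a"}, {"a"}})
	fmt.Println(err != nil) // duplicate -> true
}
```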
```diff
@@ -41,8 +41,6 @@ type ClusterSnapshot interface {
 	AddPod(pod *apiv1.Pod, nodeName string) error
 	// RemovePod removes pod from the snapshot.
 	RemovePod(namespace string, podName string, nodeName string) error
-	// IsPVCUsedByPods returns if the pvc is used by any pod, key = <namespace>/<pvc_name>
-	IsPVCUsedByPods(key string) bool
```
There appear to be some other implementations of this interface and usages across the codebase (e.g., `cluster-autoscaler/simulator/clustersnapshot/basic.go`). Sorry if those are in another commit that I didn't see! But I think we'll need to clean this up everywhere.
This PR adapts both `ClusterSnapshot` implementations:

- basic: https://github.com/kubernetes/autoscaler/pull/7466/files#diff-29296088fba933fa837b2efdce93b2423b19247d99f59f1d8ef1bc8d5e8c6915
- delta: https://github.com/kubernetes/autoscaler/pull/7466/files#diff-e54b47f2d5a1526b9cde451164b5ab44aadd9f5a3ec5f93017ba4cd3a0faba05

Is there something missing?
Why do we no longer need this? IIRC this was never directly used in CA, but we needed to be able to satisfy the scheduler NodeInfo lister interface. Is that no longer the case? I feel like I'm missing some important part of CA/scheduler integration
```diff
@@ -164,7 +164,7 @@ func (data *internalBasicSnapshotData) removeNode(nodeName string) error {
 	return nil
 }

-func (data *internalBasicSnapshotData) addPod(pod *apiv1.Pod, nodeName string) error {
+func (data *internalBasicSnapshotData) schedulePod(pod *apiv1.Pod, nodeName string) error {
```
are these name changes really meaningful, given that we are simply wrapping the k/k scheduler's NodeInfo methods which will have the existing names?
Yeah, the point is that when we introduce the DRA logic this will no longer be just a wrapper around `schedulerframework.NodeInfo.AddPod`. There will be additional DRA processing, as well as interaction with the scheduler framework plugins.

I just want to get the interface names changed in one go to minimize conflicts later.
Force-pushed from e0d1e60 to 4a7d702
…ddNodeInfo

We need AddNodeInfo in order to propagate DRA objects through the snapshot, which makes AddNodeWithPods redundant.

AddNodes() is redundant - it was intended for batch-adding nodes, probably with batch-specific optimizations in mind. However, it has always been implemented as just iterating over AddNode(), and is only used in test code. Most of the uses in the test code were initialization - they are replaced with Initialize(), which will later be needed for handling DRA anyway. The other uses are replaced with inline loops over AddNode().

IsPVCUsedByPods() is redundant - the same information is already accessible via StorageInfos().

AddNodeInfo already provides the same functionality, and has to be used in production code in order to propagate DRA objects correctly. Uses in production are replaced with Initialize(), which will later take DRA objects into account. Uses in the test code are replaced with AddNodeInfo().

AddPod is renamed to SchedulePod, and RemovePod to UnschedulePod. This makes more sense in the DRA world: for DRA we're not only adding/removing the pod, but also modifying its ResourceClaims - without adding/removing them (the ResourceClaims need to be tracked even for pods that aren't scheduled). RemoveNode is renamed to RemoveNodeInfo for consistency with the other NodeInfo methods.
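The renamed surface described above can be sketched with simplified stand-in types (the real interface uses `*apiv1.Pod`, `*apiv1.Node` and `schedulerframework.NodeInfo`; the toy map-backed implementation here is purely illustrative):

```go
package main

import "fmt"

// Simplified stand-in types for the real Kubernetes API objects.
type Pod struct{ Namespace, Name, NodeName string }

type NodeInfo struct {
	Name string
	Pods []Pod
}

// ClusterSnapshot sketches the method surface after this refactor:
// AddPod -> SchedulePod, RemovePod -> UnschedulePod,
// RemoveNode -> RemoveNodeInfo, plus the new Initialize entry point.
type ClusterSnapshot interface {
	Initialize(nodes []string, scheduledPods []Pod) error
	AddNodeInfo(info NodeInfo) error
	RemoveNodeInfo(nodeName string) error
	SchedulePod(pod Pod, nodeName string) error
	UnschedulePod(namespace, podName, nodeName string) error
}

// basicSnapshot is a toy map-backed implementation for illustration.
type basicSnapshot struct{ nodes map[string]*NodeInfo }

// Compile-time check that basicSnapshot satisfies the interface.
var _ ClusterSnapshot = (*basicSnapshot)(nil)

func (s *basicSnapshot) Initialize(nodes []string, scheduledPods []Pod) error {
	s.nodes = map[string]*NodeInfo{}
	for _, name := range nodes {
		if err := s.AddNodeInfo(NodeInfo{Name: name}); err != nil {
			return err
		}
	}
	for _, pod := range scheduledPods {
		if err := s.SchedulePod(pod, pod.NodeName); err != nil {
			return err
		}
	}
	return nil
}

func (s *basicSnapshot) AddNodeInfo(info NodeInfo) error {
	if _, ok := s.nodes[info.Name]; ok {
		return fmt.Errorf("node %s already in snapshot", info.Name)
	}
	s.nodes[info.Name] = &info
	return nil
}

func (s *basicSnapshot) RemoveNodeInfo(nodeName string) error {
	if _, ok := s.nodes[nodeName]; !ok {
		return fmt.Errorf("node %s not in snapshot", nodeName)
	}
	delete(s.nodes, nodeName)
	return nil
}

func (s *basicSnapshot) SchedulePod(pod Pod, nodeName string) error {
	info, ok := s.nodes[nodeName]
	if !ok {
		return fmt.Errorf("node %s not in snapshot", nodeName)
	}
	info.Pods = append(info.Pods, pod)
	return nil
}

func (s *basicSnapshot) UnschedulePod(namespace, podName, nodeName string) error {
	info, ok := s.nodes[nodeName]
	if !ok {
		return fmt.Errorf("node %s not in snapshot", nodeName)
	}
	for i, p := range info.Pods {
		if p.Namespace == namespace && p.Name == podName {
			info.Pods = append(info.Pods[:i], info.Pods[i+1:]...)
			return nil
		}
	}
	return fmt.Errorf("pod %s/%s not on node %s", namespace, podName, nodeName)
}

func main() {
	snap := &basicSnapshot{}
	if err := snap.Initialize([]string{"n1"}, []Pod{{"default", "p1", "n1"}}); err != nil {
		panic(err)
	}
	fmt.Println(len(snap.nodes["n1"].Pods)) // prints 1
}
```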
Force-pushed from 4a7d702 to 3556f27
/hold
```go
	Initialize(nodes []*apiv1.Node, scheduledPods []*apiv1.Pod) error

	// SchedulePod schedules the given Pod onto the Node with the given nodeName inside the snapshot.
	SchedulePod(pod *apiv1.Pod, nodeName string) error
```
I wonder if in the DRA world we need a separate method to schedule an existing pod (this one) and one to inject a completely new, in-memory pod? Basically - do we want separate methods for "create pod" and "schedule pod"?
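One hypothetical way to draw that distinction, purely illustrative (neither `CreatePod` nor the unscheduled-pod tracking below exists in this PR): `CreatePod` would register a brand-new in-memory pod so its objects can be tracked before any binding, while `SchedulePod` would only bind an already-tracked pod to a node.

```go
package main

import "fmt"

// Pod is a simplified stand-in for *apiv1.Pod.
type Pod struct{ Namespace, Name string }

// snapshot is a toy model of the split discussed above.
type snapshot struct {
	unscheduled map[string]Pod   // pods tracked but not yet bound to a node
	scheduled   map[string][]Pod // nodeName -> pods bound to that node
}

// CreatePod injects a completely new in-memory pod into the snapshot
// without binding it to any node.
func (s *snapshot) CreatePod(pod Pod) {
	s.unscheduled[pod.Namespace+"/"+pod.Name] = pod
}

// SchedulePod binds an existing (already-created) pod to a node;
// scheduling an untracked pod is an error.
func (s *snapshot) SchedulePod(namespace, name, nodeName string) error {
	key := namespace + "/" + name
	pod, ok := s.unscheduled[key]
	if !ok {
		return fmt.Errorf("pod %s not tracked in snapshot", key)
	}
	delete(s.unscheduled, key)
	s.scheduled[nodeName] = append(s.scheduled[nodeName], pod)
	return nil
}

func main() {
	s := &snapshot{unscheduled: map[string]Pod{}, scheduled: map[string][]Pod{}}
	s.CreatePod(Pod{"default", "p1"})
	if err := s.SchedulePod("default", "p1", "n1"); err != nil {
		panic(err)
	}
	fmt.Println(len(s.scheduled["n1"])) // prints 1
}
```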
What type of PR is this?

/kind cleanup

What this PR does / why we need it:

This is a part of Dynamic Resource Allocation (DRA) support in Cluster Autoscaler. The `ClusterSnapshot` interface is cleaned up to facilitate later changes needed for DRA:

- Redundant node-adding methods (e.g. `AddNodeWithPods`, `AddNodes`) were duplicated across `ClusterSnapshot` implementations for no clear reason. Instead of adding DRA handling to all these methods, they're replaced with `AddNodeInfo`, which is DRA-aware already.
- `RemoveNode` is renamed to `RemoveNodeInfo` for consistency with `AddNodeInfo`.
- `AddPod` and `RemovePod` are renamed to `SchedulePod` and `UnschedulePod`. These names are more in line with the method behavior when DRA is considered (a pod is not "removed" from the snapshot altogether, since we have to keep tracking its DRA objects).
- An `Initialize` method is added. All other methods were Node- or Pod-specific, while for DRA the snapshot will also need to track DRA objects that are not bound to Nodes or Pods. `Initialize()` will be used to set these "global" DRA objects in later commits.

Which issue(s) this PR fixes:

The CA/DRA integration is tracked in kubernetes/kubernetes#118612; this is just part of the implementation.

Special notes for your reviewer:

This is intended to be a no-op refactor. It was extracted from #7350 after #7447.

Does this PR introduce a user-facing change?

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

/assign @MaciekPytel
/assign @jackfrancis