Ephemeral Control API #5136
Replies: 5 comments 6 replies
-
IMHO, these solutions don't fit:
While I think we should focus on these:

K8S API Aggregation
I honestly didn't know about this feature of K8S at all; from a quick look it seems somewhat similar to how admission webhooks work... Even if our controllers don't support it, maybe we can reuse some code from the webhooks to implement it?! It's definitely worth investigating. Did you already do some investigation around it? If we could have an endpoint that is already connected and authenticated following the K8S flow and service accounts, and we just need to care about implementing the API, then I think we can achieve a great result, like:
But again, I need to read a bit more about it to understand if I'm just dreaming 😄

New CRDs
What I like here is:
I think the trick here is to create a CRD which has the same semantics as a Job: once created, you can't remove it until it's done, nor can you mutate it. Because we can make these "command" CRDs implementation-specific, this also avoids modifying the
If I have to choose between these 2 approaches, I would prefer the K8S API Aggregation, but again, we need to understand its feasibility...
-
Some key info around aggregated APIs:
-
Interesting K8S Issue on the topic (with a final comment from @n3wscott ; )
-
I am not sure the Aggregation API is feasible, because of what Paul said. Since we talk about CRDs... Did we think about a specific
-
After spending some time researching the K8S Aggregation API, I don't think it is the right approach, at least for the current KafkaChannel Subscription Replay use case. Everything I've seen (K8S docs, apiserver-builder-alpha, Sample API Server) indicates we would be going against the grain, using this in a manner for which it was not intended. The Aggregation API is focused on allowing advanced or alternate handling of Custom Resources above and beyond what the standard CRD mechanism allows. It is still, though, focused on managing the CRUD lifecycle of a Custom Resource, whereas what we are trying to do is expose some additional "functionality" that is "related" to a Custom Resource. The closest paradigm that might work would be to add a sub-resource for our "functionality" to the Subscription Custom Resource. There are several problems with this approach, including:
Additionally, the concerns provided above by Paul about CRDs and Aggregated APIs in the same API Group, potential support from cloud providers, etc. are major impediments. Also, supporting the Aggregation API requires significantly more effort (initial and ongoing) than CRDs and is probably not something to be taken lightly. If we instead choose to create a new "Request" or "Command" CRD for the Replay feature, then there's no longer a need to use the Aggregation API over the simpler CRD-based approach. I'm thinking this is the most "standard" way of handling this in K8S and we should choose this approach?
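To make the "Request" / "Command" CRD idea concrete, here is a rough sketch of what an instance of such a resource could look like. The kind, group, and field names below (`SubscriptionReplay`, `kafka.eventing.knative.dev`, `offsetTime`, etc.) are purely hypothetical illustrations, not an agreed design:

```yaml
# Hypothetical example only: names and fields are illustrative, not a proposal.
apiVersion: kafka.eventing.knative.dev/v1alpha1
kind: SubscriptionReplay            # one-shot "command" resource, Job-like semantics
metadata:
  name: replay-my-subscription
  namespace: default
spec:
  channelRef:                       # the KafkaChannel whose Subscription is replayed
    name: my-kafka-channel
  subscriptionRef:
    name: my-subscription
  offsetTime: "2021-06-01T00:00:00Z"  # reposition consumer offsets to this timestamp
status:
  state: Pending                    # e.g. Pending -> Completed; never re-executed
```

A reconciler would perform the offset reset exactly once and record the result in `status`, which sidesteps the "stale declarative config" problem described in this discussion.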
-
Ephemeral Control API
The standard Kubernetes & Knative APIs use a declarative configuration (YAML) which can be a bad fit for certain ephemeral control operations. This discussion aims to standardize an approach for handling such use cases in order to achieve consistency across the Knative landscape.
The ephemeral control operation in question is one which has an immediate one-time impact on the system, for which specifying some static configuration would be out-of-date after the operation has taken effect.
A specific use case is driving the need for this capability and will be used as an example in this discussion, but resolving the larger issue of how to handle such operations is the actual goal.
Example Use Case
Kafka Topics are persisted which allows subscribers to consume prior messages. They are also partitioned for scalability reasons, and the messages in these partitions are ordered. Individual Consumers of a partition track their own "offset" into the partition. This position tracking allows for the prevention of message loss during restarts, etc. If a subscriber experiences unexpected downtime it can reposition its offset back to an earlier time in order to recover what would otherwise be lost messages.
The eventing-kafka KafkaChannel implementation would like to expose this ability to end users so that they can perform the recovery described above, as detailed in Issue #477. The design for implementing this capability is well understood, except for "HOW" to expose this capability to the end user. Meaning, what is the API for them to request a re-position of the offset?
The operation of re-positioning the offsets is an ephemeral, one-time (per usage) change to the system, which would be awkward to specify in static configuration YAML. Once the new offset (or offset timestamp) was specified in configuration and the offset adjusted, the value would no longer be valid/relevant. In fact, if it was not removed or flagged as "completed" or "processed", it could cause an unintended additional re-positioning upon Pod restart or re-reconciliation.
Exposing a REST API on the KafkaChannel Controller which allows users to reposition their subscriptions is straightforward enough, but it would set a new precedent as to how such operations are handled, which might not be desirable. Therefore, I'd like to brainstorm options for handling such requests and hopefully settle on an approach.
Options (Brainstorming ; )
Controller REST APIs: Simply allow Controllers (and/or other Deployments?) to expose a REST API as desired. We could define some specifications around the naming/pathing of operations. This differs from all other K8S / Knative configuration.
K8S API Aggregation: Kubernetes exposes the ability to extend the core K8S api with custom endpoints via the APIService resource. The K8S Docs describe the complicated auth setup required for proxying requests. Not sure if this works with existing CRD Controllers?
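For reference, registering an aggregated API is done with an `APIService` resource, which tells the kube-apiserver to proxy an entire group/version to an in-cluster Service. A minimal sketch, where the group name and backing Service are hypothetical placeholders:

```yaml
# Sketch of an APIService registration; the group and service names are made up.
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1alpha1.control.knative.dev
spec:
  group: control.knative.dev        # requests for this group/version...
  version: v1alpha1
  service:                          # ...are proxied by kube-apiserver to this Service
    name: kafkachannel-control-api
    namespace: knative-eventing
    port: 443
  groupPriorityMinimum: 100
  versionPriority: 100
  # caBundle: <base64 CA used to verify the backend's serving certificate>
```

The backend Service must then implement the discovery and resource endpoints itself, which is part of the extra implementation burden discussed above.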
New CRDs: New Custom Resources could be created to capture the desire for such operations. This will require new Controllers to reconcile the CRDs. The Controllers would have to work together with the existing Controllers of the actual Custom Resources being managed (e.g. KafkaChannels), which might be awkward. These CRDs would have their own state (e.g. "Pending" or "Completed"). Questions remain around their overall lifecycle once completed - are they ever removed?
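A sketch of what defining such a "command" CRD might look like; the group, kind, and field names are hypothetical placeholders, and the status subresource is where the reconciler would record "Pending"/"Completed":

```yaml
# Hypothetical CRD definition for a one-shot "command" resource.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: subscriptionreplays.kafka.eventing.knative.dev
spec:
  group: kafka.eventing.knative.dev
  scope: Namespaced
  names:
    kind: SubscriptionReplay
    plural: subscriptionreplays
    singular: subscriptionreplay
  versions:
    - name: v1alpha1
      served: true
      storage: true
      subresources:
        status: {}                  # controller records Pending/Completed here
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                subscriptionRef:
                  type: object
                  properties:
                    name: { type: string }
                offsetTime:
                  type: string      # timestamp to reposition the offsets to
```

Job-like immutability (no spec mutation after creation) would have to be enforced separately, e.g. via a validating admission webhook.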
Treat As State: It might be possible to keep such configuration in the declarative YAML by treating it as State and/or History information. The contextual problem of the "one-time operation" must be handled to prevent unintended repetitive execution of the operation.
Kubernetes Jobs: Something along the lines of K8S Jobs might be a possibility. This is similar to the New CRDs approach, but instead of a Controller it is a new Deployment. This is heavy-weight, requiring new build images for distinct operations, etc.
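As a rough illustration of the Jobs-based option, the operation could be packaged as a container image and run once per request; the image name and arguments below are entirely hypothetical:

```yaml
# Hypothetical Job wrapping a one-shot offset-reset tool; image/args are made up.
apiVersion: batch/v1
kind: Job
metadata:
  name: replay-my-subscription
spec:
  backoffLimit: 0                  # don't blindly retry a partially-applied reset
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: offset-reset
          image: example.com/kafka-offset-reset:latest   # hypothetical image
          args:
            - "--channel=my-kafka-channel"
            - "--subscription=my-subscription"
            - "--to-datetime=2021-06-01T00:00:00Z"
```

This gets "run once and done" semantics for free from the Job controller, at the cost of building and maintaining a dedicated image per operation.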
Annotations: Some minimal standards around naming of light-weight annotations. These annotations would have to be removed/flagged, similar to the other solutions, to prevent unintended re-execution. This would probably require new Controllers as well, since the resource in question might not already have one (e.g. KafkaChannel replay would have to watch Subscriptions, since that is the granularity at which the annotations would be needed).
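For the annotations option, a user might patch the Subscription with a one-shot annotation that a controller watches, acts on, and then clears; the annotation key and value format here are hypothetical:

```yaml
# Hypothetical annotation on a Subscription; the key/value format is made up.
apiVersion: messaging.knative.dev/v1
kind: Subscription
metadata:
  name: my-subscription
  annotations:
    # a controller would act on this once, then remove it or mark it processed
    kafka.eventing.knative.dev/replay-offset-time: "2021-06-01T00:00:00Z"
spec:
  channel:
    apiVersion: messaging.knative.dev/v1beta1
    kind: KafkaChannel
    name: my-kafka-channel
  subscriber:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: my-consumer
```

Note the race this implies: the "command" lives in mutable metadata, so the remove-after-processing step is what prevents re-execution on the next reconcile.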
Other Ideas?
Additional Considerations
Resources