Replies: 14 comments 81 replies
-
@ncdc @joelanford please share with the team
-
Was this supposed to be here? https://github.com/operator-framework/operator-controller/discussions/140
-
I think https://github.com/operator-framework/operator-controller/discussions/140 is related (it also talks about multi-tenant clusters), yet still different, as it is focused on a subset of use cases. Specifically, in this ticket we are not focused on namespaced CatalogSources at all; as you know, we don't use namespaced CatalogSources to build OLMv0 multi-tenant deployments. In this ticket I want to elaborate on how we could move forward from the operator-controller and operand modelling perspective (API Bundles, RBAC management, how tenant versions are controlled, etc.).
-
Reading through the Kubernetes multi-tenancy doc, a few things stand out:
IMO, we're fighting an uphill battle with any architecture that attempts to provide the illusion of multi-tenancy when, in fact, there is none. I think a better way to describe the proposal here would be "coordination of cluster-scoped APIs for multiple tenants sharing a control plane". This is control plane single-tenancy with a coordination layer. How does that strike folks? Does that framing seem more or less correct?
To me, it's about more than just technical possibilities and the API/controller split. It's about the direction of the Kubernetes community, the expectations of users, the assumptions of operator authors, the interactions tenants have with control planes, the interactions control planes have with tenants, and the larger set of use cases we're trying to solve in OLMv1. I'm honestly somewhat worried that we would build something to solve the coordination problem, but then this scenario happens:
But if that's the eventual outcome, is all of this extra complexity we place on ourselves and our users worth it?
-
I've been thinking about this some more, and I have some ideas for discussion. As Joe mentioned, Kubernetes does not afford true multi-tenancy. Anything we do today must recognize that. If we want a tenant to be entirely self-sufficient (namely, they never need to ask a cluster admin for help installing an operator into the tenant's namespace), I can come up with these options:
Regardless of what option(s) we offer, I do think we can probably find ways for multiple Operators to co-own CRDs. As long as developers don't make breaking API changes in newer releases, OLM can orchestrate when to apply CRD updates as different operator versions are installed.
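A minimal sketch of what that orchestration could look like, assuming each co-owner's CRD manifest carries its bundle version in an annotation (the annotation name and the whole mechanism are hypothetical, not an existing OLM convention):

```go
// Sketch only: one way OLM could pick which co-owned CRD manifest to apply.
package crdownership

import (
	"fmt"

	"github.com/Masterminds/semver/v3"
	apiextensionsv1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1"
)

// bundleVersionAnnotation is a made-up annotation carrying the semver of the
// bundle that shipped a given CRD manifest; it is NOT an existing OLM convention.
const bundleVersionAnnotation = "olm.example/bundle-version"

// pickCRDToApply returns the CRD manifest shipped by the highest bundle version
// among all installed co-owners, i.e. the one OLM would apply as long as newer
// bundles never make breaking API changes.
func pickCRDToApply(candidates []*apiextensionsv1.CustomResourceDefinition) (*apiextensionsv1.CustomResourceDefinition, error) {
	var best *apiextensionsv1.CustomResourceDefinition
	var bestVersion *semver.Version
	for _, crd := range candidates {
		raw, ok := crd.Annotations[bundleVersionAnnotation]
		if !ok {
			return nil, fmt.Errorf("CRD %s is missing the %s annotation", crd.Name, bundleVersionAnnotation)
		}
		v, err := semver.NewVersion(raw)
		if err != nil {
			return nil, fmt.Errorf("CRD %s: invalid version %q: %w", crd.Name, raw, err)
		}
		if best == nil || v.GreaterThan(bestVersion) {
			best, bestVersion = crd, v
		}
	}
	if best == nil {
		return nil, fmt.Errorf("no candidate CRD manifests provided")
	}
	return best, nil
}
```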
-
FYI, a few of us have been brainstorming how to improve namespace-scoped operators, try to do multi-tenancy as safely as possible, etc. The working doc is https://docs.google.com/document/d/1xTu7XadmqD61imJisjnP9A6k38_fiZQ8ThvZSDYszog/edit.
-
After ruminating on this over the holidays, I don't believe that trying to program logic for "multi-tenancy" is feasible. I am not saying, however, that multi-tenancy with OLM is not possible. If you know what you're doing and you want it, you can have it. Please continue reading for much more detail.
We have been using the term "multi-tenancy" to mean a few things. In particular, these are the use cases that are primarily associated with it:
Let's start with 1. OLM v0 makes this possible because it has full cluster-admin permissions. OLM acts as a deputy, using its super powers to perform operations that a non-admin cannot do. This includes installing CRDs, creating ClusterRoles and ClusterRoleBindings, and anything else that is typically more privileged than what a namespace admin is allowed to perform.

If we want to design a secure system that doesn't grant privilege escalations via a deputized cluster-admin, we have to start by removing OLM's cluster-admin permissions. Following the principle of least privilege, we strip OLM's permissions down to strictly what it requires; namely, reconciling ClusterExtensions (read + update status, generally). Any action OLM performs at a user's request (e.g., install an operator by way of a ClusterExtension) must be done using credentials the user has access to. This is where Kubernetes already has an established pattern to use: the service account. Anyone who can create a pod implicitly has access to all the service accounts in that namespace. We should do the same with OLM: anyone who has permission to create an Extension in a namespace must designate a service account on the Extension, and OLM must use that service account when creating/updating/deleting manifests associated with that Extension.

This necessarily means that someone, or something, must grant all the permissions needed to manage the lifecycle of the operator to the user creating the Extension (or create a service account with those permissions). This is the way to keep OLM secure by default. Those wanting a slightly less secure-by-default experience could conceivably write a controller that automates a deputization process and eliminates the need to ask a cluster-admin to either do the install or grant the permissions. In other words, there could be an automation that creates the appropriate RBAC permissions and binds them to a service account whenever an Extension is created. You'd probably also want to have some sort of configuration around policy: which users can create Extensions all the time, which ones need review + approval, etc.

Now on to 2 and 3 (install/upgrade multiple copies of an operator). The API space in Kubernetes is global and singular: it is not possible to install multiple copies of the same API simultaneously. It is not possible to serve one set of APIs to one user and a different set of APIs to another user. It is not possible to serve one "form" of an API to one user and a variation of that form to a different user. These truths apply to both built-in APIs and APIs that are added to Kubernetes via CRDs and aggregated API servers.

We debated multiple "creative" solutions to work around these truths. One of these was the idea of trying to manage shared ownership of CRDs. Assuming all the appropriate permissions are in place, whenever an additional copy of an Extension is created, OLM could determine the "latest" version of the manifest for a CRD and apply that. Unfortunately, this is impossible for multiple reasons:
Based on the findings above, I’d like to propose the following paths forward:
-
The joining of CRDs and Controllers into one package is one of the key things that has made managing operators difficult. My take is that we should be trying to unwind that, and ask users to install APIs and Controllers separately, every time. If a tenant on a cluster wants to install an "Operator", first they need to use admin permissions to install the API package (or get someone to do it), then they can install the controller package themselves for their own namespace.
We could generate the two packages quite easily in existing projects by updating the operator-sdk to generate them for the catalog. An administrative end user could still just ask to install a Controller and have everything else done. A tenant would get an error about the API package not being installed. OLM's job would then be to manage the dependencies between them.
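To illustrate the dependency handling this implies, here is a rough sketch; every type and field name below is hypothetical, not an existing OLM or operator-sdk API:

```go
// Sketch only: hypothetical catalog metadata for shipping APIs (CRDs) and
// controllers as separate packages, plus the dependency check OLM would run.
package apisplit

import "fmt"

type PackageKind string

const (
	APIPackage        PackageKind = "api"        // CRDs + conversion webhooks only, cluster-scoped install
	ControllerPackage PackageKind = "controller" // Deployment, RBAC, etc., installable per tenant namespace
)

type Package struct {
	Name     string
	Kind     PackageKind
	Requires []string // names of API packages this controller needs
}

// checkInstallable reports whether a controller package can be installed given
// the packages already present on the cluster. A tenant asking for a controller
// whose API package is missing gets an actionable error rather than a silent failure.
func checkInstallable(pkg Package, installed map[string]Package) error {
	if pkg.Kind != ControllerPackage {
		return nil
	}
	for _, dep := range pkg.Requires {
		if _, ok := installed[dep]; !ok {
			return fmt.Errorf("controller package %q requires API package %q, which is not installed; ask a cluster admin to install it first", pkg.Name, dep)
		}
	}
	return nil
}
```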
-
CRD upgrade safety checks belong in the operator-sdk api build and scorecard; we should try to catch these issues before they ever hit the cluster. Having a second layer of defense on-cluster isn't a bad thing, though.
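A minimal sketch of one such check, using the upstream apiextensions types; it only covers dropped served/stored versions, and a real implementation would also diff schemas for removed fields and type changes:

```go
// Sketch only: a pre-flight CRD upgrade safety check of the kind that could run
// in operator-sdk / scorecard at build time and again on-cluster as a second
// layer of defense.
package crdcheck

import (
	"fmt"

	apiextensionsv1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1"
)

// checkUpgrade returns the problems found when replacing oldCRD with newCRD.
func checkUpgrade(oldCRD, newCRD *apiextensionsv1.CustomResourceDefinition) []error {
	var problems []error

	newVersions := map[string]bool{}
	for _, v := range newCRD.Spec.Versions {
		newVersions[v.Name] = true
	}

	// Dropping a version still listed in status.storedVersions strands any
	// objects persisted at that version in etcd.
	for _, stored := range oldCRD.Status.StoredVersions {
		if !newVersions[stored] {
			problems = append(problems, fmt.Errorf("stored version %q was removed; existing objects would become unreadable", stored))
		}
	}

	// Dropping a served version breaks every client still talking to it.
	for _, v := range oldCRD.Spec.Versions {
		if v.Served && !newVersions[v.Name] {
			problems = append(problems, fmt.Errorf("served version %q was removed", v.Name))
		}
	}
	return problems
}
```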
-
Let's discuss more the concept of API packages being separated. I think this might be a key to splitting the issues into 'smaller' problems, or at least different approaches might be defined for each sub-problem... Why can't we say that API packages MUST follow semver, otherwise they won't be applied to a cluster? And then, have a process at deploy time doing the CRD evolution checking. From there, there needs to be a discussion about mutating and validating webhooks, whose usage we could perhaps reduce in favor of doing similar logic in the application layer (controller code). For validation, that would be the .status update approach.
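For the "validation in the controller" idea, a minimal sketch of surfacing validation results through a status condition instead of a validating webhook; the `Validated` condition type is an assumption made for this example:

```go
// Sketch only: validating in the controller and reporting the result on
// .status instead of rejecting the object in a validating webhook.
package statusvalidation

import (
	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// setValidatedCondition records the validation outcome on the object's
// conditions. While the condition is False the controller would stop
// reconciling further, and the user reads the reason from the status.
func setValidatedCondition(conditions *[]metav1.Condition, observedGeneration int64, validationErr error) {
	cond := metav1.Condition{
		Type:               "Validated",
		Status:             metav1.ConditionTrue,
		Reason:             "SpecValid",
		ObservedGeneration: observedGeneration,
	}
	if validationErr != nil {
		cond.Status = metav1.ConditionFalse
		cond.Reason = "SpecInvalid"
		cond.Message = validationErr.Error()
	}
	meta.SetStatusCondition(conditions, cond)
}
```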
-
Splitting out #269 (reply in thread) into a new thread:
Naming is obviously hard 😄. If we take the meaning behind the word "extension" out of the equation, do you think that conceptually it makes sense to have 2 distinctly named APIs that both, at the end of the day, represent a declarative expression for getting yaml from a package installed on a cluster? Besides the name, are there differences, in either spec fields or functional behavior, that would justify 2 different APIs? That's what I'd like to see us come up with before going down the two-APIs path.
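To make the question concrete, a purely hypothetical sketch (none of these types are the actual operator-controller API): if the namespace-scoped and cluster-scoped variants ended up with identical specs, that would be an argument for a single API differing only in scope:

```go
// Sketch only: none of these types are the actual operator-controller API.
// If the namespace-scoped and cluster-scoped resources share one spec, the
// "two APIs" really differ only in scope and defaulting.
package twoapis

// PackageSpec is the shared declarative request: "get YAML from this package
// installed on the cluster". Field names are illustrative.
type PackageSpec struct {
	PackageName        string `json:"packageName"`
	Version            string `json:"version,omitempty"`
	Channel            string `json:"channel,omitempty"`
	ServiceAccountName string `json:"serviceAccountName"`
}

// A hypothetical namespace-scoped spec and cluster-scoped spec that are
// field-for-field identical; any justification for two APIs would have to
// come from behavior, not shape.
type ExtensionSpec struct {
	PackageSpec `json:",inline"`
}

type ClusterExtensionSpec struct {
	PackageSpec `json:",inline"`
}
```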
-
I have a question related to the adoption of kapp-controller as part of OLMv1. I think only after I studied kapp-controller did I understand why the current proposal looks the way Andy described earlier... It looks like we are saying that deployment of k8s resources (CRDs, the operator Deployment, other stuff that is part of the bundle) will be done via kapp-controller, via an impersonated ServiceAccount specified in the CR representing the 'operator' - the current name is the Extension CR. And then, perhaps an Extension can be an API-only bundle or could have controller code, with the watched namespaces specified in some parameter of such an Extension CR. It would be up to the user creating the Extension to make sure the ServiceAccount has sufficient rights. If all written above is true, let me ask: what is the value-add of OLMv1 vs direct adoption of kapp-controller or Helm charts? Can you please clarify what is in the OLMv1 scope:
Thanks!
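For reference, a minimal sketch of the impersonation mechanism described above, using client-go; whether OLMv1 itself or kapp-controller underneath builds this client is exactly the open question, and the controller's own identity would additionally need RBAC permission to impersonate service accounts:

```go
// Sketch only: building a client that acts as the tenant-provided
// ServiceAccount via Kubernetes impersonation, so every manifest the bundle
// installs is limited to what that ServiceAccount is allowed to do.
package impersonation

import (
	"fmt"

	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// clientForServiceAccount returns a clientset whose requests are impersonated
// as the given ServiceAccount; creates/updates/deletes of bundle manifests
// would all go through a client like this.
func clientForServiceAccount(base *rest.Config, namespace, name string) (kubernetes.Interface, error) {
	cfg := rest.CopyConfig(base)
	cfg.Impersonate = rest.ImpersonationConfig{
		UserName: fmt.Sprintf("system:serviceaccount:%s:%s", namespace, name),
	}
	return kubernetes.NewForConfig(cfg)
}
```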
-
Opening another thread on another proposal which I'd like to discuss.
It simply uses different ServiceAccounts to deploy CRD bundles vs controller bundles. What we could and should discuss is how to make the cluster admin aware of CRDs being deployed, and what change management, approvals, etc. are required for this. I know the diagram might be a bit unclear at this point, as I didn't have much time to put this in writing; apologies, I will refine as we go and will shortly upload its draw.io sources for any edits. @ncdc @joelanford @stevekuznetsov et al. (sorry if you felt omitted, not intentional)
-
I want to surface in this ticket the discussion which took place via emails, a couple of conference calls, and f2f discussions.
The aim of this ticket is to discuss the fundamental differences OLM v1 introduces relative to OLM v0 when it comes to supporting and handling multi-tenant clusters. Specifically for large software vendors who have implemented a significant number of operators (like IBM, with 250+ operators) which can be deployed at the tenant-level scope (both operators and operands), we need to understand the migration strategy from OLM v0 to OLM v1.
The purpose of this document is to capture the terminology and enterprise use cases of IBM Cloud Pak customers with OLM-based operators.
Glossary
Cluster Admin - typically a member of the IT Infrastructure Ops team, responsible for providing the Kubernetes cluster infrastructure to the individual tenant teams.
Cluster Admins are responsible for:
Tenant - a team within the customer's organisation which is independent from other teams in the same company.
Tenants, aka Lines of Business, are responsible for providing value via deployment and usage of business apps like IBM Cloud Paks. This aligns most closely with the Kubernetes Teams multi-tenancy use case.
Tenants are provided a set of Kubernetes `namespaces`, and their users are granted namespace admin roles for those namespaces, plus roles to deploy operators into their own namespaces. Two types of Tenant namespaces are relevant here:
Tenant Admin - A Kubernetes user that has permissions to administer Kubernetes resources within one or more of the Control Plane namespaces that collectively represent the tenant. These users do not have privileges to affect other tenants.
IBM Cloud Pak - a suite of logically related business applications, using operators for their deployment and lifecycle management. Each of the applications might have one or more operators (typically one top-level operator and several nested operators).
Typically:
Separation of control-and-data - the deployment topology where a tenant's application is split into separate `namespaces`: one for operators and one or more for their operands. Customers set up firewalls (network policies) to block traffic to the k8s API Server from the operand namespaces.
Workload isolation - the deployment pattern where multiple tenants can deploy their applications like Cloud Paks (both operators and operands) independently from each other and manage their lifecycle independently from each other. It is acceptable to have multiple copies of the same operator at different versions, managed by different tenants.
Some cluster resources (`etcd`, `StorageClasses`, etc.) and some services (like Monitoring or Cert-manager) are cluster-level services. Yet, they shall be resilient to the noisy neighbour as much as possible.
Relevant Kubernetes Resources that are in scope of Tenant Namespaced isolation:
Scenarios
Typically, tenants are provided access to one or more `namespaces` which are under their control. Tenants deploy operator(s) into their `namespaces`. A single tenant has a single instance of any operator. It is accepted that a single operator watches multiple namespaces, as long as those namespaces belong to the same tenant.
Workload isolation scenarios
Cloud Pak operators can be installed either in `AllNamespace` mode (`openshift-operators`), meaning that there is just a single tenant on the cluster, or in `OwnNamespace` mode whenever there are multiple tenants on the cluster. It is expected to have multiple tenants each running the same Cloud Pak, most probably each at a different version (aka dev/test deployments).
Operator dependencies
IBM Cloud Paks leverage two types of Statically Defined OLM operator dependencies:
IBM Cloud Paks also create Operators dynamically (via IBM Operand Deployment Lifecycle Manager, ODLM), which enables auto-provisioning Operators and Operands (CRs) on demand when required, typically for shared capabilities/components (like user identity and access management, or the common UI platform experience). With the Cloud Pak 3.0 architecture, these shared capabilities/components are deployed as individual instances per tenant.
CatalogSource management
Currently, `CatalogSources` for IBM Cloud Paks are deployed by the Cluster Admin as global catalogs in `openshift-marketplace`, but this leads to issues when CatalogSources are updated, causing uncontrolled operator upgrades across tenant namespaces. Mitigations leverage catalog source pinning, usage of manual approval mode, and exploration of private (namespaced) CatalogSources (each tenant having its own CatalogSource).
Proposal for OLM v1 discussion
Introduction of the API Bundles
De-couple the API Bundle from the Operator controller bundle. Have a semver-versioned API Bundle which is cluster-scoped and registers only CRDs and their conversion webhooks (if needed). The Operator controller bundle can be deployed either in `All Namespace` mode (`openshift-operators`) or into each of the tenant namespaces. It is acceptable for controller operators to be deployed into multiple namespaces, each at a different version. There shall be a way for a controller operator to define a compatibility version range on the API Bundle(s). The controller code itself would be responsible for making sure the individual CRs are properly structured (TBD whether validation webhooks are really required) and for reacting accordingly by providing proper `.status` updates.
The API Bundle shall provide backwards and forwards compatibility as much as possible - and ideally there shall be a tool / method of validating the CRD evolution.
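A minimal sketch of how such a compatibility range could be enforced at install time, assuming both bundles expose plain semver metadata (the metadata itself is hypothetical; the check is ordinary semver constraint matching):

```go
// Sketch only: enforcing the compatibility version range a controller bundle
// declares against the installed, cluster-scoped API Bundle.
package apibundle

import (
	"fmt"

	"github.com/Masterminds/semver/v3"
)

// checkAPICompatibility verifies that the installed API Bundle version
// satisfies the range declared by the controller bundle, e.g. ">=1.2.0 <2.0.0".
func checkAPICompatibility(installedAPIBundleVersion, declaredRange string) error {
	installed, err := semver.NewVersion(installedAPIBundleVersion)
	if err != nil {
		return fmt.Errorf("API Bundle version %q is not valid semver: %w", installedAPIBundleVersion, err)
	}
	constraint, err := semver.NewConstraint(declaredRange)
	if err != nil {
		return fmt.Errorf("invalid compatibility range %q: %w", declaredRange, err)
	}
	if !constraint.Check(installed) {
		return fmt.Errorf("installed API Bundle %s does not satisfy the controller's requirement %q", installed, declaredRange)
	}
	return nil
}
```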
Ideally there shall be some migration tool (part of `operator-sdk`) which would take an existing OLM v0 bundle and separate it into two OLM v1 bundles: one with the APIs (CRDs) and one with the rest of the code, properly defining the dependency relationship. Perhaps it could even be executed automatically for backwards compatibility with OLM v0 operators on OLM v1 OCP clusters.
RBAC
There shall be a way in OLM v1 to define the RBAC for the tenant-level operator, based on prescriptive input (a list of `namespaces`) which defines the topology of the given tenant (a sort of `OperatorGroup`, which defines the `WATCH_NAMESPACES` for the controller and the `namespaces` to create RBAC for). Something like the `IBM Namespace Scope operator` or the `Oria` operator. Whenever a tenant is just a single namespace, no topology definition shall be required - defaults should be assumed and RBAC properly created based on the operator controller bundle metadata. Tenant admins shall delegate the CRUD of RBAC to OLM v1, based on the operator metadata and the topology definition. Such RBAC shall be easily auditable - via a few kubectl commands.
Customers would deploy the actual controller operators, which should automatically load the related API Bundles (TBD compatibility checking).
Dependency management
Dependencies (TBD whether we need them) shall deploy dependent operators in the same mode and namespace as the requesting operator, AND configure RBAC using the same topology definition.
Dependency resolution should be executed in the scope of the tenant (one or more `namespaces`).
Catalog and Subscription management
`CatalogSource` shall be tenant-level.
An update of a `CatalogSource` shall not impact other tenants.
There shall still be a concept of `Subscription` which allows subscribing to fixes. There shall be some equivalent of approval mode, but perhaps not working (like in OLM v0) at the level of a `namespace` (like `InstallPlan`), but rather at the level of a `tenant` (a set of namespaces).
There shall be a way to preview the available upgrade and what is involved with the upgrade (i.e. whether any additional cluster-level dependencies are introduced).
Related Info
TODOs
`Channel` - channels group all the fixes and updates to the semver-compatible version range
`Channels` and API Bundles, if any