Support CRIO runtime-based Kubernetes Clusters #418

Open
veluruchaithanya opened this issue Aug 16, 2024 · 12 comments

Comments

@veluruchaithanya

Hello, I'm looking to deploy Confidential Containers on Red Hat OpenShift to run Linux applications inside SGX and TDX secure enclaves. Unfortunately, I am unable to use the CoCo operator on Red Hat OpenShift, as OpenShift uses CRI-O as the container runtime, which is not yet supported by Confidential Containers. Can anyone provide a rough time estimate for when the CoCo operator will support CRI-O runtime-based Kubernetes clusters?

@beraldoleal
Member

Hi @veluruchaithanya, as far as I know we allow CRI-O for guest-pull image pulling, for instance, and that support was improved in #376.

Do you mind sharing the details or errors you are facing?

@veluruchaithanya
Author

Hi @beraldoleal, I have installed Red Hat OpenShift on a bare-metal server. It is an Intel server that supports the SGX and TDX features. I tried to install the CoCo operator on the OpenShift platform using the commands below, but it does not create the Kata runtime classes, and only one pod is up and running. The "cc-operator-daemon-install" and "cc-operator-pre-install-daemon" pods are not created.

[admin@hp_server ~]$ # Openshift Version Information
[admin@hp_server ~]$ oc version
Client Version: 4.12.54
Kustomize Version: v4.5.7
Server Version: 4.14.23
Kubernetes Version: v1.27.13+401bb48
[admin@hp_server ~]$ 
[admin@hp_server ~]$ # Deploy CoCo Operator
[admin@hp_server ~]$ export RELEASE_VERSION="v0.9.0"
[admin@hp_server ~]$ oc apply -k "github.com/confidential-containers/operator/config/release?ref=${RELEASE_VERSION}"
namespace/confidential-containers-system created
customresourcedefinition.apiextensions.k8s.io/ccruntimes.confidentialcontainers.org created
serviceaccount/cc-operator-controller-manager created
role.rbac.authorization.k8s.io/cc-operator-leader-election-role created
clusterrole.rbac.authorization.k8s.io/cc-operator-manager-role created
clusterrole.rbac.authorization.k8s.io/cc-operator-metrics-reader created
clusterrole.rbac.authorization.k8s.io/cc-operator-proxy-role created
rolebinding.rbac.authorization.k8s.io/cc-operator-leader-election-rolebinding created
clusterrolebinding.rbac.authorization.k8s.io/cc-operator-manager-rolebinding created
clusterrolebinding.rbac.authorization.k8s.io/cc-operator-proxy-rolebinding created
configmap/cc-operator-manager-config created
service/cc-operator-controller-manager-metrics-service created
deployment.apps/cc-operator-controller-manager created
[admin@hp_server ~]$ 
[admin@hp_server ~]$ # Get POD status
[admin@hp_server ~]$ oc get pods -n confidential-containers-system
NAME                                             READY   STATUS    RESTARTS   AGE
cc-operator-controller-manager-bc47d97c5-gjfvb   2/2     Running   0          67s
[admin@hp_server ~]$ 
[admin@hp_server ~]$ # Check CRD is created
[admin@hp_server ~]$ oc get crd | grep ccruntime
ccruntimes.confidentialcontainers.org                             2024-08-16T21:52:43Z
[admin@hp_server ~]$ 
[admin@hp_server ~]$ # Create Custom Resource(CR)
[admin@hp_server ~]$ oc apply -k github.com/confidential-containers/operator/config/samples/ccruntime/default?ref=${RELEASE_VERSION}
ccruntime.confidentialcontainers.org/ccruntime-sample created
[admin@hp_server ~]$ 
[admin@hp_server ~]$ # Get POD status
[admin@hp_server ~]$ oc get pods -n confidential-containers-system
NAME                                             READY   STATUS    RESTARTS   AGE
cc-operator-controller-manager-bc47d97c5-gjfvb   2/2     Running   0          3m30s
[admin@hp_server ~]$ 
[admin@hp_server ~]$ # Get Runtime Class info
[admin@hp_server ~]$ oc get runtimeclass
No resources found
[admin@hp_server ~]$ 
[admin@hp_server~]$ # Get all resources in namespace confidential-containers-system
[admin@hp_server ~]$ oc get all -n confidential-containers-system
Warning: apps.openshift.io/v1 DeploymentConfig is deprecated in v4.14+, unavailable in v4.10000+
NAME                                                 READY   STATUS    RESTARTS   AGE
pod/cc-operator-controller-manager-bc47d97c5-gjfvb   2/2     Running   0          6m38s

NAME                                                     TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/cc-operator-controller-manager-metrics-service   ClusterIP   xxx.xx.xx.xxx   <none>        8443/TCP   6m39s

NAME                                          DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                            AGE
daemonset.apps/cc-operator-daemon-uninstall   0         0         0       0            0           katacontainers.io/kata-runtime=cleanup   3m31s

NAME                                             READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/cc-operator-controller-manager   1/1     1            1           6m38s

NAME                                                       DESIRED   CURRENT   READY   AGE
replicaset.apps/cc-operator-controller-manager-bc47d97c5   1         1         1       6m38s
[admin@hp_server ~]$ 
[admin@hp_server ~]$ 
[admin@hp_server ~]$ 
[admin@hp_server ~]$ # Get all the events in the namespace confidential-containers-system
[admin@hp_server ~]$ oc get events -n confidential-containers-system
LAST SEEN   TYPE     REASON              OBJECT                                                MESSAGE
10m         Normal   LeaderElection      lease/69bf4d38.confidentialcontainers.org             cc-operator-controller-manager-bc47d97c5-gjfvb_fa3d344e-2970-4472-b5aa-06cab6cce147 became leader
10m         Normal   Scheduled           pod/cc-operator-controller-manager-bc47d97c5-gjfvb    Successfully assigned confidential-containers-system/cc-operator-controller-manager-bc47d97c5-gjfvb to xxxxxxx
10m         Normal   AddedInterface      pod/cc-operator-controller-manager-bc47d97c5-gjfvb    Add eth0 [xx.xxx.x.xx/xx] from ovn-kubernetes
10m         Normal   Pulled              pod/cc-operator-controller-manager-bc47d97c5-gjfvb    Container image "gcr.io/kubebuilder/kube-rbac-proxy:v0.13.1" already present on machine
10m         Normal   Created             pod/cc-operator-controller-manager-bc47d97c5-gjfvb    Created container kube-rbac-proxy
10m         Normal   Started             pod/cc-operator-controller-manager-bc47d97c5-gjfvb    Started container kube-rbac-proxy
10m         Normal   Pulled              pod/cc-operator-controller-manager-bc47d97c5-gjfvb    Container image "quay.io/confidential-containers/operator:v0.9.0" already present on machine
10m         Normal   Created             pod/cc-operator-controller-manager-bc47d97c5-gjfvb    Created container manager
10m         Normal   Started             pod/cc-operator-controller-manager-bc47d97c5-gjfvb    Started container manager
10m         Normal   SuccessfulCreate    replicaset/cc-operator-controller-manager-bc47d97c5   Created pod: cc-operator-controller-manager-bc47d97c5-gjfvb
10m         Normal   ScalingReplicaSet   deployment/cc-operator-controller-manager             Scaled up replica set cc-operator-controller-manager-bc47d97c5 to 1

I also see the same issue in the open ticket below, where it was mentioned that this cannot be fixed due to the lack of CRI-O support:
#67

@beraldoleal
Member

@veluruchaithanya thank you.

Could you double-check whether the worker nodes have the following label?

oc get nodes --show-labels | grep node.kubernetes.io\/worker

Also, the logs from the manager container would be great:

oc logs deployment/cc-operator-controller-manager -n confidential-containers-system -c manager

@ldoktor
Contributor

ldoktor commented Aug 21, 2024

@c3d @bpradipt could you please elaborate a bit on why the deployment does not work on OCP and what would need to be done (either here or in #67)? Pure kata-containers works well on OCP, so it really is just a matter of the operator.

@bpradipt
Member

@c3d @bpradipt could you please elaborate a bit on why the deployment does not work on OCP and what would need to be done (either here or in #67)? Pure kata-containers works well on OCP, so it really is just a matter of the operator.

@ldoktor I have not tried the latest CoCo operator with CRI-O support on OCP. If you can try it and share the details mentioned by @beraldoleal in #418 (comment), it'll help.

@ldoktor
Contributor

ldoktor commented Aug 21, 2024

I have, with similar results to @veluruchaithanya's. The worker nodes were not labelled node.kubernetes.io/worker out of the box, only node-role.kubernetes.io/worker=. After labeling one node:

oc label node $WORKER_NODE node.kubernetes.io/worker=
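
A hypothetical variant to label every worker at once, assuming the usual node-role.kubernetes.io/worker= role label is present on them:

for node in $(oc get nodes -l node-role.kubernetes.io/worker= -o name); do oc label "$node" node.kubernetes.io/worker=; done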

things started happening and I got daemonset.apps/cc-operator-pre-install-daemon, which failed to start due to security constraints. Using a hammer (don't use this in production):

oc adm policy add-scc-to-group privileged system:authenticated system:serviceaccounts
oc adm policy add-scc-to-group anyuid system:authenticated system:serviceaccounts
oc label --overwrite ns default pod-security.kubernetes.io/enforce=privileged pod-security.kubernetes.io/warn=baseline pod-security.kubernetes.io/audit=baseline

I got pod/cc-operator-pre-install-daemon-l5zb5, which ended up in CrashLoopBackOff because of:

[medic@fedora release ]$ oc -n confidential-containers-system logs daemonset.apps/cc-operator-pre-install-daemon 
INSTALL_COCO_CONTAINERD: false
INSTALL_OFFICIAL_CONTAINERD: false
INSTALL_VFIO_GPU_CONTAINERD: false
INSTALL_NYDUS_SNAPSHOTTER: true
Restarting cri-o
Failed to restart cri-o.service: Unit cri-o.service not found.

Looking at the services, it's called crio.service on OCP 4.14.

The manager logs (from the run without security constraints):
cc_system_manager.log

But I guess the main issues are the cri-o service name and the node labeling...
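
For reference, an illustrative way to confirm the actual service unit name on an OCP node from a debug shell (the unit name may vary between versions):

oc debug node/$WORKER_NODE -- chroot /host systemctl list-units 'crio*' 'cri-o*' --no-pager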

@ldoktor
Copy link
Contributor

ldoktor commented Aug 21, 2024

OK, I went ahead and tried:

diff --git a/install/pre-install-payload/scripts/reqs-deploy.sh b/install/pre-install-payload/scripts/reqs-deploy.sh
index 8ae05f0..a497f49 100755
--- a/install/pre-install-payload/scripts/reqs-deploy.sh
+++ b/install/pre-install-payload/scripts/reqs-deploy.sh
@@ -167,8 +167,13 @@ function uninstall_artifacts() {
 
 function restart_systemd_service() {
        host_systemctl daemon-reload
-       echo "Restarting ${container_engine}"
-       host_systemctl restart "${container_engine}"
+       if [ "${container_engine}" == "cri-o" ]; then
+               service_name="crio"
+       else
+               service_name="${container_engine}"
+       fi
+       echo "Restarting ${service_name}"
+       host_systemctl restart "${service_name}"
 }
 
 function configure_nydus_snapshotter_for_containerd() {

pushed it here: quay.io/ldoktor/reqs-payload:x86_64-3cdc15843a7b7e6eef7f7170832fe4eab75b6995 (in case you want to test it), and now I'm getting the runtime classes:

$ oc get runtimeclass
NAME            HANDLER         AGE
kata            kata-qemu       37s
kata-clh        kata-clh        38s
kata-qemu       kata-qemu       38s
kata-qemu-sev   kata-qemu-sev   38s
kata-qemu-snp   kata-qemu-snp   37s
kata-qemu-tdx   kata-qemu-tdx   38s

And all the expected resources:

NAME                                                  READY   STATUS    RESTARTS   AGE
pod/cc-operator-controller-manager-7b67459687-4kxsw   2/2     Running   0          45m
pod/cc-operator-daemon-install-pcxv9                  1/1     Running   0          2m32s
pod/cc-operator-pre-install-daemon-bbwhj              1/1     Running   0          2m41s

NAME                                                     TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/cc-operator-controller-manager-metrics-service   ClusterIP   172.30.37.180   <none>        8443/TCP   45m

NAME                                            DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                            AGE
daemonset.apps/cc-operator-daemon-install       1         1         1       1            1           node.kubernetes.io/worker=               2m33s
daemonset.apps/cc-operator-daemon-uninstall     0         0         0       0            0           katacontainers.io/kata-runtime=cleanup   45m
daemonset.apps/cc-operator-pre-install-daemon   1         1         1       1            1           node.kubernetes.io/worker=               45m

NAME                                             READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/cc-operator-controller-manager   1/1     1            1           45m

NAME                                                        DESIRED   CURRENT   READY   AGE
replicaset.apps/cc-operator-controller-manager-7b67459687   1         1         1       45m

@bpradipt do you think such a change would be acceptable, or is it too hackish? Another story will be the security constraints, but IIRC it should only be a matter of creating the right labels, shouldn't it?
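
For reference, a hypothetical per-namespace variant of the pod-security labels used above, instead of the cluster-wide SCC hammer (whether this alone is sufficient still needs to be verified):

oc label --overwrite ns confidential-containers-system pod-security.kubernetes.io/enforce=privileged pod-security.kubernetes.io/warn=baseline pod-security.kubernetes.io/audit=baseline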

@bpradipt
Member

Nice work @ldoktor. The reqs-deploy.sh changes look OK to me.
As for the node label and security constraints, we can document this for a start. Wdyt?

@ldoktor
Contributor

ldoktor commented Aug 21, 2024

The trick I used comes from e2e testing, but it's explicitly marked as DO-NOT-USE-IN-PRODUCTION because it effectively disables the pod security framework. Anyway, let me send the reqs-deploy fix and we can experiment with a proper fix for the permissions later.

ldoktor added a commit to ldoktor/coco-operator that referenced this issue Aug 21, 2024
the cri-o runtime uses "crio.service" service name (tested on OCP 4.14)

Related to: confidential-containers#418

Signed-off-by: Lukáš Doktor <[email protected]>
@ldoktor
Contributor

ldoktor commented Aug 22, 2024

Well, the operator already creates a service account, so we just need to allow that service account to perform privileged operations:

oc adm policy add-scc-to-user privileged -z cc-operator-controller-manager -n confidential-containers-system

With that, and using quay.io/ldoktor/reqs-payload:x86_64-3cdc15843a7b7e6eef7f7170832fe4eab75b6995, you should get your runtime classes installed.

If you then try to run a pod with one of those runtime classes, you get:

default 6m32s Warning FailedCreatePodSandBox pod/http-server Failed to create pod sandbox: rpc error: code = Unknown desc = CreateContainer failed: fork/exec /opt/kata/libexec/virtiofsd: permission denied: unknown

This is because one needs to tell SELinux that /opt/kata contains runnable binaries. This is pending on kata-containers (kata-containers/kata-containers#8417) and is temporarily handled by CI here: https://github.com/kata-containers/kata-containers/blob/main/ci/openshift-ci/cluster/deployments/relabel_selinux.yaml

To use it with the operator, which deploys things inside the confidential-containers-system namespace, you can use a similar approach, just changing the namespace and service account:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: relabel-selinux-daemonset
  namespace: confidential-containers-system
spec:
  selector:
    matchLabels:
      app: restorecon
  template:
    metadata:
      labels:
        app: restorecon
    spec:
      serviceAccountName: cc-operator-controller-manager
      hostPID: true
      containers:
        - name: relabel-selinux-container
          image: alpine
          securityContext:
            privileged: true
          command: ["/bin/sh", "-c", "
            set -e;
            echo Starting the relabel;
            nsenter --target 1 --mount bash -xc '
                command -v semanage &>/dev/null || { echo Does not look like a SELINUX cluster, skipping; exit 0; };
                for ENTRY in \
                    \"/(.*/)?opt/kata/bin(/.*)?\" \
                    \"/(.*/)?opt/kata/runtime-rs/bin(/.*)?\" \
                    \"/(.*/)?opt/kata/share/kata-.*(/.*)?(/.*)?\" \
                    \"/(.*/)?opt/kata/share/ovmf(/.*)?\" \
                    \"/(.*/)?opt/kata/share/tdvf(/.*)?\" \
                    \"/(.*/)?opt/kata/libexec(/.*)?\";
                do
                    semanage fcontext -a -t qemu_exec_t \"$ENTRY\" || semanage fcontext -m -t qemu_exec_t \"$ENTRY\" ||
 { echo \"Error in semanage command\"; exit 1; }
                done;
                restorecon -v -R /opt/kata || { echo \"Error in restorecon command\"; exit 1; }
            ';
            echo NSENTER_FINISHED_WITH: $?;
            sleep infinity"]
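
Once that daemonset has run on a node, a quick sanity check (illustrative; assumes debug access to the host) is to confirm the virtiofsd binary now carries the qemu_exec_t label:

oc debug node/$WORKER_NODE -- chroot /host ls -Z /opt/kata/libexec/virtiofsd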

With those changes I can successfully run a sample kata-qemu pod:

# Copyright (c) 2020 Red Hat, Inc.
#
# SPDX-License-Identifier: Apache-2.0
#
# Define the pod for a http server app.
---
apiVersion: v1
kind: Pod
metadata:
  name: http-server
  labels:
    app: http-server-app
spec:
  containers:
    - name: http-server
      image: registry.fedoraproject.org/fedora
      ports:
        - containerPort: 8080
      command: ["python3"]
      args: [ "-m", "http.server", "8080"]
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop:
            - ALL
        runAsNonRoot: true
        runAsUser: 1000
        seccompProfile:
          type: RuntimeDefault
  runtimeClassName: kata-qemu

(tested on OCP 4.16 today)
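
Assuming the manifest above is saved as http-server.yaml (hypothetical file name), a minimal way to try it and confirm it is running with the kata-qemu runtime class:

oc apply -f http-server.yaml
oc get pod http-server -o wide
oc get pod http-server -o jsonpath='{.spec.runtimeClassName}'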

Now @bpradipt, I know how to make things work, but I'm not sure how to add this to the operator to make it work out of the box (apart from adding it to the documentation). Do you know whether we want to add this and how to do it?

@bpradipt
Member

@ldoktor can kata-deploy handle the SELinux relabeling when installing on systems with SELinux enabled?

@ldoktor
Contributor

ldoktor commented Aug 22, 2024

@ldoktor can kata-deploy handle the SELinux relabeling when installing on systems with SELinux enabled?

That is what is being discussed in kata-containers/kata-containers#8417 (also note that this has nothing to do with the operator; it fails with plain kata-containers as well, so it needs to be addressed there).

ldoktor added a commit to ldoktor/coco-operator that referenced this issue Sep 2, 2024
the cri-o runtime naming is inconsistent and for service uses
"crio.service" name (without the "-"). To be consistent with
kata-containers override the container_engine variable to "crio" and
update all places where it's being used.

Related to: confidential-containers#418

Signed-off-by: Lukáš Doktor <[email protected]>
fidencio pushed a commit that referenced this issue Sep 4, 2024
the cri-o runtime naming is inconsistent and for service uses
"crio.service" name (without the "-"). To be consistent with
kata-containers override the container_engine variable to "crio" and
update all places where it's being used.

Related to: #418

Signed-off-by: Lukáš Doktor <[email protected]>