Adding EKS role annotation to Master ServiceAccount causes Jenkins pod restart loop #361
Comments
Our […] I can copy the Pod spec the operator creates, change the name, delete the […]
It has to be the operator killing the pods because it thinks they are incorrectly configured. In the logs, 0.2 seconds after "Creating a new Jenkins Master Pod" there is "Jenkins master pod restarted by operator".
EDIT: Fixed now, and issue updated. Figuring this out was made much harder by #362, which prevented changes to the Jenkins resource from fixing the problem. To fix the problem, you have to delete the Jenkins resource and re-create it without the offending annotation. |
Hi @Nuru, I think adding […] Cheers |
@tomaszsek Yes, that is probably right, EKS adds the aws-iam-token, but the operator should say something more useful in the logs, or give up because of the sync failure. I started on this mess because the operator gave up trying to modify the ServiceAccount after 10 failures, so I expected to see something like that in the logs here. I spent all day trying to figure out why the volume mounts were failing. It turns out the operator was killing the pod before the mounts finished. As you asked, here is the pod spec from the operator. I have redacted a few numbers that should not make a difference to you.
Pod.yaml
kind: Pod
apiVersion: v1
metadata:
  name: jenkins-jenkins
  namespace: jenkins
  selfLink: /api/v1/namespaces/jenkins/pods/jenkins-jenkins
  uid: a6534b92-c435-470f-bcd0-c6cc3916fa51
  resourceVersion: '14798790'
  creationTimestamp: '2020-05-08T06:07:25Z'
  deletionTimestamp: '2020-05-08T06:07:55Z'
  deletionGracePeriodSeconds: 30
  labels:
    app: jenkins-operator
    jenkins-cr: jenkins
  annotations:
    kubernetes.io/psp: eks.privileged
  ownerReferences:
    - apiVersion: jenkins.io/v1alpha2
      kind: Jenkins
      name: jenkins
      uid: 7a8f1ee9-10d7-41f1-a3b7-49d7a0dc6b07
      controller: true
      blockOwnerDeletion: true
spec:
  volumes:
    - name: aws-iam-token
      projected:
        sources:
          - serviceAccountToken:
              audience: sts.amazonaws.com
              expirationSeconds: 86400
              path: token
        defaultMode: 420
    - name: jenkins-home
      emptyDir: {}
    - name: scripts
      configMap:
        name: jenkins-operator-scripts-jenkins
        defaultMode: 511
    - name: init-configuration
      configMap:
        name: jenkins-operator-init-configuration-jenkins
        defaultMode: 420
    - name: operator-credentials
      secret:
        secretName: jenkins-operator-credentials-jenkins
        defaultMode: 420
    - name: custom-css
      configMap:
        name: custom-css
        defaultMode: 420
    - name: backup
      persistentVolumeClaim:
        claimName: jenkins-backup
    - name: jenkins-operator-jenkins-token-8hsq4
      secret:
        secretName: jenkins-operator-jenkins-token-8hsq4
        defaultMode: 420
  containers:
    - name: jenkins-master
      image: 'jenkinsci/blueocean:1.23.1'
      command:
        - bash
        - '-c'
        - >-
          /var/jenkins/scripts/init.sh && exec /sbin/tini -s --
          /usr/local/bin/jenkins.sh
      ports:
        - name: http
          containerPort: 8080
          protocol: TCP
        - name: slavelistener
          containerPort: 50000
          protocol: TCP
      env:
        - name: COPY_REFERENCE_FILE_LOG
          value: /var/lib/jenkins/copy_reference_file.log
        - name: JAVA_OPTS
          value: >-
            -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap
            -XX:MaxRAMFraction=1 -Djenkins.install.runSetupWizard=false
            -Djava.awt.headless=true
        - name: JENKINS_HOME
          value: /var/lib/jenkins
        - name: AWS_ROLE_ARN
          value: >-
            arn:aws:iam::ACCOUNT:role/eks-jenkins-operator-jenkins@jenkins
        - name: AWS_WEB_IDENTITY_TOKEN_FILE
          value: /var/run/secrets/eks.amazonaws.com/serviceaccount/token
      resources:
        limits:
          cpu: 1500m
          memory: 3Gi
        requests:
          cpu: 500m
          memory: 500Mi
      volumeMounts:
        - name: jenkins-home
          mountPath: /var/lib/jenkins
        - name: scripts
          readOnly: true
          mountPath: /var/jenkins/scripts
        - name: init-configuration
          readOnly: true
          mountPath: /var/jenkins/init-configuration
        - name: operator-credentials
          readOnly: true
          mountPath: /var/jenkins/operator-credentials
        - name: custom-css
          mountPath: /var/jenkins/jenkins/userContent/custom-css
        - name: jenkins-operator-jenkins-token-8hsq4
          readOnly: true
          mountPath: /var/run/secrets/kubernetes.io/serviceaccount
        - name: aws-iam-token
          readOnly: true
          mountPath: /var/run/secrets/eks.amazonaws.com/serviceaccount
      livenessProbe:
        httpGet:
          path: /login
          port: http
          scheme: HTTP
        initialDelaySeconds: 80
        timeoutSeconds: 5
        periodSeconds: 10
        successThreshold: 1
        failureThreshold: 12
      readinessProbe:
        httpGet:
          path: /login
          port: http
          scheme: HTTP
        initialDelaySeconds: 30
        timeoutSeconds: 1
        periodSeconds: 10
        successThreshold: 1
        failureThreshold: 3
      terminationMessagePath: /dev/termination-log
      terminationMessagePolicy: File
      imagePullPolicy: IfNotPresent
    - name: backup
      image: 'virtuslab/jenkins-operator-backup-pvc:v0.0.8'
      env:
        - name: BACKUP_COUNT
          value: '9'
        - name: BACKUP_DIR
          value: /backup
        - name: JENKINS_HOME
          value: /jenkins-home
        - name: AWS_ROLE_ARN
          value: >-
            arn:aws:iam::ACCOUNT:role/eks-jenkins-operator-jenkins@jenkins
        - name: AWS_WEB_IDENTITY_TOKEN_FILE
          value: /var/run/secrets/eks.amazonaws.com/serviceaccount/token
      resources:
        limits:
          cpu: 100m
          memory: 100Mi
        requests:
          cpu: 50m
          memory: 50Mi
      volumeMounts:
        - name: jenkins-home
          mountPath: /jenkins-home
        - name: backup
          mountPath: /backup
        - name: jenkins-operator-jenkins-token-8hsq4
          readOnly: true
          mountPath: /var/run/secrets/kubernetes.io/serviceaccount
        - name: aws-iam-token
          readOnly: true
          mountPath: /var/run/secrets/eks.amazonaws.com/serviceaccount
      terminationMessagePath: /dev/termination-log
      terminationMessagePolicy: File
      imagePullPolicy: IfNotPresent
  restartPolicy: Never
  terminationGracePeriodSeconds: 30
  dnsPolicy: ClusterFirst
  serviceAccountName: jenkins-operator-jenkins
  serviceAccount: jenkins-operator-jenkins
  nodeName: REDACTED.us-east-2.compute.internal
  securityContext: {}
  schedulerName: default-scheduler
  tolerations:
    - key: node.kubernetes.io/not-ready
      operator: Exists
      effect: NoExecute
      tolerationSeconds: 300
    - key: node.kubernetes.io/unreachable
      operator: Exists
      effect: NoExecute
      tolerationSeconds: 300
  priority: 0
  enableServiceLinks: true
status:
  phase: Pending
  conditions:
    - type: Initialized
      status: 'True'
      lastProbeTime: null
      lastTransitionTime: '2020-05-08T06:07:25Z'
    - type: Ready
      status: 'False'
      lastProbeTime: null
      lastTransitionTime: '2020-05-08T06:07:25Z'
      reason: ContainersNotReady
      message: 'containers with unready status: [jenkins-master backup]'
    - type: ContainersReady
      status: 'False'
      lastProbeTime: null
      lastTransitionTime: '2020-05-08T06:07:25Z'
      reason: ContainersNotReady
      message: 'containers with unready status: [jenkins-master backup]'
    - type: PodScheduled
      status: 'True'
      lastProbeTime: null
      lastTransitionTime: '2020-05-08T06:07:25Z'
  hostIP: REDACTED
  startTime: '2020-05-08T06:07:25Z'
  containerStatuses:
    - name: backup
      state:
        terminated:
          exitCode: 0
          startedAt: null
          finishedAt: null
      lastState: {}
      ready: false
      restartCount: 0
      image: 'virtuslab/jenkins-operator-backup-pvc:v0.0.8'
      imageID: ''
    - name: jenkins-master
      state:
        terminated:
          exitCode: 0
          startedAt: null
          finishedAt: null
      lastState: {}
      ready: false
      restartCount: 0
      image: 'jenkinsci/blueocean:1.23.1'
      imageID: ''
  qosClass: Burstable
That is what I was thinking. EKS has added a new volume to the pod (aws-iam-token in the spec above) and new env variables to the containers (AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE).
We will make a fix for that. We are working on moving to a Deployment instead of a Pod (#195). After #195 is complete, this kind of error should no longer appear. |
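For reference, the fields that the operator did not create can be picked out of the pod spec posted earlier in the thread. The fragment below is only an excerpt of that spec, with everything else omitted. The values are copied from it, including the redacted `ACCOUNT` placeholder, and the same env variables and mount also appear on the `backup` container.

```yaml
# Excerpt of the posted pod spec: only the EKS-injected fields are shown.
spec:
  volumes:
    - name: aws-iam-token
      projected:
        sources:
          - serviceAccountToken:
              audience: sts.amazonaws.com
              expirationSeconds: 86400
              path: token
        defaultMode: 420
  containers:
    - name: jenkins-master   # identical env and mount are present on "backup"
      env:
        - name: AWS_ROLE_ARN
          value: 'arn:aws:iam::ACCOUNT:role/eks-jenkins-operator-jenkins@jenkins'
        - name: AWS_WEB_IDENTITY_TOKEN_FILE
          value: /var/run/secrets/eks.amazonaws.com/serviceaccount/token
      volumeMounts:
        - name: aws-iam-token
          readOnly: true
          mountPath: /var/run/secrets/eks.amazonaws.com/serviceaccount
```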
Thank you for getting this fixed. I understand why the operator rejects the changes EKS made, and I expect you are right that switching to a Deployment will provide a clean fix. Do you have an ETA for fixing this? It is a blocker for us. |
Any thoughts on timing here @tomaszsek? This is becoming a blocker. |
Is there any update on the fix for this issue? |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If this issue is still affecting you, just comment with any updates and we'll keep it open. Thank you for your contributions. |
It is possible to work around this issue: the volumes and env variables that EKS adds must also be added to the Jenkins CR. It is not very elegant, but it works. |
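A rough sketch of what that workaround might look like follows. This is not a tested manifest. It assumes the `jenkins.io/v1alpha2` Jenkins CR accepts pod volumes under `spec.master.volumes` and per-container `env` and `volumeMounts` under `spec.master.containers`. The resource name, namespace, image, role ARN, and volume values are taken from the pod spec posted earlier in the thread (`ACCOUNT` is the redaction placeholder used there). Whether the operator needs every field to match exactly is not confirmed by the thread.

```yaml
# Hypothetical workaround sketch: pre-declare in the CR what EKS would inject,
# so the pod the operator expects matches the pod that EKS mutates.
apiVersion: jenkins.io/v1alpha2
kind: Jenkins
metadata:
  name: jenkins
  namespace: jenkins
spec:
  serviceAccount:
    annotations:
      eks.amazonaws.com/role-arn: 'arn:aws:iam::ACCOUNT:role/eks-jenkins-operator-jenkins@jenkins'
  master:
    volumes:
      - name: aws-iam-token
        projected:
          sources:
            - serviceAccountToken:
                audience: sts.amazonaws.com
                expirationSeconds: 86400
                path: token
          defaultMode: 420
    containers:
      # Assumption: the same env and volumeMounts must be repeated for every
      # container in the pod (the posted spec also has a "backup" container).
      - name: jenkins-master
        image: 'jenkinsci/blueocean:1.23.1'
        env:
          - name: AWS_ROLE_ARN
            value: 'arn:aws:iam::ACCOUNT:role/eks-jenkins-operator-jenkins@jenkins'
          - name: AWS_WEB_IDENTITY_TOKEN_FILE
            value: /var/run/secrets/eks.amazonaws.com/serviceaccount/token
        volumeMounts:
          - name: aws-iam-token
            readOnly: true
            mountPath: /var/run/secrets/eks.amazonaws.com/serviceaccount
```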
Is there any progress on that? This is a kind of blocker for those who want to develop, e.g., a custom backup to AWS using a proper IAM-based solution. |
Hi! Thanks in advance! |
Hello! I was wondering if there is any update here. I'm still having issues integrating https://eksctl.io/usage/iamserviceaccounts/ with jenkins-operator. |
The workaround is to add the "jenkins.io/use-deployment" annotation to the Jenkins CR:
apiVersion: jenkins.io/v1alpha2
kind: Jenkins
metadata:
  name: master
  annotations:
    "jenkins.io/use-deployment": 'true'
This will create a Deployment between the Jenkins CR and the Pod. I found it here |
I tried this, but Configuration as Code does not work any more and the seed job worker cannot connect. |
Expected Behavior
On Amazon EKS, adding the eks.amazonaws.com/role-arn annotation to the Jenkins spec.serviceAccount.annotations would allow the ServiceAccount to assume an AWS IAM role.
Actual Behavior
Simply adding the annotation to an existing Jenkins resource fails because of #362. Creating a Jenkins resource from scratch containing the annotation fails with a restart loop:
Log output from operator, trimmed for brevity:
Steps to Reproduce the Problem