I had a few thoughts while working on and reviewing things related to quotas:
The update_users_cpu_quota function, and other quota functions that check env variables internally, have non-obvious behavior. For example, the update_... verb in update_users_cpu_quota implies that it will update CPU quotas, but sometimes it will not if the combination of env variables is not right.
The WORKFLOW_TERMINATION_QUOTA_UPDATE_POLICY env variable not only configures which quotas are calculated when a workflow finishes, but also affects whether quotas are calculated in the REST API endpoint in r-workflow-controller during file upload or deletion (line). This is not obvious from the name: TERMINATION doesn't imply that it will also stop quota calculation during upload/deletion. From the user/admin perspective there is maybe not much to worry about, but for a developer it is confusing.
The above changes were introduced because we wanted to quickly disable disk quotas to benchmark the cluster (and I was part of it), but I am not sure it is a good long-term solution.
I don't have any concrete suggestions on how to make the quota logic better (except, maybe, moving the env variable checks out of the update_... functions). It would be good to hear your opinion on this, and maybe you have other comments related to the quota logic.
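To make the "move the env variable checks out" idea a bit more concrete, here is a rough sketch, not the actual project code. The signature of update_users_cpu_quota, the comma-separated format of WORKFLOW_TERMINATION_QUOTA_UPDATE_POLICY, the "cpu" value, and the helper names enabled_quota_resources and on_workflow_terminated are all assumptions made for illustration.

```python
import os
from typing import Dict, List, Mapping


def enabled_quota_resources(env: Mapping[str, str] = os.environ) -> List[str]:
    """Read the policy once, at the edge of the system.

    Assumption: the variable holds a comma-separated list such as "cpu,disk".
    """
    raw = env.get("WORKFLOW_TERMINATION_QUOTA_UPDATE_POLICY", "")
    return [item.strip() for item in raw.split(",") if item.strip()]


def update_users_cpu_quota(usage: Dict[str, int], user: str, cpu_ms: int) -> None:
    """Hypothetical signature: always updates, with no env checks inside."""
    usage[user] = usage.get(user, 0) + cpu_ms


def on_workflow_terminated(
    usage: Dict[str, int], user: str, cpu_ms: int, policy: List[str]
) -> None:
    """The caller decides whether to update, so the decision is visible here."""
    if "cpu" in policy:
        update_users_cpu_quota(usage, user, cpu_ms)
```

With this shape, update_... always does what its name says, and a caller such as the upload/deletion endpoint could take its own explicit policy instead of silently reusing the termination one.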
Another thing is that when a workflow finishes or fails, quotas are recalculated inside an sqlalchemy hook (code). This is probably the wrong decision because we cannot easily retry or delay the recalculation.
A better way might be to handle quota recalculation asynchronously in a separate consumer with a dedicated queue.
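As a minimal sketch of what such a consumer could look like: the example below uses an in-process queue.Queue and a thread as a stand-in for a real message broker, and the names quota_consumer, recalculate, and max_attempts are hypothetical. It only illustrates the property the hook lacks, namely that a failed recalculation can be put back on the queue and retried.

```python
import queue
import threading
from typing import Callable, List


def quota_consumer(
    tasks: queue.Queue,
    recalculate: Callable[[str], None],
    failed: List[str],
    max_attempts: int = 3,
) -> None:
    """Consume (workflow_id, attempt) items until a None sentinel arrives."""
    while True:
        item = tasks.get()
        try:
            if item is None:
                return
            workflow_id, attempt = item
            try:
                recalculate(workflow_id)
            except Exception:
                if attempt < max_attempts:
                    # Re-enqueue for another attempt instead of losing the update.
                    tasks.put((workflow_id, attempt + 1))
                else:
                    # Give up and record the workflow for later inspection.
                    failed.append(workflow_id)
        finally:
            # Mark the item as handled so that tasks.join() can return.
            tasks.task_done()
```

A producer would put (workflow_id, 1) on the queue when a workflow finishes or fails, so the database hook only enqueues and never does the recalculation itself. Delaying a retry is not shown here; with a real broker that would likely be handled by the broker or the consumer's scheduling.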
Originated in #156 (review)