Updating requiredResources in Application Management API #280
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -897,13 +897,19 @@ components: | |
and the value of the Edge Cloud Provider | ||
object. This value is used to identify an Edge Cloud zone | ||
between Edge Clouds from different Edge Cloud Providers. | ||
required: | ||
- edgeCloudZoneId | ||
- edgeCloudZoneName | ||
- edgeCloudProvider | ||
properties: | ||
edgeCloudZoneId: | ||
$ref: '#/components/schemas/EdgeCloudZoneId' | ||
edgeCloudZoneName: | ||
$ref: '#/components/schemas/EdgeCloudZoneName' | ||
edgeCloudZoneStatus: | ||
$ref: '#/components/schemas/EdgeCloudZoneStatus' | ||
edgeCloudZoneFlavors: | ||
$ref: '#/components/schemas/EdgeCloudZoneFlavors' | ||
edgeCloudProvider: | ||
$ref: '#/components/schemas/EdgeCloudProvider' | ||
edgeCloudRegion: | ||
|
@@ -925,6 +931,14 @@ components: | |
- unknown | ||
default: unknown | ||
|
||
EdgeCloudZoneFlavors: | ||
There is another parameter, "Flavor", further down in the YAML. How does it correlate with EdgeCloudZoneFlavors?
||
description: List of unique Name IDs of Infrastructure Flavors | ||
How should a flavor be visualized from a developer's point of view? Does a flavor represent a virtual machine (VM) or a server node with a given set of resources mapped to it?

If, say, it represents a single node, then the gpuMemory attribute alone may not be sufficient to allocate such a resource. Attributes such as gpuCount and gpuFamily may also need to be considered to meet the requirements of workloads that need a GPU.

Hi @gunjald, given that flavors introduce complexity to the API, I'll implement a change that removes them. This will give operators greater flexibility to allocate workloads of any size.
||
type: array | ||
items: | ||
type: string | ||
description: Flavor ID | ||
example: A1.2C2M.GPU8G | ||
|
||
If we decide to use flavors, I would suggest that each flavor carry the list and spec of all the resources it provides.

I agree, Nicola, good point. I'll convert the string to an object with the spec information. Thanks!
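As a rough illustration of the suggestion above, a flavor could be modelled as an object that carries its own resource spec instead of an opaque string. This is only a sketch: the `FlavorSpec` class, the `fits` helper, and every field name other than the flavor ID are hypothetical and do not come from the PR, and the values shown are made up for the example.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class FlavorSpec:
    """Hypothetical flavor object; the field names are assumptions, not PR content."""
    flavor_id: str                        # e.g. "A1.2C2M.GPU8G", as in the schema example
    num_cpu: int                          # virtual CPUs (assumed field)
    memory_gb: int                        # memory in gigabytes (assumed field)
    gpu_count: int = 0                    # number of GPUs (assumed field)
    gpu_memory_gb: Optional[int] = None   # memory per GPU (assumed field)


def fits(flavor: FlavorSpec, need_cpu: int, need_memory_gb: int, need_gpus: int = 0) -> bool:
    """Return True if the flavor provides at least the requested resources."""
    return (
        flavor.num_cpu >= need_cpu
        and flavor.memory_gb >= need_memory_gb
        and flavor.gpu_count >= need_gpus
    )


# Illustrative values only; the PR does not define how a flavor ID maps to resources.
example = FlavorSpec("A1.2C2M.GPU8G", num_cpu=2, memory_gb=2, gpu_count=1, gpu_memory_gb=8)
print(asdict(example))
print(fits(example, need_cpu=2, need_memory_gb=2, need_gpus=1))
```

With an explicit spec like this, a developer would not need to decode the flavor name to know what it provides.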
||
ErrorInfo: | ||
type: object | ||
description: Information about the error | ||
|
@@ -961,6 +975,235 @@ components: | |
type: integer | ||
description: Number of GPUs | ||
|
||
Flavor: | ||
type: string | ||
description: | | ||
Preset configuration for compute, memory, GPU, | ||
and storage capacity (e.g. A1.2C4M.GPU8G, A1.2C4M.GPU16G, A1.4C8M, ...) | ||
example: A1.2C2M.GPU8G | ||
We should add a GET API that returns the list of flavors, so that users know which flavor names they can use.

I also tend to agree :-) Maybe we can add query parameters to GET /edge-cloud-zones to retrieve the list of flavors, and in the future extend this to other resources via query parameters. Just a suggestion, though.

Agreed, I'll add an entry in GET /edge-cloud-zones to report the flavors.
||
|
||
NodePool: | ||
description: | | ||
In general, the issue I see with this approach is that it offers the Application Developer too many ways to ask for the compute it needs. Developers can submit too many possible combinations through the API, and the platform has to work out where it can serve such diverse combinations of resources or clusters. Also, as a developer I may need to run multiple applications on the same cluster; how can I express that here?

The approach is one application to one infrastructure resource (VM, Kubernetes cluster, container, Docker Compose). This means we avoid managing infrastructure independently of the application. For running multiple applications on the same Kubernetes cluster, Helm packages provide a way to bundle them together. A Helm package can contain multiple application charts, such as a database chart and a web application chart, effectively treating them as a single application for deployment. This approach aligns well with node pools. Developers can use node pools to create clusters with a mix of nodes, for example one with a GPU and others without, optimizing resource allocation. The application-to-node-pool mapping is done through labels, which developers can reference in Helm chart values for node affinity.

Tying one app to one type of infra, such as a k8s cluster, looks like a very resource-heavy approach unless we provide some way to deploy multiple applications on one cluster. Also, what happens if cluster creation fails? Application onboarding fails too, since both are now one atomic package. Another issue is that once an app has been accepted along with its infra, I cannot change the infra, e.g. reduce or increase the resources if needed. So I still think that, especially for cluster-type infra, this will be hard to implement: it could mean creating a cluster dynamically, which can be a very time-consuming process. If we decouple infra creation, there are options: the platform could create clusters offline and provide an API to retrieve the cluster ID, or even provide an infra creation API to manage infra for applications, with that information used in the App LCM API to link the two. There are ways to do it, but as it stands the approach seems to couple infra and applications tightly and may reduce reusability. More input would help here.

Hi @gunjald, sounds good. I think it would be interesting to discuss creating an API to manage the infrastructure lifecycle (Create, Update, Delete). Enabling a Kubernetes cluster reference within the Application Management API would be easy. For now, I think it's safe to keep things this way, allowing developers to use a Kubernetes cluster and define the minimum configuration details required by their application. We can then open a discussion about how to design a more comprehensive API for managing infrastructure resources.

I think this should be discussed further, as it changes how some of us see the problem we are trying to solve. While it makes sense for VMs and containers, I'm not sure about k8s clusters. It's my understanding that operators want to use the same infra for multiple app providers/app types. In that case, packaging multiple apps in the same Helm chart, as suggested above, cannot be done.
||
Set of worker nodes in a Kubernetes cluster. | ||
type: object | ||
required: | ||
- flavor | ||
- numNodes | ||
properties: | ||
name: | ||
type: string | ||
example: nodepool1 | ||
description: | | ||
Nodepool Name (Autogenerated if not provided in the request) | ||
flavor: | ||
$ref: '#/components/schemas/Flavor' | ||
numNodes: | ||
Shouldn't it be something like numFlavors, for better correlation?
||
type: integer | ||
example: 1 | ||
description: Number of workers that compose the node pool. | ||
|
||
K8sAddons: | ||
description: | | ||
Addons for the Kubernetes cluster. | ||
Additional addons should be defined in the application Helm chart | ||
(Service Mesh, Serverless, AI). | ||
type: object | ||
properties: | ||
monitoring: | ||
type: boolean | ||
example: true | ||
default: false | ||
description: Enable monitoring for Kubernetes cluster. | ||
ingress: | ||
type: boolean | ||
example: true | ||
default: false | ||
description: Enable ingress for Kubernetes cluster. | ||
|
||
K8sNetworking: | ||
description: | | ||
Kubernetes networking definition | ||
type: object | ||
properties: | ||
primaryNetwork: | ||
description: Definition of Kubernetes primary Network | ||
type: object | ||
properties: | ||
provider: | ||
description: CNI provider name | ||
type: string | ||
example: cilium | ||
version: | ||
description: CNI provider version | ||
type: string | ||
example: "1.13" | ||
additionalNetworks: | ||
description: Additional Networks for the Kubernetes cluster. | ||
type: array | ||
items: | ||
type: object | ||
description: Additional network interface definition | ||
properties: | ||
name: | ||
description: Additional Network Name | ||
type: string | ||
example: net1 | ||
interfaceType: | ||
description: | | ||
Type of additional Interface: | ||
netdevice: (SR-IOV) A regular kernel network device in the | ||
Network Namespace (netns) of the container | ||
vfio-pci: (SR-IOV) A PCI network interface directly mounted | ||
in the container | ||
interface: Additional interface to be used by cni plugins | ||
such as macvlan, ipvlan | ||
Note: The use of SR-IOV interfaces automatically | ||
configure the required kernel parameters for the nodes. | ||
type: string | ||
example: vfio-pci | ||
enum: | ||
- netdevice | ||
- vfio-pci | ||
- interface | ||
|
||
AdditionalStorage: | ||
description: Additional storage for the application. | ||
type: array | ||
items: | ||
type: object | ||
required: | ||
- storageSize | ||
- mountPoint | ||
properties: | ||
name: | ||
type: string | ||
description: Name of additional storage resource. | ||
example: logs | ||
storageSize: | ||
type: string | ||
description: Additional persistent volume for the application. | ||
example: 80GB | ||
pattern: ^\d+(GB|MB)$ | ||
mountPoint: | ||
type: string | ||
description: Location of additional storage resource. | ||
example: /logs | ||
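To make the `storageSize` pattern above concrete, here is a small sketch of how a client might check a value before sending it. The regular expression is copied from the diff; the helper name `split_storage_size` is hypothetical.

```python
import re
from typing import Tuple

# Pattern copied from the storageSize property in the diff.
STORAGE_SIZE = re.compile(r"^\d+(GB|MB)$")


def split_storage_size(value: str) -> Tuple[int, str]:
    """Split a storageSize string into (amount, unit); raise ValueError if invalid."""
    match = STORAGE_SIZE.fullmatch(value)
    if match is None:
        raise ValueError(f"invalid storageSize: {value!r}")
    unit = match.group(1)
    return int(value[: -len(unit)]), unit


print(split_storage_size("80GB"))
```

Note that the pattern accepts only whole numbers and upper-case units, so values such as `1.5GB`, `80 GB`, or `80gb` are rejected.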
|
||
Vcpu: | ||
type: string | ||
pattern: ^\d+((\.\d{1,3})|(m))?$ | ||
description: | | ||
Number of vCPUs in whole (e.g. 1), decimal (e.g. 0.500) up to | ||
millivCPU precision, or millivCPU (e.g. 500m) format. | ||
example: "500m" | ||
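The three accepted `Vcpu` formats can be checked with a short sketch like the one below. The regular expression is copied from the diff. Converting to millivCPU follows the usual Kubernetes convention (1 vCPU = 1000m), which is an assumption here rather than something the PR states, and the helper name `vcpu_to_millicpu` is hypothetical.

```python
import re
from decimal import Decimal

# Pattern copied from the Vcpu schema in the diff.
VCPU = re.compile(r"^\d+((\.\d{1,3})|(m))?$")


def vcpu_to_millicpu(value: str) -> int:
    """Convert a Vcpu string to millivCPU, assuming 1 vCPU == 1000m."""
    if VCPU.fullmatch(value) is None:
        raise ValueError(f"invalid Vcpu value: {value!r}")
    if value.endswith("m"):
        return int(value[:-1])          # already expressed in millivCPU
    return int(Decimal(value) * 1000)   # whole or decimal vCPUs


for sample in ["1", "0.500", "500m"]:
    print(sample, "->", vcpu_to_millicpu(sample))
```

The pattern allows at most three decimal places and does not allow a decimal value combined with the `m` suffix, so `0.5000` and `1.5m` are both rejected.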
|
||
KubernetesResources: | ||
description: Definition of Kubernetes Cluster Infrastructure. | ||
type: object | ||
required: | ||
- nodePools | ||
- infraKind | ||
properties: | ||
infraKind: | ||
description: Type of infrastructure for the application. | ||
type: string | ||
example: kubernetes | ||
enum: | ||
- kubernetes | ||
infraKind is part of the top-level attribute KubernetesResources and looks redundant with the value "kubernetes", since KubernetesResources itself already indicates that it is a Kubernetes resource.

This is how discriminators work in OpenAPI: https://swagger.io/docs/specification/v3_0/data-models/inheritance-and-polymorphism/
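To illustrate why `infraKind` is needed even though it looks redundant, the sketch below shows roughly what a consumer of the `oneOf` does with the discriminator: it reads `infraKind` from the payload and uses it to pick the schema to validate against. The mapping is copied from the `RequiredResources` discriminator in this diff; the function name `resolve_schema` and the sample payload are hypothetical.

```python
from typing import Any, Dict

# Mapping copied from the RequiredResources discriminator in the diff.
INFRA_KIND_TO_SCHEMA = {
    "kubernetes": "#/components/schemas/KubernetesResources",
    "virtualMachine": "#/components/schemas/VmResources",
    "container": "#/components/schemas/ContainerResources",
    "dockerCompose": "#/components/schemas/DockerComposeResources",
}


def resolve_schema(required_resources: Dict[str, Any]) -> str:
    """Return the schema reference selected by the payload's infraKind."""
    kind = required_resources.get("infraKind")
    if not isinstance(kind, str) or kind not in INFRA_KIND_TO_SCHEMA:
        raise ValueError(f"unknown or missing infraKind: {kind!r}")
    return INFRA_KIND_TO_SCHEMA[kind]


# Sample payload for illustration; only the property names come from the diff.
payload = {
    "infraKind": "kubernetes",
    "nodePools": [{"flavor": "A1.2C2M.GPU8G", "numNodes": 1}],
}
print(resolve_schema(payload))
```

Without the property, a validator would have to try each `oneOf` branch in turn, and two branches with similar shapes (for example `VmResources` and `DockerComposeResources`) could not be told apart.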
||
version: | ||
type: string | ||
description: Minimum Kubernetes Version. | ||
example: "1.29" | ||
controlPlaneHa: | ||
type: boolean | ||
description: | | ||
True: Enable high availability of the Kubernetes | ||
control plane (3 nodes) | ||
False: Disable high availability of the Kubernetes | ||
control plane (1 node) | ||
default: false | ||
nodePools: | ||
type: array | ||
description: | | ||
Description of worker node set in a Kubernetes cluster. | ||
items: | ||
$ref: '#/components/schemas/NodePool' | ||
additionalStorage: | ||
type: string | ||
description: | | ||
Amount of persistent storage allocated to the Kubernetes PVC. | ||
example: 80GB | ||
pattern: ^\d+(GB|MB)$ | ||
networking: | ||
$ref: '#/components/schemas/K8sNetworking' | ||
addons: | ||
$ref: '#/components/schemas/K8sAddons' | ||
|
||
VmResources: | ||
description: Definition of Virtual Machine Infrastructure | ||
type: object | ||
required: | ||
- flavor | ||
- infraKind | ||
properties: | ||
infraKind: | ||
description: Type of infrastructure for the application. | ||
type: string | ||
example: virtualMachine | ||
enum: | ||
- virtualMachine | ||
flavor: | ||
$ref: '#/components/schemas/Flavor' | ||
additionalStorages: | ||
$ref: '#/components/schemas/AdditionalStorage' | ||
|
||
DockerComposeResources: | ||
description: Definition of Docker Compose Infrastructure | ||
type: object | ||
required: | ||
- flavor | ||
- infraKind | ||
properties: | ||
infraKind: | ||
description: Type of infrastructure for the application. | ||
type: string | ||
example: dockerCompose | ||
enum: | ||
- dockerCompose | ||
flavor: | ||
$ref: '#/components/schemas/Flavor' | ||
additionalStorages: | ||
$ref: '#/components/schemas/AdditionalStorage' | ||
|
||
ContainerResources: | ||
description: Container Infrastructure Definition | ||
type: object | ||
required: | ||
- numCPU | ||
- memory | ||
- storage | ||
- infraKind | ||
properties: | ||
infraKind: | ||
description: Type of infrastructure for the application. | ||
type: string | ||
example: container | ||
enum: | ||
- container | ||
numCPU: | ||
$ref: '#/components/schemas/Vcpu' | ||
memory: | ||
type: integer | ||
example: 10 | ||
description: Memory in giga bytes | ||
storage: | ||
$ref: '#/components/schemas/AdditionalStorage' | ||
gpu: | ||
type: array | ||
description: Number of GPUs | ||
items: | ||
$ref: '#/components/schemas/GpuInfo' | ||
|
||
Ipv4Addr: | ||
type: string | ||
format: ipv4 | ||
|
@@ -1024,33 +1267,23 @@ components: | |
type: integer | ||
description: Port to establish the connection | ||
minimum: 0 | ||
|
||
RequiredResources: | ||
description: | | ||
Fundamental hardware requirements to be provisioned by the | ||
Application Provider. | ||
type: object | ||
required: | ||
- numCPU | ||
- memory | ||
- storage | ||
properties: | ||
numCPU: | ||
type: integer | ||
description: Number of virtual CPUs | ||
example: 1 | ||
memory: | ||
type: integer | ||
example: 10 | ||
description: Memory in giga bytes | ||
storage: | ||
type: integer | ||
example: 60 | ||
description: Storage in giga bytes | ||
gpu: | ||
type: array | ||
description: Number of GPUs | ||
items: | ||
$ref: '#/components/schemas/GpuInfo' | ||
oneOf: | ||
- $ref: "#/components/schemas/KubernetesResources" | ||
- $ref: "#/components/schemas/VmResources" | ||
- $ref: "#/components/schemas/ContainerResources" | ||
- $ref: "#/components/schemas/DockerComposeResources" | ||
discriminator: | ||
propertyName: infraKind | ||
mapping: | ||
kubernetes: "#/components/schemas/KubernetesResources" | ||
virtualMachine: "#/components/schemas/VmResources" | ||
container: "#/components/schemas/ContainerResources" | ||
dockerCompose: "#/components/schemas/DockerComposeResources" | ||
|
||
SubmittedApp: | ||
description: Information about the submitted app | ||
|
How is uniqueness defined here? Are edgeCloudZoneId and edgeCloudZoneName each unique, or only their combination?
SimpleEdgeDiscovery has:
edgeCloudZoneId is a UUID for the Edge Cloud Zone.
edgeCloudZoneName is the common name of the closest Edge Cloud Zone to the user device.
edgeCloudProvider is the name of the operator or cloud provider of the Edge Cloud Zone.
So, edgeCloudZoneId is expected to be unique, e.g. a namespaced URN.
I would like to point out that if edgeCloudZoneName (or edgeCloudZoneName + edgeCloudProvider) cannot uniquely identify the zone, i.e. only the UUID can uniquely identify the zone, then we won't be able to support a declarative API. That's probably ok if the API is mainly going to be accessed via a GUI by a human, but if we want to support automation and infra-as-code via yaml files, it would be much nicer to be able to have a declarative API.
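The declarative concern can be sketched as follows: a YAML file written by hand can only name a zone by `edgeCloudZoneName` and `edgeCloudProvider`, so a tool has to resolve that pair to exactly one `edgeCloudZoneId`. The function name `find_zone_id` and the zone data below are hypothetical; only the three property names come from the schema.

```python
from typing import Dict, List


def find_zone_id(zones: List[Dict[str, str]], name: str, provider: str) -> str:
    """Resolve (edgeCloudZoneName, edgeCloudProvider) to a single edgeCloudZoneId."""
    matches = [
        zone["edgeCloudZoneId"]
        for zone in zones
        if zone["edgeCloudZoneName"] == name and zone["edgeCloudProvider"] == provider
    ]
    if len(matches) != 1:
        # Zero or several matches: the name/provider pair cannot be used declaratively.
        raise LookupError(f"{len(matches)} zones match {name!r} / {provider!r}")
    return matches[0]


# Made-up zones for illustration only.
zones = [
    {"edgeCloudZoneId": "id-1", "edgeCloudZoneName": "zone-a", "edgeCloudProvider": "op-1"},
    {"edgeCloudZoneId": "id-2", "edgeCloudZoneName": "zone-a", "edgeCloudProvider": "op-2"},
]
print(find_zone_id(zones, "zone-a", "op-1"))
```

If the API guaranteed that the name and provider pair is unique, this lookup could never be ambiguous and infra-as-code files would not need to embed UUIDs.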
Would the edgeCloudZoneId be the only required parameter for the edgeCloudZone?