-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
QAT: update readme for QAT #110
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few suggestions. I also noticed https://github.com/intel/intel-technology-enabling-for-openshift/blob/main/device_plugins/qat_device_plugin.yaml is not quite OK. maxNumDevices
should be 128 and perhaps c6xxxvf
can be dropped.
device_plugins/deploy_qat.md
Outdated
|
||
## Prerequisite | ||
Before deploying the QAT plugin, make sure iommu MCO is deployed and iommu is enabled. | ||
By default, half of the virtual functions are used for compression and the other half is used for cryptography. This default behaviour can be customised via a config map which configures the hardware using sysfs. The config values could be `asym`, `sym` and `dc`, for more details see https://github.com/torvalds/linux/blob/master/Documentation/ABI/testing/sysfs-driver-qat |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pointing to the latest sysfs-driver-qat
is not accurate because the Linux 6.5 features are not available. Only sym;asym
and dc
should be valid on OCP today.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
device_plugins/deploy_qat.md
Outdated
``` | ||
oc create configmap --namespace=openshift-operators --from-literal "qat.conf=ServicesEnabled=<option>" qat-config | ||
``` | ||
The option can be any combination of values described above |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see my previous comment. this is also missing how to setup "per node" settings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added
device_plugins/deploy_qat.md
Outdated
``` | ||
|
||
## verify QAT resource is available | ||
QAT has two kinds of resource `qat.intel.com/cy` for cryptography and `qat.intel.com/dc` for compression. If the plugin deployment is a success, the resources can be seen using the commands below. Use /dc is compression is configured. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this section be written in a way that cy
and dc
are not explicitly mentioned. We might have to change these in the future when the 6.5 features become available. Maybe just say "look for qat.intel.com
resources for compression and crypto"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
modified
machine_configuration/README.md
Outdated
|
||
```$ oc apply -f https://github.com/intel/intel-technology-enabling-for-openshift/blob/main/machine_configuration/100-intel-qat-intel-iommu-on.yaml``` | ||
|
||
Note: This will reboot the worker nodes in the Intel QAT machine config pool one by one to turn on intel_iommu kernel parameter. Rebooting the node is not preferred A better solution is in the plan. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: This will reboot the worker nodes in the Intel QAT machine config pool one by one to turn on intel_iommu kernel parameter. Rebooting the node is not preferred A better solution is in the plan. | |
Note: This will reboot the worker nodes in the Intel QAT machine config pool one by one to turn on intel_iommu kernel parameter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added the suggested change
|
||
For QAT, intel_iommu is a kernel parameter that needs to be turned on using the following command. | ||
|
||
```$ oc apply -f https://github.com/intel/intel-technology-enabling-for-openshift/blob/main/machine_configuration/100-intel-qat-intel-iommu-on.yaml``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this could also add loading vfio-pci
with the desired ids
as parameters since we have to do the booting anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it possible for us to add the ids as the parameter for the container and set it through the operator? So we don't have to hard code the ids in in the init-container? :-)
Hi Mikko, thanks for your review and suggestions. I presume Chaitanya used the sample here for qat device plugin reference. Are there any implications of setting |
|
Got it, thanks @mythi! |
Addressed this on a separate PR: #111 |
@mythi I think we should have some kind of default value for this to use all the available number of VFs. The end user should not have to know and handle this HW detail. |
device_plugins/deploy_qat.md
Outdated
For more details see https://intel.github.io/quickassist | ||
|
||
## Prerequisite | ||
Before deploying the QAT plugin, make sure iommu MCO is deployed and iommu is enabled. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to work on the PRs in MCO to enable this. and also updated the related readme. And then we can point users to the readme.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you referring to enabling IOMMU without a reboot? I've not seen that being possible but it's worth to check.
device_plugins/deploy_qat.md
Outdated
|
||
## Prerequisite | ||
Before deploying the QAT plugin, make sure iommu MCO is deployed and iommu is enabled. | ||
By default, half of the virtual functions are used for compression and the other half is used for cryptography. This default behaviour can be customised via a config map which configures the hardware using sysfs. The config values could be `asym`, `sym` and `dc`, for more details see [cfg_services](https://github.com/torvalds/linux/blob/42e66b1cc3a070671001f8a1e933a80818a192bf/Documentation/ABI/testing/sysfs-driver-qat) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not be the Prerequisite. I think it should be the feature Gap, I suggest to have an individual section in this document to introduce customers to the detail. And the link to this issue should be added in the section. And above link is good for the stand-alone RHEL server. But for OCP, the ender user should not directly operate the individual node.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not be the Prerequisite. I think it should be the feature Gap,
Which part you are referring to?
device_plugins/deploy_qat.md
Outdated
The option can be any combination of values described above. | ||
When using the operator for deploying the plugin with provisioning config, use provisioningConfig field for the name of the ConfigMap, then the config is passed to initcontainer through the volume mount. | ||
|
||
There's also a possibility for a node specific congfiguration through passing a nodename via NODE_NAME into initcontainer's environment and passing a node specific profile (qat-$NODE_NAME.conf) via ConfigMap volume mount. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am thinking about whether we should expose the configmap detail to the OCP customers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's totally fine to have it. The default setting without the configmap is also fine but if the cluster admin wants to opt-in for a custom per-node configuration, they can do so with the configmap and by setting the name of the confimap in QatDevicePlugin
custom resource.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to put it into some kind of Advanced Configuration Section in the readme.
@@ -87,6 +87,23 @@ $ lsmod | grep ast | |||
``` | |||
Ensure that ast driver is not loaded. | |||
|
|||
# Machine Configuration for Provisioning Intel QAT | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's move this to MCO. And I think we should remove the QAT machine config pool for simplicity purposes.
|
||
For QAT, intel_iommu is a kernel parameter that needs to be turned on using the following command. | ||
|
||
```$ oc apply -f https://github.com/intel/intel-technology-enabling-for-openshift/blob/main/machine_configuration/100-intel-qat-intel-iommu-on.yaml``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it possible for us to add the ids as the parameter for the container and set it through the operator? So we don't have to hard code the ids in in the init-container? :-)
The PCI IDS of the devices enabled by the cluster admin are passed to the initcontainer so you can have custom logic based on that. |
This flag is for purpose: QAT can also be used for "in kernel" functions through the Linux Kernel Crypto Framework. with
|
|
|
no, that is different case. |
it's not always that simple. There's VFs/PF but there's also how many PFs are available. I understand what you are saying but we'll go with this for now. |
I think for this release it is good enough. Just hope we can optimize it in the future. We can file another issue and track it. :-) |
Yes, future is fine when we also get to hear end-user feedback. |
@mregmi, the machine config readme documentation needs to be updated since we are now not using QAT custom machine config pool. Also, please add a section about the new machine config to load vfio-pci. Thanks! |
@chaitanya1731, can you review the device plugins readme. Thanks! |
Signed-off-by: Manish Regmi <[email protected]>
b1b2a57
to
716a359
Compare
some other PRS will be submitted to replace this PR. So close this one. |
No description provided.