Utility providers #10346

piotrchmiel · 2024-08-28T14:53:33Z

Is there a description somewhere of how the rxm and rxd utility providers work? The description in the man page (https://ofiwg.github.io/libfabric/v1.22.0/man/fi_rxm.7.html) is too terse for me.

Consider the Verbs provider and the RXM utility provider. How does RXM work with the Verbs provider under the hood? How are QPs (Queue Pairs) handled underneath? Is there a general description of the underlying mechanism? I would like to understand how it works so I can build an intuition for its impact on performance.
ibv_devices shows me the device mlx5_0. From the libfabric perspective, I can use mlx5_0 with the RXM utility provider in RDM mode, or mlx5_0_dgram with the RXD utility provider. What are the specific differences in behavior between these two cases? Should I use mlx5_0 or mlx5_0_dgram?

aingerson (Collaborator) · 2024-09-03T17:49:44Z

I gave a presentation (video) about 5 years ago with a general overview of the utility providers (rxm, rxd, shm). It is a bit out of date and doesn't go into the hardware-specific details you're asking about, like queue pairs, mlx, etc., but it might help a little.

The big benefit of using mlx5_0_dgram with rxd would show up in large-scale workloads. The idea is that you would use mlx5_0 with rxm up to a certain job size, until resources get strained, and then switch to mlx5_0_dgram with rxd. Right now, the ability to switch internally doesn't exist in OFI, but we're looking at adding it as part of the peer provider enhancements, which we're currently using to target intranode offload for shm and which could be expanded to integrate rxm + rxd + shm all together. As it stands, however, there is no real-world implementation of this, and rxd is not optimized (or maintained) well enough to perform well. I would recommend just sticking with mlx5_0 + rxm, as we have tested that for fairly large-scale jobs without hitting the limitation, but stay tuned for rxd improvements and offload!
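For a rough feel of why job size is what would eventually favor rxd, here is a back-of-the-envelope sketch. It rests on assumptions not stated in this thread: that rxm keeps one connected endpoint (on verbs, one reliable-connected QP) per peer a rank talks to, while rxd shares a single datagram endpoint across all peers. The function names and the all-to-all traffic pattern are hypothetical, and real numbers depend on how many peers each rank actually communicates with.

```python
# Illustrative model only; the per-peer / single-endpoint costs are assumptions.

def rxm_connections_per_rank(num_ranks: int) -> int:
    """Connected endpoints one rank needs to reach every other rank
    (assumed: one connection per peer)."""
    return max(num_ranks - 1, 0)


def rxd_endpoints_per_rank(num_ranks: int) -> int:
    """Datagram endpoints one rank needs (assumed: one shared endpoint)."""
    return 1 if num_ranks > 0 else 0


def job_wide_rxm_connection_ends(num_ranks: int) -> int:
    """Connection endpoints summed over the whole job for all-to-all traffic."""
    return num_ranks * rxm_connections_per_rank(num_ranks)


if __name__ == "__main__":
    for n in (16, 1024, 65536):
        print(
            f"ranks={n}: rxm per rank={rxm_connections_per_rank(n)}, "
            f"rxd per rank={rxd_endpoints_per_rank(n)}, "
            f"rxm job-wide={job_wide_rxm_connection_ends(n)}"
        )
```

Under this model the per-rank cost with rxm grows linearly with job size while rxd stays flat, which is one way to read "until the resources get strained". It says nothing about per-message performance, where the recommendation above still favors mlx5_0 + rxm.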
