Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UCP/PROTO: Consider RNDV_PERF_DIFF #10292

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ivankochin
Copy link
Contributor

@ivankochin ivankochin commented Nov 12, 2024

What?

Applies UCX_RNDV_PERF_DIFF setting effect for protov2.

Why?

This control effect was removed during introducing perf-factors logic. This PR brings it back.

That how it looks like with that patch:
image

@ivankochin
Copy link
Contributor Author

@yosefe @brminich currently patch improves all RNDV protocols, but maybe it worth to think about applying this change to RNDV-get/put only since there can be cases when RMA is unsupported for some reason and on big sizes there would be comparison between Eager and RNDV through AM and BW of latter would be slightly higher due to RNDV_PERF_DIFF. WDYT?

@ivankochin ivankochin self-assigned this Nov 12, 2024
@brminich
Copy link
Contributor

@yosefe @brminich currently patch improves all RNDV protocols, but maybe it worth to think about applying this change to RNDV-get/put only since there can be cases when RMA is unsupported for some reason and on big sizes there would be comparison between Eager and RNDV through AM and BW of latter would be slightly higher due to RNDV_PERF_DIFF. WDYT?

I’d leave it for all protocols for simplicity. Additionally, we have PPLN protocols, and even AM-based RNDV may be preferable to eager due to its lower memory consumption.

@ivankochin
Copy link
Contributor Author

@yosefe @brminich currently patch improves all RNDV protocols, but maybe it worth to think about applying this change to RNDV-get/put only since there can be cases when RMA is unsupported for some reason and on big sizes there would be comparison between Eager and RNDV through AM and BW of latter would be slightly higher due to RNDV_PERF_DIFF. WDYT?

I’d leave it for all protocols for simplicity. Additionally, we have PPLN protocols, and even AM-based RNDV may be preferable to eager due to its lower memory consumption.

I thought AM-based RNDV and PPLN protocols consume same amount of memory as eager, aren't they? Do AM-based protocols send message directly to user-provided buffer?

@brminich
Copy link
Contributor

@yosefe @brminich currently patch improves all RNDV protocols, but maybe it worth to think about applying this change to RNDV-get/put only since there can be cases when RMA is unsupported for some reason and on big sizes there would be comparison between Eager and RNDV through AM and BW of latter would be slightly higher due to RNDV_PERF_DIFF. WDYT?

I’d leave it for all protocols for simplicity. Additionally, we have PPLN protocols, and even AM-based RNDV may be preferable to eager due to its lower memory consumption.

I thought AM-based RNDV and PPLN protocols consume same amount of memory as eager, aren't they? Do AM-based protocols send message directly to user-provided buffer?

In case of tag matching API eager messages are queued on the receiver until receive operation is invoked. Such eager fragments may consume significant amount of memory on the receiver. With RNDV data transfer is started only when receiver is ready

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants