Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UCP/PROTO: Consider FAST_CMPL flag only for eager protocols #10277

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
23 changes: 21 additions & 2 deletions src/ucp/rndv/proto_rndv.c
Original file line number Diff line number Diff line change
Expand Up @@ -261,6 +261,8 @@ static ucp_proto_select_param_t ucp_proto_rndv_remote_select_param_init(
ucp_memory_info_t mem_info;
uint32_t op_attr_mask;

/* FAST_CMPL flag shouldn't be considered on remote side since it
* affects only one stage protocols (like eager) */
op_attr_mask = ucp_proto_select_op_attr_unpack(select_param->op_attr) &
UCP_OP_ATTR_FLAG_MULTI_SEND;
/* Construct select parameter for the remote protocol */
Expand Down Expand Up @@ -617,6 +619,18 @@ ucp_proto_rndv_ack_init(const ucp_proto_common_init_params_t *init_params,
UCS_LINEAR_FUNC_ZERO, perf_p);
}

static void
ucp_proto_rndv_report_fast_cmpl(const ucp_proto_multi_init_params_t *params)
{
uint32_t op_attr_mask = ucp_proto_select_op_attr_unpack(
params->super.super.select_param->op_attr);

if ((op_attr_mask & UCP_OP_ATTR_FLAG_FAST_CMPL)) {
ucs_warn("RNDV is initialized with FAST_CMPL which is unexpected. "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe ucs_debug since fast_cmpl is best effort and UCP users should (generally) not even be aware that a multi step protocol is used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That warning means that user passed FAST_CMPL flag to recv operation and in case of choosing RNDV it can cause undesired performance prediction change. So I would prefer to leave it as warning so users can see that and remove FAST_CMPL from recv operation.

"Check whether receive operation was called with that flag.");
}
}

ucs_status_t
ucp_proto_rndv_bulk_init(const ucp_proto_multi_init_params_t *init_params,
const char *op_name, const char *ack_name,
Expand All @@ -630,6 +644,8 @@ ucp_proto_rndv_bulk_init(const ucp_proto_multi_init_params_t *init_params,
const char *proto_name;
ucs_status_t status;

ucp_proto_rndv_report_fast_cmpl(init_params);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems we remove FAST_CMPL when we handle RTR. So in which case we sould still have FAST_CMPL here?
maybe it should be an assertion?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIU that can happen if customer set FAST_CMPL flag for receive operation. In that case RNDV-get would be initialized with FAST_CMPL flag

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so maybe make the warning more clear, or move it to tag_recv_nbx, or mask-out FAST_CMPL flag during tag_recv?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can move this warning to tag_recv_nbx (or maybe ucp_proto_rndv_receive_start) but then it would be extra operation on the fast path. And if we want to print a waring/debug message in that case it would cost extra if on fast path. Are we OK with that?


status = ucp_proto_multi_init(init_params, op_name, &bulk_perf, mpriv);
if (status != UCS_OK) {
return status;
Expand Down Expand Up @@ -925,8 +941,11 @@ ucp_proto_rndv_handle_rtr(void *arg, void *data, size_t length, unsigned flags)
/* RTR covers the whole send request - use the send request directly */
ucs_assert(req->flags & UCP_REQUEST_FLAG_PROTO_INITIALIZED);

select_param = &req->send.proto_config->select_param;
op_attr_mask = ucp_proto_select_op_attr_unpack(select_param->op_attr);
select_param = &req->send.proto_config->select_param;
/* Remove FAST_CMPL since it should be considered only for one stage
* protocols (like eager) */
op_attr_mask = ucp_proto_select_op_attr_unpack(select_param->op_attr) &
~UCP_OP_ATTR_FLAG_FAST_CMPL;

if (rtr->size == req->send.state.dt_iter.length) {
/* RTR covers the whole send request - use the send request directly */
Expand Down
Loading