Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UCT/GGA: rkey resolution for many MDs #10236

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

evgeny-leksikov
Copy link
Contributor

What

rkey resolution for many MDs in GGA transport

Why ?

#9993 (comment)

How ?

  1. rkey has cached part for quick resolution on fast path
  2. rkey has common part independent on MD
  3. on usage on new MD, resolution mechanism allocates MD part and adds it to md->hash
  4. also there is allocation of a helper structure which links MD to rkey and used in md_close to cleanup rkeys

@evgeny-leksikov evgeny-leksikov added the WIP-DNM Work in progress / Do not review label Oct 17, 2024
@evgeny-leksikov
Copy link
Contributor Author

/azp run

Copy link

Azure Pipelines successfully started running 4 pipeline(s).

@evgeny-leksikov evgeny-leksikov removed the WIP-DNM Work in progress / Do not review label Oct 21, 2024
src/uct/ib/mlx5/gga/gga_mlx5.c Outdated Show resolved Hide resolved
md_attr->rkey_packed_size = md_attr->exported_mkey_packed_size;
return UCS_OK;
}
static pthread_mutex_t uct_gga_mlx5_rkeys_locker = PTHREAD_MUTEX_INITIALIZER;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe use "_lock" or "_mutex" to conform with the codebase
Maybe store it in the component uct_gga_component?

Copy link
Contributor Author

@evgeny-leksikov evgeny-leksikov Nov 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe store it in the component uct_gga_component?

I also thought about it, but don't found much benefit, just more code related to new type/struct


*rkey_p = (uintptr_t)rkey_handle;
*handle_p = NULL;
return UCS_OK;
}

static ucs_status_t
uct_gga_mlx5_md_put_rkey(uct_gga_mlx5_md_t *md,
const uct_gga_mlx5_rkey_hash_key_t key,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: since key is primitive const is redundant

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's more for readability and clean API + uct_gga_mlx5_rkey_hash_key_t might be changed in future, so this place won't require changes


iter = kh_put(resolved_rkeys, &md->rkey_cache, key, &ret);
if (ret == UCS_KH_PUT_FAILED) {
ucs_error("md %p: failed to add rkey to hash", md);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here and in ucs_assert I propose to print key value

src/uct/ib/mlx5/gga/gga_mlx5.c Show resolved Hide resolved
Comment on lines +180 to +181
return ucs_unlikely(iter == kh_end(&md->rkey_cache)) ? NULL :
kh_val(&md->rkey_cache, iter);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe:

Suggested change
return ucs_unlikely(iter == kh_end(&md->rkey_cache)) ? NULL :
kh_val(&md->rkey_cache, iter);
return ucs_unlikely(iter == kh_end(&md->rkey_cache)) ?
NULL : kh_val(&md->rkey_cache, iter);

@@ -192,57 +280,124 @@ static uct_component_t uct_gga_component = {
};

static UCS_F_ALWAYS_INLINE
void uct_gga_mlx5_rkey_trace(uct_ib_mlx5_md_t *md,
void uct_gga_mlx5_rkey_trace(uct_gga_mlx5_md_t *md,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: void can be on previous line like in other places

}

attach_params.field_mask = 0;
pthread_mutex_lock(&uct_gga_mlx5_rkeys_lock);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to use the same mutex which is protecting the hash map?
If it's not necessary, then we may consider having static local mutex protecting only mem_attach call, for better scalability

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants