Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cpp to MPI type mapping improvements #3495

Open
wants to merge 36 commits into
base: main
Choose a base branch
from

Conversation

schnellerhase
Copy link
Contributor

@schnellerhase schnellerhase commented Nov 2, 2024

mpi_type is redesigned in favor of a type trait based system that is easier to extend. Especially this no longer requires a function call to obtain the mpi type, the interface change is

dolfinx::MPI::mpi_type<T>() -> dolfinx::MPI::mpi_t<T>

Additionally the mapping is extended to previously non matched types.

Oddly the mapping of the fixed width std types to their MPI equivalent, for example std::int32_t -> MPI_INT32_T seems to break on some platforms. No clue why, because using the standard C type namings works, so are the MPI types not pointing to the same ones?

@schnellerhase schnellerhase marked this pull request as ready for review November 3, 2024 11:04
@jhale
Copy link
Member

jhale commented Nov 5, 2024

Could you be more specific about which platforms the fixed width conversion fails? And what is the compile error?

@jhale
Copy link
Member

jhale commented Nov 5, 2024

uint_8 is unsigned char on some platforms see e.g. https://en.cppreference.com/w/cpp/types/integer leading to the error with the redefinition

@schnellerhase
Copy link
Contributor Author

schnellerhase commented Nov 5, 2024

Yes, just missed that one to revert :)

The errors that you see now in the CI were the ones I discovered. Locally I couldn't reproduce it, but I also didn't check other versions.

long long unsigned int is somehow not matched here.

@schnellerhase
Copy link
Contributor Author

C standard seems to only guarantee at least 64 bit width of unsigned long long not exactly equal, see https://en.cppreference.com/w/cpp/language/types.

However I couldn't find anything hinting a non 64-bit width on these platforms.

@garth-wells
Copy link
Member

For integers, to avoid any ambiguity I'd suggest providing interfaces only for fixed-width integers with the exception of int. We shouldn't (and don't) use long int, long long int, etc, in DOLFINx.

1, MPI_INT32_T, comm0, &sizes_request);
MPI_Ineighbor_alltoall(send_sizes.data(), 1,
dolfinx::MPI::mpi_t<std::int32_t>, recv_sizes.data(),
1, dolfinx::MPI::mpi_t<std::int32_t>, comm0,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced that dolfinx::MPI::mpi_t<std::int32_t> adds anything over MPI_INT32_T. The latter is simpler.

dolfinx::MPI::mpi_t is helpful where we have templated over the type being communicated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not the perfect interface for sure, but it allows us to stick to one type system in the code base.

We could also go for something like dolfinx::MPI::mpi_t<sendsizes::value_type>, which removes the possibility of a type mismatch, but increases code complexity for the caller.

@schnellerhase
Copy link
Contributor Author

For integers, to avoid any ambiguity I'd suggest providing interfaces only for fixed-width integers with the exception of int. We shouldn't (and don't) use long int, long long int, etc, in DOLFINx.

Agreed, if we can get it to work I'd be happy to see the fixed width version used as well. My understanding was always that the fixed width types of std are just replaced with the corresponding C-type on the machine, so if for example unsigned long long is 64 bit, std::uint64_t would just be a different name for it.

Its most certainly caused by this type def https://github.com/FEniCS/dolfinx/blob/main/cpp/dolfinx/graph/partitioners.cpp#L622 - but we need it as Kahip expects that type https://github.com/KaHIP/KaHIP/blob/master/parallel/parallel_src/interface/parhip_interface.h.

Anybody got an idea whats causing this?

@jhale
Copy link
Member

jhale commented Nov 10, 2024

Regarding the kahip interface, does anyone remember why we are we using unsigned long long there?

Edit: I missed the link to the kahip source. Does kahip have a native C++ interface? Seems odd to have to go through a C interface.

@schnellerhase
Copy link
Contributor Author

Don't think there is any other interface, at least it is the only occurrence of ParHIPPartitionKWay in the repo 😄

@schnellerhase
Copy link
Contributor Author

Couldn't get the long long unsigned int to work on all platforms. Switched to the fixed width overloads and made the single occasion of the long long unsigned int for Kahip explicit with MPI_UNSIGNED_LONG_LONG

@jhale
Copy link
Member

jhale commented Nov 11, 2024

This is the right approach if KaHiP is really using long long unsigned underneath. Still odd they don't provide a native C++ call...

Looks good to me. @garth-wells you still had an open review comment about whether it's best just to use the MPI types directly in some cases. Could you comment further on that now?

We should consider fixed width floats when we move to C++23 https://en.cppreference.com/w/cpp/types/floating-point

cpp/dolfinx/common/MPI.h Outdated Show resolved Hide resolved
cpp/dolfinx/common/MPI.h Outdated Show resolved Hide resolved
cpp/dolfinx/common/MPI.h Outdated Show resolved Hide resolved
cpp/dolfinx/common/MPI.h Outdated Show resolved Hide resolved
@IgorBaratta
Copy link
Member

This is the right approach if KaHiP is really using long long unsigned underneath. Still odd they don't provide a native C++ call...

https://github.com/KaHIP/KaHIP/blob/master/parallel/parallel_src/interface/parhip_interface.h#L15
KaHiP uses unsigned long long for its interface and also internally.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants