Pros and Cons of Supporting Remote Direct Memory Access (RDMA) Transport in fbthrift #442

Open
DwyaneShi opened this issue Aug 4, 2021 · 0 comments

@DwyaneShi

Because it offers ultra-low latency and high throughput, RDMA is becoming an increasingly important technology in modern data centers. Amazon Web Services (AWS) introduced an InfiniBand-like network adapter, the Elastic Fabric Adapter (EFA), to accelerate HPC and DL applications running on AWS. Microsoft Azure and Oracle Cloud have adopted commodity RDMA-capable NICs (RNICs) in their clouds to stay competitive. Many companies, including Facebook, use RDMA (e.g., GPUDirect RDMA) to accelerate distributed training. However, few industrial-grade RPC frameworks exploit the performance potential of RDMA. I am therefore curious about the pros and cons of using RDMA as a fast communication transport in fbthrift. Is it worthwhile to enable it in fbthrift? Is anyone already working on it? What challenges would someone face in contributing code to fbthrift to support RDMA?
