Using the parcelport TCP causes the memory usage to continuously increase, and I'm not sure if it is a memory leak #6574

Open
phil-skillwon opened this issue Nov 17, 2024 · 1 comment

Recently, I’ve been validating the feasibility of using HPX as the foundational framework for our team's signal processing algorithm development. However, during testing, I noticed what seems to be a memory leak issue with HPX.

So, I wrote a separate test program using the TCP parcelport to test data interaction across multiple nodes and discovered what looked like a memory leak. But I’m not entirely sure, so I’m seeking your help here.

I used two nodes, running on two different hosts (Ubuntu 22.04 LTS), and the test code is as follows:

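#include <hpx/hpx.hpp>
#include <hpx/hpx_init.hpp>

#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

using namespace std::chrono_literals;    // for the 1400us literal below

// Returns sz bytes of 0xFF; invoked remotely through GetDataAction.
static std::vector<std::byte> getData(const std::size_t sz)
{
    return std::vector<std::byte>(sz, std::byte{0xFF});
}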

HPX_PLAIN_ACTION(getData, GetDataAction);

int hpx_main(int argc, char* argv[])
{
    hpx::error_code ec = hpx::make_success_code();
    std::vector<hpx::id_type> localities = hpx::find_all_localities(ec);
    if (hpx::error::success != ec.value()) 
    {
        printf("find_all_localities failed: %s\n", ec.get_message().c_str());
        return -1;
    }

    if (localities.size() < 2) 
    {
        printf("this program requires at least 2 localities\n");
        return -2;
    }

    printf("num of localities: %zu\n", localities.size());
    for (const auto& loc : localities) 
    {
        hpx::naming::gid_type gid = loc.get_gid();
        std::string address = hpx::get_locality_name(loc).get();
        std::uint32_t localityId = hpx::naming::get_locality_id_from_gid(gid);

        printf("locality id: %u\n", localityId);
        printf("locality name: %s, id: %08X\n", address.c_str(), localityId);
    }

    size_t dataSize = 960256;
    while (true) 
    {
        hpx::this_thread::sleep_for(1400us);

        auto dataNode0 = hpx::sync<GetDataAction>(localities[0], dataSize);
        auto dataNode1 = hpx::sync<GetDataAction>(localities[1], dataSize);

        printf("node0, data size: %zu, node1, data size: %zu\n", dataNode0.size(), dataNode1.size());
    }
    
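    return hpx::finalize();
}

int main(int argc, char* argv[])
{
    // Start the HPX runtime; by default it runs hpx_main on the root locality.
    return hpx::init(argc, argv);
}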

Node 0 is the root node. I observed the memory usage on both Node 0 and Node 1. Both hosts have 8GB of physical memory.

When the test program started, the memory usage on both nodes was about 0.4%. But after 1 hour, the memory usage on Node 0 increased to 0.7%, while the memory usage on Node 1 remained at 0.4%.

After about 24 hours, the memory usage on Node 0 reached 1.9%, while the memory usage on Node 1 remained at 0.4%. This looks like a memory leak.
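The percentages above are coarse, so it may help to log the resident set size from inside the test loop and see how many bytes are gained per iteration. The following is a minimal sketch for Linux only, assuming the `VmRSS:` line of `/proc/self/status` is an acceptable measure. The helper names `parse_vm_rss_kib` and `current_rss_kib` are hypothetical and not part of HPX.

```cpp
#include <cstddef>
#include <fstream>
#include <istream>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical helper: extract the resident set size (in KiB) from text in
// the /proc/<pid>/status format. Returns std::nullopt if no VmRSS line exists
// or its value cannot be parsed.
std::optional<std::size_t> parse_vm_rss_kib(std::istream& in)
{
    const std::string key = "VmRSS:";
    std::string line;
    while (std::getline(in, line))
    {
        if (line.compare(0, key.size(), key) == 0)
        {
            // The rest of the line looks like "\t   12345 kB".
            std::istringstream fields(line.substr(key.size()));
            std::size_t kib = 0;
            if (fields >> kib)
                return kib;
            return std::nullopt;
        }
    }
    return std::nullopt;
}

// Hypothetical helper: resident set size of the calling process on Linux.
std::optional<std::size_t> current_rss_kib()
{
    std::ifstream status("/proc/self/status");
    if (!status)
        return std::nullopt;
    return parse_vm_rss_kib(status);
}
```

Calling `current_rss_kib()` every few thousand iterations of the `while (true)` loop and printing the value next to an iteration counter would show whether the growth is steady per call or happens in steps. Steps would point more toward allocator or buffer caching than toward a per-message leak.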

Later, I modified the test code to run a single process on one host, and there was no increase in memory usage.

However, my test code is extremely simple, so it’s unlikely that the issue is due to my code. Could you help analyze this problem?

hkaiser (Member) commented Nov 17, 2024

@phil-skillwon Could you compile your test code (and possibly HPX) with -DHPX_WITH_SANITIZERS=On, please? This should report memory leaks, if any. I'd be more than happy to assist in diagnosing and fixing those.
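Before reading a sanitizer run, it can be worth confirming that the test binary really was instrumented. The following is a minimal sketch, assuming GCC or Clang. It relies on the `__SANITIZE_ADDRESS__` macro (GCC) and `__has_feature(address_sanitizer)` (Clang). Whether `-DHPX_WITH_SANITIZERS=On` adds `-fsanitize=address` by itself is not confirmed here, so the check makes no assumption about that. The names `BUILT_WITH_ASAN`, `built_with_asan` and `sanitizer_status` are hypothetical.

```cpp
#include <string>

// Detect AddressSanitizer instrumentation at compile time.
#if defined(__SANITIZE_ADDRESS__)    // defined by GCC (and recent Clang)
#define BUILT_WITH_ASAN 1
#elif defined(__has_feature)         // Clang feature test
#if __has_feature(address_sanitizer)
#define BUILT_WITH_ASAN 1
#endif
#endif

#ifndef BUILT_WITH_ASAN
#define BUILT_WITH_ASAN 0
#endif

// True if this translation unit was compiled with -fsanitize=address.
constexpr bool built_with_asan() noexcept
{
    return BUILT_WITH_ASAN != 0;
}

// Human-readable status line, e.g. for printing at the top of hpx_main.
std::string sanitizer_status()
{
    return built_with_asan() ? "AddressSanitizer: enabled"
                             : "AddressSanitizer: not enabled";
}
```

Leak reports from the sanitizer are produced when the process exits, so the `while (true)` loop in the test program would need to be bounded (for example, a fixed iteration count) so that `hpx::finalize()` is reached and the program terminates normally.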
