Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simulated clock subscriber is built for each node #2659

Open
roncapat opened this issue Oct 30, 2024 · 6 comments
Open

Simulated clock subscriber is built for each node #2659

roncapat opened this issue Oct 30, 2024 · 6 comments
Labels
enhancement New feature or request

Comments

@roncapat
Copy link

roncapat commented Oct 30, 2024

While profiling a big application of mine, made by static composition of 20+ nodes in the same executable, enabling use_sim_time makes 20+ threads spawn, each one with a clock subscriber.

clock_executor_thread_ = std::thread(
[this]() {
auto future = cancel_clock_executor_promise_.get_future();
clock_executor_->add_callback_group(clock_callback_group_, node_base_);
clock_executor_->spin_until_future_complete(future);
}
);

Why wouldn't be possible to share among all nodes the clock subscription / the time source? This would bring down the count of clock subscriber threads to just one.

@fujitatomoya
Copy link
Collaborator

Currently TimeSource is managed under Node object, so that clock subscription entity can be created to support simulation time for each Node object. (some nodes use simulation time, some do not.) besides, different QoS for the clock subscription can be set based on node's requirement. Maybe using composition would be different story, but having TimeSource shared between Node objects would generate the coupling issue, that means if the node sharing TimeSource goes offline, all of the other nodes losing clock time source. to be honest, i am not sure if this is good design.

btw, we can disable the clock thread via NodeOption.

/// Set the use_clock_thread flag, return this for parameter idiom.
/**
* If true, a dedicated thread will be used to subscribe to "/clock" topic.
*/
RCLCPP_PUBLIC
NodeOptions &
use_clock_thread(bool use_clock_thread);

@roncapat
Copy link
Author

roncapat commented Oct 31, 2024

@fujitatomoya thank you for your feedback.

I have a couple more questions though...

  • If a node "fails", in a composed application, wouldn't the whole application fail too?
  • Wasn't instead the Intra-Process manager shared across all nodes?
  • Finally, if I disable the clock thread, depending on the executor I choose for the node, may the performance be impacted?

About the possibility of having a mixed configuration, with some nodes using sim time and others not, I think it is totally doable that the ones with sim time enabled share the clock sub if they are in the same process space, while the others that take system time don't.

@fujitatomoya
Copy link
Collaborator

If a node "fails", in a composed application, wouldn't the whole application fail too?

you mean more like crashed by fails? if so, yes. composed nodes are in the same process space, if the process goes down because of one of the node crashes or coredumps, everyone in the same process space will be gone.

Wasn't instead the Intra-Process manager shared across all nodes?

I do not think IntraProcessManager object is shared between nodes, but data communication passes the the virtual address in the process space. so memory is shared within the same process map. (this is still maintained under experimental namespace.)

Finally, if I disable the clock thread, depending on the executor I choose for the node, may the performance be impacted?

this is good question. and yes. if clock thread is disabled, you need to spin this node with executor to update the simulation clock data.

I think it is totally doable that the ones with sim time enabled share the clock sub if they are in the same process space, while the others that take system time don't.

I am not against this, we always can explore the possibility for better user experience. but i am just not sure enough how we can design this in detail. for example, there could be multiple contexts with different domain, in that case we just cannot bind the process global time source to every nodes in the process. and maybe i am missing more here...

@clalancette
Copy link
Contributor

I do not think IntraProcessManager object is shared between nodes,

It is. There is one IntraProcessManager per Context, which is how we end up being able to manage all of this between Nodes (you cannot do intra-process between contexts, which also implies you cannot do intra-process between different domain IDs).

I am not against this, we always can explore the possibility for better user experience. but i am just not sure enough how we can design this in detail. for example, there could be multiple contexts with different domain, in that case we just cannot bind the process global time source to every nodes in the process. and maybe i am missing more here...

I don't think we should try to design this sharing around multiple contexts; that is too hard. We could consider having all Nodes within a single Context share a clock thread, i.e. the clock thread would live in the Context (much like the IntraProcessManager). Then the Nodes in that Context could pull the time out of the Context. However, the problem with that design is that the Context has no notion whatsoever about subscriptions, so something would need to "feed" that Clock thread with data. That could be one of the Nodes, but which one? And what if that one goes away? (there may be other problems with that design as well)

@mjcarroll
Copy link
Member

We discussed this in the triage meeting yesterday. The suggested solution would be to add a way to pass a TimeSource via the NodeOptions to allow users to override the internal timesource on a per-node basis.

For our generic composition containers, we could add a global flag such that all components either share or don't share a single time source.

For more complicated layouts (some nodes using sime, but not all), then these could be manage via manual composition like you are currently doing.

Would this be a feature that you are interested in adding for the rolling/kilted release?

@mjcarroll mjcarroll added the enhancement New feature or request label Nov 8, 2024
@roncapat
Copy link
Author

roncapat commented Nov 8, 2024

The manual override solution via NodeParameters is something I would like very much.

I further propose to split implementation in two subsequent steps/smaller PRs:

  • allow to pass TimeSource via NodeParameters
  • Make standard containers optionally share TimeSource as you are proposing

In this moment, I do not know if I may have available bandwidth to implement this; maybe I can start at least to prototype the first step and share with you the draft PR for further discussion if you are ok with this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants