Prevent tf message filter from preempting the oldest message while it is waiting for transforms. #544
I ran across an issue where the tf message filter drops all incoming messages if the wait time needed for the tf transform is longer than the time between incoming messages. When a new message comes in and the queue is already full of messages waiting for transforms, the oldest message is removed, preempting it. This is especially apparent when the queue size is 1.
So, if the rate of the subscribed messages is high, like 50 Hz, the effective maximum wait time is only 20 ms, even if the tf timeout in the message filter is set higher.
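As a back-of-the-envelope sketch of that arithmetic (the numbers and variable names below are illustrative, not taken from the library):

```cpp
#include <iostream>

int main()
{
  const double message_rate_hz = 50.0;  // incoming message rate
  const unsigned queue_size = 1;        // message filter queue size
  const double tf_timeout_s = 0.1;      // configured tf timeout (100 ms)

  // With a full queue, a waiting message is preempted as soon as
  // queue_size newer messages arrive, so the longest it can actually
  // wait is roughly queue_size / message_rate, regardless of the timeout.
  const double effective_wait_s = queue_size / message_rate_hz;  // 0.02 s
  std::cout << "effective wait: " << effective_wait_s * 1e3 << " ms, "
            << "configured timeout: " << tf_timeout_s * 1e3 << " ms\n";
  // If the transform latency exceeds the effective wait, every message
  // is dropped before its transform becomes available.
}
```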
Increasing the queue size can mitigate this, but there are cases where we want to be able to set the queue size very low or even to 1.
This PR tweaks the logic of the message filter queue so that it no longer preempts the oldest message waiting for a transform (the one most likely to succeed first), but instead preempts and drops the second-oldest. If there is no second-oldest because the queue size is 1, the incoming message is dropped instead of being added to the queue.
This modification at least ensures that some messages get through, as long as the time needed for the transform to become available is less than the configured timeout.
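A minimal sketch of the changed preemption rule, assuming a deque-backed queue (the names `PreemptionSketch`, `MessageInfo`, `queue_`, `queue_size_`, and `drop()` are hypothetical stand-ins for the filter's real internals, not the actual implementation):

```cpp
#include <cstddef>
#include <deque>

// Sketch only: illustrates which entry gets preempted when the queue is full.
template <typename MessageInfo>
class PreemptionSketch
{
public:
  explicit PreemptionSketch(std::size_t queue_size) : queue_size_(queue_size) {}

  void add(const MessageInfo & msg)
  {
    if (queue_.size() + 1 > queue_size_)
    {
      if (queue_.size() >= 2)
      {
        // Keep the oldest message (the one most likely to get its
        // transform next) and preempt the second-oldest instead.
        drop(queue_[1]);
        queue_.erase(queue_.begin() + 1);
      }
      else
      {
        // Queue size of 1: the oldest message keeps waiting and the
        // incoming message is dropped instead of being enqueued.
        drop(msg);
        return;
      }
    }
    queue_.push_back(msg);
  }

private:
  void drop(const MessageInfo & /*msg*/)
  {
    // Signal the drop to any failure callbacks.
  }

  std::deque<MessageInfo> queue_;
  std::size_t queue_size_;
};
```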
This was something that came up while using slam_toolbox on a specific robot platform whose tf transforms had a small amount of latency: SteveMacenski/slam_toolbox#516 @SteveMacenski
Any thoughts on this?