Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mark bags that are still being recorded #1597

Open
tonynajjar opened this issue Mar 27, 2024 · 23 comments · May be fixed by #1672
Open

Mark bags that are still being recorded #1597

tonynajjar opened this issue Mar 27, 2024 · 23 comments · May be fixed by #1672
Labels
enhancement New feature or request

Comments

@tonynajjar
Copy link

tonynajjar commented Mar 27, 2024

Description

In ROS1, a bag that was still being recorded was suffixed with .active, it would be nice if in rosbag2 there was something similar. An example use case is a scripts that uploads bags to a server automatically - that script needs to know not to upload a bag if it's still recording.

@tonynajjar tonynajjar added the enhancement New feature or request label Mar 27, 2024
@defunctzombie
Copy link

@MichaelOrlov @emersonknapp any recollection on why this was omitting from rosbag2 or if there's some gotcha with adding support for this?

@MichaelOrlov
Copy link
Contributor

@defunctzombie @tonynajjar I don't know about any prior discussions about having a different suffix for currently writing file.
I guess it might be some corner cases with consistency in filenames in case if recording is abnormally interrupted and we have put current filename to the metadata and haven't renamed it. We are saving metadata at the beginning as well. Also not sure if file renaming could be easily done from C++ std::filesystem.

Also, I think that one of the reasons we didn't follow ROS 1 behavior is because we handling the matter of concern in the question

that script needs to know not to upload a bag if it's still recording

a better way than in ROS 1.

The one can subscribe to the topic "events/write_split" to get a notification when the current file is finished and ready to be processed by third-party script or program.
Rosbag2 will publish a special message on this topic to notify remote nodes that the newly written file is ready and a new one has been created. The old and new filenames are part of this message in those notification.
Yes, we have such a feature in the rosbag2, and sorry that we don't have documentation for it. Please refer to the following part of the code to better understand how to use it.

split_event_pub_ =
node->create_publisher<rosbag2_interfaces::msg::WriteSplitEvent>("events/write_split", 1);
// Start the thread that will publish events
event_publisher_thread_ = std::thread(&RecorderImpl::event_publisher_thread_main, this);
rosbag2_cpp::bag_events::WriterEventCallbacks callbacks;
callbacks.write_split_callback =
[this](rosbag2_cpp::bag_events::BagSplitInfo & info) {
{
std::lock_guard<std::mutex> lock(event_publisher_thread_mutex_);
bag_split_info_ = info;
write_split_has_occurred_ = true;
}
event_publisher_thread_wake_cv_.notify_all();
};
writer_->add_event_callbacks(callbacks);

and
void RecorderImpl::event_publisher_thread_main()
{
RCLCPP_INFO(node->get_logger(), "Event publisher thread: Starting");
while (!event_publisher_thread_should_exit_.load()) {
std::unique_lock<std::mutex> lock(event_publisher_thread_mutex_);
event_publisher_thread_wake_cv_.wait(
lock,
[this] {return event_publisher_thread_should_wake();});
if (write_split_has_occurred_) {
write_split_has_occurred_ = false;
auto message = rosbag2_interfaces::msg::WriteSplitEvent();
message.closed_file = bag_split_info_.closed_file;
message.opened_file = bag_split_info_.opened_file;
try {
split_event_pub_->publish(message);
} catch (const std::exception & e) {
RCLCPP_ERROR_STREAM(
node->get_logger(),
"Failed to publish message on '/events/write_split' topic. \nError: " << e.what());

The key info there is

       auto message = rosbag2_interfaces::msg::WriteSplitEvent(); 
       message.closed_file = bag_split_info_.closed_file; 
       message.opened_file = bag_split_info_.opened_file; 

@tonynajjar
Copy link
Author

The script in question is the foxglove-agent which is closed-source so unfortunately I don't have the option to adapt it. Either the Foxglove guys would need to do so (@defunctzombie ), or rosbag2 has to introduce a differentiator in the file name as this is the only thing foxglove-agent can ignore.

@defunctzombie
Copy link

The script in question is the foxglove-agent which is closed-source so unfortunately I don't have the option to adapt it. Either the Foxglove guys would need to do so (@defunctzombie ), or rosbag2 has to introduce a differentiator in the file name as this is the only thing foxglove-agent can ignore.

One option for ROS users would be a node which listens for these events and moves the "done" recordings to a different folder that is picked up by systems looking for files to upload.

Any system that could previously "watch" directories for changes now needs to be more "ROS-native" and be able to hook into the pubsub mechanism to listen for such events. Also raises some questions about what happens if such a script is started after recording nodes or re-started. Filesystem level indicators (like file extensions done in ROS1) are in some ways a more robust approach.

@MichaelOrlov
Copy link
Contributor

I need to clarify. In general, I don't mind having a feature when a file that is currently in recording to be named with .active suffix.
However, it shall be done correctly and aligned at least with the current implementation of the bag split notification and metadata writing.
I personally don't have a capacity for this feature, but contributions are welcome.

@Timple
Copy link

Timple commented Mar 28, 2024

Perhaps a simple active file inside the rosbag folder? That get's cleaned up when writing is finished?

@MichaelOrlov
Copy link
Contributor

@Timple

Perhaps a simple active file inside the rosbag folder? That get's cleaned up when writing is finished?

That is not clear to me. We certainly shall not try to copy the finished .active file with a different name and then try to delete the original. We need to use rename in place.

@Timple
Copy link

Timple commented Mar 29, 2024

Rosbags in ROS 2 are actually folders with a meta file and a database right?

We could place an empty file called active alongside those two. This file would be cleaned up when finishing the bagfile.

Might be easier on bash/python like scripts which don't have ROS capability.

@MichaelOrlov
Copy link
Contributor

@Timple Thanks for the clarification.
I think this is a good idea and perhaps we can do it with relatively low effort without affecting other functionality.
As a quick design proposal, we can create .active file right after updating metadata at the end of the writer->open() method

init_metadata();
storage_->update_metadata(metadata_);
}

To get the current filename we can use storage_->get_relative_file_path()
Then will need to try to delete the .active file in close() method
if (storage_) {
auto info = std::make_shared<bag_events::BagSplitInfo>();
info->closed_file = storage_->get_relative_file_path();
storage_.reset(); // Destroy storage before calling WRITE_SPLIT callback to make sure that
// bag file was closed before callback call.
callback_manager_.execute_callbacks(bag_events::BagEvent::WRITE_SPLIT, info);
}

and do delete and create a new .active file in the split_bagfile() method
void SequentialWriter::split_bagfile()
{
auto info = std::make_shared<bag_events::BagSplitInfo>();
info->closed_file = storage_->get_relative_file_path();
switch_to_next_storage();
info->opened_file = storage_->get_relative_file_path();

@defunctzombie
Copy link

IMO it is better to use .active for the file being recorded and do a rename on the file rather than introduce additional files and issues with synchronization. The .active extension was a fantastic property in the ros1 recorder; you immediately knew by looking at the file whether it was closed cleanly or not.

@tonynajjar
Copy link
Author

tonynajjar commented Apr 2, 2024

One option for ROS users would be a node which listens for these events and moves the "done" recordings to a different folder that is picked up by systems looking for files to upload.

@defunctzombie I tried that and while that generally can work, it fails (or requires more implementation effort) in corner cases like when shutting down the system: although rosbag2 will conclude and close the bag being recorded, the script won't receive the WriteSplit event and therefore not move the last bag to the "done" folder.

I'm not saying this can't be solved/workedaround but it has enough complexity to say that that's not something any ROS user can simply do -> better solve this at rosbag2 or foxglove-agent's level

@tonynajjar
Copy link
Author

tonynajjar commented Apr 11, 2024

Perhaps a simple active file inside the rosbag folder? That get's cleaned up when writing is finished?

@Timple @MichaelOrlov The problem I see with this is that sometimes the bag folder contains more than one bag file, e.g. when splitting recording by duration. An active file in the folder would generalize that all the rosbag files are actively being recorded when in reality it's only the latest one.
So in this regard I also think that the suffix would be better

@Timple
Copy link

Timple commented Apr 11, 2024

I think you're missing the structure. Rosbags are folders. They are also within folders.
I'm proposing this, where the last file is still being written:

my-folder-with-rosbags/
  - rosbag_2024-01-01/
    - metadatafile.yaml
    - datafile.mcap
  - rosbag_2024-01-02/
    - metadatafile.yaml
    - datafile.mcap
  - rosbag_2024-01-01/
    - metadatafile.yaml
    - datafile.mcap
    - active

@tonynajjar
Copy link
Author

tonynajjar commented Apr 11, 2024

Thanks for the example, I'll give you a counter example:

tony@tony-xmg-22:~$ docker run -it --rm ros:humble-perception
root@f56ce5889eb2:~# ros2 bag record -a -s mcap --max-bag-duration 1
[INFO] [1712828512.674402166] [rosbag2_recorder]: Press SPACE for pausing/resuming
[INFO] [1712828512.675851511] [rosbag2_recorder]: Listening for topics...
[INFO] [1712828512.675879755] [rosbag2_recorder]: Event publisher thread: Starting
[INFO] [1712828512.676785630] [rosbag2_recorder]: Subscribed to topic '/test'
[INFO] [1712828512.677450439] [rosbag2_recorder]: Subscribed to topic '/rosout'
[INFO] [1712828512.677995401] [rosbag2_recorder]: Subscribed to topic '/parameter_events'
[INFO] [1712828512.678360833] [rosbag2_recorder]: Subscribed to topic '/events/write_split'
[INFO] [1712828512.678432689] [rosbag2_recorder]: Recording...
[INFO] [1712828514.341021388] [rosbag2_cpp]: Writing remaining messages from cache to the bag. It may take a while
[INFO] [1712828516.341312147] [rosbag2_cpp]: Writing remaining messages from cache to the bag. It may take a while
[INFO] [1712828518.340502302] [rosbag2_cpp]: Writing remaining messages from cache to the bag. It may take a while
[INFO] [1712828519.123073384] [rosbag2_cpp]: Writing remaining messages from cache to the bag. It may take a while
[INFO] [1712828519.132002960] [rosbag2_recorder]: Event publisher thread: Exiting
[INFO] [1712828519.132170728] [rosbag2_recorder]: Recording stopped
root@f56ce5889eb2:~# ls rosbag2_2024_04_11-09_41_52/
metadata.yaml  rosbag2_2024_04_11-09_41_52_0.mcap  rosbag2_2024_04_11-09_41_52_1.mcap  rosbag2_2024_04_11-09_41_52_2.mcap  rosbag2_2024_04_11-09_41_52_3.mcap

In this case, the directory would look like that:

root@f56ce5889eb2:~# ls rosbag2_2024_04_11-09_41_52/
active metadata.yaml  rosbag2_2024_04_11-09_41_52_0.mcap  rosbag2_2024_04_11-09_41_52_1.mcap  rosbag2_2024_04_11-09_41_52_2.mcap  rosbag2_2024_04_11-09_41_52_3.mcap

At that time, only rosbag2_2024_04_11-09_41_52_3.mcap is still being recorded

@Timple
Copy link

Timple commented Apr 11, 2024

Ahaaa, I did not know that split bagfiles actually ended up in the same directory. (Sorry, should/could have verified).

But this is kind of strange right? The 'total' bagfile is still open. I know we can treat mcap files isolated from their folders, but that's a whole different issue perhaps?

But I don't actually mind the direction we end up with 🙂 . So alternatives are fine for me!

@defunctzombie
Copy link

Rosbags are folders

FWIW I take a different view here. I do not consider the folder a "bag" - to me it is simply that - a file system folder that holds some files. I consider the individual file (the mcap) the "bag" file. I've always found it strange that this "folder" layer was introduced; for me it adds a layer of indirection and nomenclature I find confusing and unnecessary.

@facontidavide
Copy link

Adding .active sonds as the best idea idea.
As an alternative, what about file locks? They are easy to implement using boost:

https://www.boost.org/doc/libs/1_85_0/doc/html/interprocess/synchronization_mechanisms.html#interprocess.synchronization_mechanisms.file_lock

My idea is to mark as "locked" all the files in the directory, until recording is done.

@clalancette
Copy link
Contributor

As an alternative, what about file locks? They are easy to implement using boost:

File locks are not a bad idea, but no boost, please. It is a very heavyweight dependency.

@facontidavide
Copy link

funny, I would have sweared that boost is a common depoendency in the core of ROS. No problem 😄

@defunctzombie
Copy link

My idea is to mark as "locked" all the files in the directory, until recording is done.

If this was only for the application itself to manage this could be ok but if the intent is to have outside applications or folks listing the directory easily understand the situation I think .active is nicer. It also has the property that if the recording process dies you visually see that this specific file was not closed.

@fujitatomoya
Copy link
Contributor

to have outside applications or folks listing the directory easily understand the situation I think .active is nicer.

IMO, i second this with renaming the file without creating extra file nodes. sending the data / logging files to the server or cloud once it is ready is not application specific behavior, this could be also nicer for rosbag2 to go with. we can lock the file either segment or file descriptor, but sounds overkill unless we actually have racy condition to read/write, i guess.

btw, i have a question about metadata.yaml. metadata.yaml is also tagged with .active during recording process? i am not sure if metadata.yaml needs to be updated when the recording is completed.

facontidavide added a commit to facontidavide/rosbag2 that referenced this issue May 22, 2024
facontidavide added a commit to facontidavide/rosbag2 that referenced this issue May 22, 2024
facontidavide added a commit to facontidavide/rosbag2 that referenced this issue May 23, 2024
facontidavide added a commit to facontidavide/rosbag2 that referenced this issue May 23, 2024
Signed-off-by: Davide Faconti <[email protected]>
@MichaelOrlov
Copy link
Contributor

@fujitatomoya @facontidavide @defunctzombie The proposed solution with renaming the currently writing file doesn't fit very well with the current design of the rosbag2, specifically when we are using per-file compression.

Think about a situation when we do split with file compression.

  1. If we will rename file to the bag_file_name.active and then back to bag_file_name.mcap on the file closer, the external script may think that we are done with it and start uploading it to the cloud. However, we are not done with it yet, and the compressor works in the background compressing those files. As soon as the compressor is done, he will try to delete bag_file_name.mcap and update metadata.
  2. If we will rename file to the bag_file_name.active and then compress it and try to delete it after bag_file_name.zip is finished. Then we will have the bag_file_name.active inside the archived bag_file_name.zip file. It will be with *.active extension instead of the *.mcap when we will decompress it on playback and playback will fail.

Please note that currently, we have a buggy behavior with file compression when we are sending notifications about the bag_split event via messages. i.e. we are sending notifications when bag file is closed after the split not when it has been compressed. I have tried to fix it in the #1643 However, don't have time to write a unit tests.
IMO the new work related to this issue shall be done above #1643 anyway.

@fujitatomoya
Copy link
Contributor

IMO the new work related to this issue shall be done above #1643 anyway.

👍 agree, and thanks for sharing insights!

all that said, i think the completion event (renaming system calls on this event) should be including compression operation if compression is required to process. this is just because file is not ready to use yet.

@MichaelOrlov MichaelOrlov linked a pull request Jun 8, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants