ROS tf frame confusion #3

Open
masterhui opened this issue Feb 24, 2022 · 12 comments
@masterhui

Hello Jan
First of all, great work on your paper; I really appreciate that you published your code. Currently, I am using F-LOAM for LIDAR odometry on a UGV equipped with an Ouster OS0-64. I discovered your fork (https://github.com/JanQuenzel/floam) recently, which solved the increasing-latency problem I was having with the original code. However, I would really like to evaluate lidar_mars_registration, as it promises even better results. I successfully compiled your package in my ROS workspace and was able to confirm the results from your paper using the public datasets.

Now, I would like to replace my LIDAR odometry solution (/rtabmap/icp_odometry using your F-LOAM fork) with lidar_mars_registration. Here is my current tf-tree:

[tf tree screenshot: tf_rtabmap_icp_odometry]

As you can see, the map-->odom transform comes from rtabmap, taking care of loop closures. Then odom-->base_footprint should be supplied by the LIDAR odometry. I tried the following settings in the "tfs" section of the lidar_mars_registration .cfg file:

// tfs
baselink_frame: "base_link"
sensor_frame: "os_lidar"
world_frame: "odom"
map_frame: "base_footprint"

However, this doesn't work. Even though lidar_mars_registration is subscribed to the correct pointcloud topic and it also correctly processes the scans, the published tfs are wrong. Running roswtf to debug, I get the following errors:

ERROR TF multiple authority contention:
 * node [/mars/lidar_registration] publishing transform [base_footprint] with parent [map] already published by node [/rtabmap/rtabmap]
 * node [/rtabmap/rtabmap] publishing transform [base_footprint] with parent [map] already published by node [/mars/lidar_registration]

Any idea where I'm going wrong? Your help is greatly appreciated!

@JanQuenzel
Contributor

Hey there,
in its current form the package does not publish the transforms you expect.
Currently it publishes only world_frame -> map_frame, as identity or as a fixed rotation that aligns the initial base_link orientation to be roughly parallel to level ground at startup. See here and here.

The current base_link pose (name in cfg: corrected_baselink_frame) w.r.t. the first base_link pose (name in cfg: map_frame) is only published here as a nav_msgs::Odometry message and is not sent as a transform to tf. In our system, we have an EKF subscribed to the IMU and these odometry messages, which then sends a filtered transform to tf.

So to make it work for you, I think you first have to also send the odometry transform to tf, e.g. by adding something like the following lines after here:

const geometry_msgs::Transform map_baselink =
    sophusToTransform( interp_pose_firstMap_baselink );
const geometry_msgs::TransformStamped stampedTransformMapBaselink =
    toTransformStamped( map_baselink, input_stamp + validityTimeOffset,
                        m_map_frame, m_corrected_baselink_frame );
m_tf_broadcaster->sendTransform( stampedTransformMapBaselink );

From what I saw regarding rtabmap here (paragraph 4.3.6), the odometry system should provide the transform odom -> base_link.

For that I think the following settings should work:
corrected_baselink_frame: "base_link"
baselink_frame: "base_link"
world_frame: "odom"
map_frame: "odom_aligned"
It should then publish: odom -> odom_aligned -> base_link
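As a sanity check, how tf resolves such a chain can be sketched offline by composing homogeneous transforms; a toy numpy illustration (the numbers and the identity levelling rotation are assumptions, not values from the package):

```python
import numpy as np

def rot_z(deg):
    """Homogeneous 4x4 rotation about z by `deg` degrees."""
    c, s = np.cos(np.radians(deg)), np.sin(np.radians(deg))
    T = np.eye(4)
    T[:2, :2] = [[c, -s], [s, c]]
    return T

def trans(x, y, z):
    """Homogeneous 4x4 translation."""
    T = np.eye(4)
    T[:3, 3] = [x, y, z]
    return T

# odom -> odom_aligned: identity (or a fixed levelling rotation at startup)
T_odom_odomAligned = np.eye(4)
# odom_aligned -> base_link: example registration result (made-up numbers)
T_odomAligned_baselink = trans(1.0, 2.0, 0.0) @ rot_z(90.0)

# tf resolves odom -> base_link by composing the two edges of the chain
T_odom_baselink = T_odom_odomAligned @ T_odomAligned_baselink
```

Since odom -> odom_aligned is identity here, the composed transform equals the registration result; a non-identity levelling rotation would simply premultiply it.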

Also:
roswtf says that rtabmap still tries to publish that transform. Was the floam / icp_odometry node perhaps not disabled?

@masterhui
Author

Hi Jan
Thank you for your reply. I added the publisher code as you suggested, and the package compiles fine. However, I haven't been able to integrate it with rtabmap yet. I keep getting this error

ERROR TF multiple authority contention:
 * node [/mars/lidar_registration] publishing transform [base_footprint] with parent [odom] already published by node [/rtabmap/rtabmap]
 * node [/rtabmap/rtabmap] publishing transform [base_footprint] with parent [map] already published by node [/mars/lidar_registration]

even though the rtabmap/icp_odometry node isn't running. I suppose this is a peculiarity of rtabmap, hence reaching out to @matlabbe, who wrote the code:
Hey Mat, I am attempting to evaluate lidar_mars_registration as the ICP odometry solution for rtabmap (see the tf tree above). Remember our discussion regarding F-LOAM integration (see http://official-rtab-map-forum.206.s1.nabble.com/icp-odometry-with-LOAM-crash-td8261.html#a8618)? When I was using external F-LOAM as LIDAR odometry (https://github.com/flynneva/floam, before you integrated F-LOAM directly into rtabmap), I only had to set

    <node pkg="nodelet" type="nodelet" name="odom_node"
      args="load floam/odom/OdomEstimationNode manager" output="screen">
      <param name="frame_id" value="base_footprint" />
      <param name="parent_frame_id" value="odom" />
    </node>

to make it work. When using Jan's lidar_mars_registration instead, rtabmap always seems to publish map-->odom itself, even though rtabmap/icp_odometry isn't running (see the roswtf error above). Following Jan's advice, I am using these settings with lidar_mars_registration:

    // tfs
    corrected_sensor_frame: "os_lidar_map"
    corrected_baselink_frame: "base_footprint"
    baselink_gps_frame: "base_link_gps" // unused here
    baselink_frame: "base_footprint"
    sensor_frame: "os_lidar"
    world_frame: "odom"
    map_frame: "odom_aligned"
    use_tf_for_field_baselink: false
    use_tf_for_baselink_sensor: false
    center_transform: false

I feel like I am very close: registration works beautifully, I just cannot get the tf tree to behave. It should be:
map-->odom-->base_footprint
where map-->odom is published by rtabmap and odom-->base_footprint is published by lidar_mars_registration. Any idea what's going on here? As always, your help is greatly appreciated :-)

@matlabbe

matlabbe commented Mar 6, 2022

Can you show your current tf tree (with rosrun tf2_tools view_frames)? I am not sure I see the dependence between all those frames:

corrected_sensor_frame: "os_lidar_map"
corrected_baselink_frame: "base_footprint"
baselink_gps_frame: "base_link_gps" // unused here
baselink_frame: "base_footprint"
sensor_frame: "os_lidar"
world_frame: "odom"
map_frame: "odom_aligned"

If you start only the rtabmap node, rtabmap will publish the TF between /map (the map_frame_id parameter) and the odometry frame set in the odometry topic (or the frame set in the odom_frame_id parameter, if set). frame_id would also be base_footprint based on your lidar_mars config.

where map-->odom is published by rtabmap and odom-->base_footprint is published by lidar_mars_registration. Any idea what's going on here? As always, appreciate your help :-)

This should indeed do that. Based on @JanQuenzel's last comment, you would normally end up with map -> odom -> odom_aligned -> base_footprint, in which only map -> odom is published by rtabmap. Make sure indeed that icp_odometry is not started.

cheers,
Mathieu

@masterhui
Author

Hello @matlabbe and @JanQuenzel
Sorry for my delayed reply; I was on holiday when I received your message, Mat, and this week was crazy busy, so I didn't get back to this until now. Looking at what you wrote, I investigated the problem, and your notes regarding the map_frame_id and odom_frame_id parameters got me on the right track. I noticed that my rtabmap launch file didn't specify these parameters, as the default values had been fine for me until now. However, in the documentation on http://wiki.ros.org/rtabmap_ros I read: "If empty, rtabmap will subscribe to odom topic to get odometry. If set, odometry is got from tf." I have now set them as follows in my launch file:

<param name="map_frame_id"         type="string" value="map"/>
<param name="odom_frame_id"        type="string" value="odom"/>

and this has fixed the tf issues I was having. I have not had time to properly evaluate lidar_mars_registration yet, but I just wanted to share this progress with you guys. I will report back on my findings later next week.

Have a great weekend, cheers :-)

@masterhui
Author

Hi Jan, I tested lidar_mars_registration with my existing Ouster OS0-64 datasets, with mixed results. The generated odometry for the indoor datasets (a factory-type building) has a lot of jitter, with the position jumping around quite a bit. While playing with the parameters in the config file I noticed that the jitter is reduced when setting use_adaptive: false. Also, the localization at some point fails completely, where F-LOAM performs flawlessly.

I realize that there are many parameters that could be tuned; do you have any tutorials or documentation written up for this yet? I'd be happy to collaborate with you and share my bag and config, but maybe this issue thread is not the right place for that discussion? Please consider the original issue ("tf frame confusion") closed.

@JanQuenzel
Contributor

Hey @masterhui , good to hear that at least the original problem is fixed. :)
Currently there is no tutorial or documentation regarding parameter tuning.
In general, the parameters were the same for all tested datasets, with the probable exception of self-filters to remove measurements on a driving car / drone.

Can you be more specific in what situation it failed? Distance to surroundings, what happened in the situation etc?

What kind of jitter do you mean? Time-wise or pose-wise?
In both cases you might check the frequency of the odometry outputs with rostopic hz.
For pose-wise jitter you could first try to increase max_iterations.
For time-wise jitter you could increase min_keyframe_distance to create fewer keyframes, as they take more time to create and can result in frame drops.
Frame drops can be a problem too. Ideally the time between scans is constant; if not, the normal equations may become ill-conditioned, which can result in a large update step for some knot. In rare cases that can lead to failure within a few scans. A while ago I tried a soft constraint on the acceleration that improves the conditioning in such situations and thus should reduce pose jitter too. I'll see if I can port that back.
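The ill-conditioning argument can be seen in miniature: adding a small quadratic penalty to a nearly singular normal-equations matrix (the idea behind a soft acceleration constraint) drastically lowers its condition number. A toy numpy sketch with made-up numbers, not the package's actual solver:

```python
import numpy as np

# Nearly rank-deficient design matrix, as can happen when frame drops
# leave some spline knot almost unconstrained.
A = np.array([[1.0, 1.0],
              [1.0, 1.0 + 1e-8]])

N = A.T @ A                    # normal-equations matrix, ill-conditioned
lam = 0.01                     # soft-constraint weight (made-up value)
N_reg = N + lam * np.eye(2)    # quadratic penalty acts like Tikhonov damping

# The regularized system is well-conditioned, so the update step stays bounded.
print(np.linalg.cond(N), np.linalg.cond(N_reg))
```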

In our indoor scenarios (e.g. factory or storage halls) we typically have some safety margin (> 1 m) around the drone. As a compromise for indoor and outdoor use, I therefore used cell side lengths of 4/2/1 m for the grids and similar values for the lattice. If I remember correctly, the 4 m cells are used very rarely during registration in the DRZ hall. For narrow passages 4/2/1 m is too large. You can reduce it by adapting MAP_SIDE, NUM_CELLS and NUM_LEVELS in include/MarsMapParameters.h L47-L50. MAP_SIDE determines the side length of the local map on the coarsest scale. The coarsest cell side in meters is simply MAP_SIDE / NUM_CELLS; finer levels then halve the cell side per level. Setting MAP_SIDE to 128 would thus give you 2/1/0.5 m.
After changing them, don't forget to rebuild.
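The cell-side arithmetic described above can be written out explicitly (an illustrative helper, not code from the package; the 64-cell value in the examples is an assumption):

```python
def cell_sides(map_side_m, num_cells, num_levels):
    """Cell side length per level: the coarsest is MAP_SIDE / NUM_CELLS,
    and each finer level halves it."""
    coarsest = map_side_m / num_cells
    return [coarsest / 2 ** level for level in range(num_levels)]

# With 64 cells, MAP_SIDE = 256 gives 4/2/1 m and MAP_SIDE = 128 gives 2/1/0.5 m.
print(cell_sides(256, 64, 3))  # [4.0, 2.0, 1.0]
print(cell_sides(128, 64, 3))  # [2.0, 1.0, 0.5]
```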

I recently found a bug in the lattice code where the directly corresponding cell was not included when looking up correspondences. I'll push the fix for that within the next few days.

@masterhui
Author

Hey @JanQuenzel,
thank you for your detailed reply.

Can you be more specific in what situation it failed? Distance to surroundings, what happened in the situation etc?

Let me describe the test setup in more detail: the OS0-64 is mounted on a tracked robot (footprint 1.0 x 0.4 m), which is remotely piloted around a large indoor warehouse-type facility. The whole course is a loop; it is first driven clockwise, then counter-clockwise, then clockwise again. During each lap, the robot also enters an office, which is pretty tight, and performs numerous rotations in there on carpet, which causes a lot of vibration. It also climbs over a pyramid-shaped obstacle that is approx. 1 m tall and has 45° sloped ramps. In addition, it performs 360° rotations every 20 m.

What kind of jitter do you mean? Time-wise or pose-wise?

Yes, I am talking about pose jitter. Looking at the output of rostopic hz, the odometry rate is pretty solid at 10 Hz (same as the Ouster).

I have been playing with the parameters you mentioned, and things have improved a great deal compared to the default settings. The best setup I found uses MAP_SIDE=64, NUM_CELLS=32, NUM_LEVELS=2 in MarsMapParameters.h and max_iterations: 6 in the registration_adaptor section of mars.cfg. This works great in the large warehouse setting, with the CPU load still not saturating one of the six NUC cores. However, when the robot enters the office, the pose jitter gets very large:
Screenshot 1

During lap 3, the localization finally completely fails in the office:
Screenshot 2

I know this is a very challenging torture course, but it's the benchmark I use to compare against F-LOAM, which deals with it pretty well overall.

Looking forward to the fix you mentioned; I will give it a try once you push the commit. I could also extract the Ouster data from my rosbag if you would like to give it a go yourself, I just have to strip some of the other topics first (like uncompressed RGB, which blows the size up to 600 GB).

One last question: where can I disable or tune down the log output? I tried various places to reduce verbosity, but have not succeeded so far.

Cheers!

@JanQuenzel
Contributor

JanQuenzel commented Apr 5, 2022

That really sounds like a torture circuit :D
At what angular rate are you performing the rotations and the tilting on the ramps?
Does it make a difference if the Ouster runs at 20 Hz?

You could also try to compensate orientation distortion within scans with the IMU.
There are currently two ways implemented:

  1. use orientation within IMU message
  2. use gyro measurements within IMU messages directly

imu_topic needs to be set correctly, and compensate_orientation: true enables both; depending on use_gyro_directly, either the first or the second option is used.
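Put together, the relevant cfg lines might look as follows (the topic name is a placeholder for your own setup, not a value from the package):

```
// imu
imu_topic: "/your/imu/topic"
compensate_orientation: true
use_gyro_directly: true    // false: use the IMU orientation field instead
```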

For planar sensor motion it shouldn't be necessary, but in the case of tilting it may bring some benefit to enable:

keyframe_use_rotation: true
min_keyframe_rotation: 5
use_closest_keyframe: false

A keyframe will then also be created if the roll/pitch w.r.t. the last keyframe changes by more than min_keyframe_rotation degrees.
For close-quarter loopy scenarios it might be worthwhile to also enable use_closest_keyframe, so that the thresholds are compared against the closest keyframe instead of the last one.
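The keyframe decision described above can be sketched roughly as follows (a hypothetical helper; the parameter names mirror the cfg, but the actual implementation and default values may differ):

```python
def need_new_keyframe(dist_m, rotation_deg,
                      min_keyframe_distance=0.5,
                      min_keyframe_rotation=5.0,
                      keyframe_use_rotation=True):
    """Create a keyframe when translation, or optionally roll/pitch change,
    w.r.t. the reference keyframe exceeds its threshold."""
    if dist_m > min_keyframe_distance:
        return True
    return keyframe_use_rotation and rotation_deg > min_keyframe_rotation
```

With use_closest_keyframe enabled, dist_m and rotation_deg would be measured against the closest keyframe rather than the last one.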

One thing that surprises me is the parameters you chose: they mean the local map is 64 m wide, with 2 m cells on the coarse level and 1 m on the finer one. The difference towards 128 m, 32 cells and 3 levels shouldn't be too big. Is that again with adaptiveness disabled?

I also added an option to send the tf transform between map and base_link via the config option send_map_baselink_tf.

Most output should now be reduced, since I changed most LOG(INFO) to LOG(1); now it should only say that it is waiting for data.
I added a launch file with this as the standard option set. Alternatively, you can change the launch parameter "-v INFO" to output more info. The output is still recorded to the log file even when it isn't shown on the console.

Furthermore, I added the soft-constraint on acceleration, which can be adjusted via:

sca_use_constraint: false
sca_constraint_factor: 0.01

I'd be careful with the factor, since for very rapid movement it could actually reduce accuracy, as it penalizes high acceleration in both rotation and translation. So far I haven't quantified the influence by rerunning the experiments, but in general it appeared more consistent.

@masterhui
Author

Hi Jan
thank you for the code changes; I was able to pull the latest commit and compile without issues. I adapted live.cfg as follows:

  • Adjust topics
  • Adjust frame names
  • Set use_tf_for_field_baselink and use_tf_for_baselink_sensor to false

Testing with the same ROS bag as before (mixed factory and office environment), this gave me the same results as the previous (patched) version of lidar_mars_registration, i.e. the pose got lost after entering the tight office space.

I then started to play around with different parameters.

  • Added orientation distortion compensation using IMU data. I have two IMU data sources in the bag, both of which I tried: the Ouster built-in IMU and the Microsoft Azure built-in IMU. First I only enabled compensate_orientation: true, then I also added use_gyro_directly: true. This did not help; localization gets lost in pretty much the same place as before.
  • In separate experiments, I set keyframe_use_rotation: true, and later also set use_closest_keyframe: true. This did not bring any significant improvement either.
  • I also tried sca_use_constraint: true, keeping the original value of sca_constraint_factor: 0.01. Same thing: localization gets lost in pretty much the same place as before.
  • I also played around some more with smaller values for the map-size parameters. From the last used values (64, 32, 2) I switched to (32, 16, 2), also to no avail.

Here is a screenshot of localization getting lost with all the new settings enabled:
Screenshot from 2022-04-06 15-06-08

Thank you for your hard work anyway; I will keep trying, and let me know if you have any more ideas for me.

Regards

@JanQuenzel
Contributor

Now I am really curious what the reason is. If the offer still stands, you can also send me the part of the Ouster data in question and I'll take a deeper look into why it loses track.

Another quick thing:
could you test whether it also happens if you start shortly before that tight space? Just to be sure that it is not caused by some bug in the moving of the local map or so...

Regarding orientation compensation:
I don't think the Ouster driver sets the orientation field at all, so without use_gyro_directly no compensation would be done either way.
Furthermore, I assume that base_link is at the IMU (which is true for our drones).
With use_tf_for_baselink_sensor set to false, m_cur_pose_baselink_sensor will be identity when supplied to compensateOrientation here.
Thus, the IMU data is not correctly rotated, and compensation may actually distort the scan further.
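A toy example of why the identity extrinsic hurts: if the sensor were actually rotated 90° about z relative to base_link, "compensating" with the raw gyro vector would spin about the wrong axis (numpy only; the angle and rate below are made-up values, not the real extrinsic):

```python
import numpy as np

def rot_z(deg):
    """3x3 rotation about z by `deg` degrees."""
    c, s = np.cos(np.radians(deg)), np.sin(np.radians(deg))
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

# Assumed true extrinsic: sensor frame = base_link rotated 90 deg about z.
R_sensor_base = rot_z(90.0)

omega_base = np.array([0.5, 0.0, 0.0])       # gyro rate in the base_link frame

omega_correct = R_sensor_base @ omega_base   # what compensation should use
omega_identity = omega_base                  # what an identity extrinsic uses

# With the identity extrinsic, the compensation rotates about x instead of y.
```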

Best, Jan

@masterhui
Author

Hi Jan, I have sent you a download link for the bag file via Email :-)

@masterhui
Author

As you suggested, I tried starting bag playback at 470 s (rosbag play --start=470), which is the moment just before the robot enters the tight office space --> same result.
