Data Overview

File Structure

└── <dataset>/
    ├── metadata/
    │   ├── <dataset>.json
    │   └── <dataset>-os-metadata.json
    ├── input/
    │   ├── RAW-<dataset>M-01.bag
    │   └── RAW-<dataset>U-01.bag
    ├── processed/
    │   ├── <dataset>U-01.bag
    │   ├── <dataset>U-02.bag
    │   └── <dataset>M-01.bag
    └── ground_truth/
        ├── labels/
        │   ├── <dataset>U.h5
        │   └── <dataset>M.h5
        ├── measurements/
        │   ├── <dataset>M-gt.json
        │   ├── <dataset>M-01-map.png
        │   └── <dataset>M-02-map.png
        ├── pcds/
        │   ├── <dataset>-01-tree11.pcd
        │   ├── <dataset>-02-tree22.pcd
        │   └── <dataset>-03-tree33.pcd
        └── baseline/
            ├── <dataset>-DBCRE.yaml
            └── <dataset>-SLOAM.yaml

Available <dataset> tags include:

  • Forestry: VAT-0723, VAT-1022, WSF-19
  • Orchards: UCM-0523, UCM-0323, UCM-0822


  • <dataset>.json: Contains metadata descriptions for ROS bags and ground-truth data within this dataset
  • <dataset>-os-metadata.json: Sensor metadata for the LiDAR model used in this dataset’s collection


Input ROS bags contain raw data from all on-board sensors, including LiDAR, IMU, RGBD, GPS, and thermal camera data when available.

  • RAW-<dataset>M-##.bag: Raw data from all on-board sensors for the mobile sensor platform
  • RAW-<dataset>U-##.bag: Raw data from all on-board sensors for the Falcon 4 UAV platform
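The naming scheme in the file tree above can be decoded programmatically. The following is a minimal sketch, assuming every bag name follows the `[RAW-]<dataset><M|U>-##.bag` pattern shown in the tree; the function name `parse_bag_name` and the returned dictionary keys are hypothetical and not part of the dataset tooling.

```python
import re

# Pattern assumed from the file tree: optional "RAW-" prefix, a dataset tag
# such as "VAT-0723", a platform letter (M or U), and a two-digit sequence.
BAG_NAME = re.compile(
    r"^(?P<raw>RAW-)?(?P<dataset>[A-Z]+-\d+)(?P<platform>[MU])-(?P<seq>\d+)\.bag$"
)

# Platform letters as described in the input bag list.
PLATFORMS = {"M": "mobile sensor platform", "U": "Falcon 4 UAV platform"}


def parse_bag_name(filename):
    """Split a bag file name into its parts, or return None if it does not match."""
    match = BAG_NAME.match(filename)
    if match is None:
        return None
    return {
        "raw": match.group("raw") is not None,
        "dataset": match.group("dataset"),
        "platform": PLATFORMS[match.group("platform")],
        "sequence": int(match.group("seq")),
    }


if __name__ == "__main__":
    print(parse_bag_name("RAW-VAT-0723M-01.bag"))
    print(parse_bag_name("UCM-0523U-02.bag"))
```

Returning `None` for names that do not match keeps the helper safe to run over a whole directory listing that also contains non-bag files.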


Processed ROS bags contain ground-truth lidar-inertial odometry and velocity-corrected point cloud frames for every LiDAR sweep, both provided by Faster-LIO. We also provide semantically inferred ground and tree point clouds in the sensor and robot base frames, produced by our trained segmentation network based on RangeNet++.

  • <dataset>U-##.bag: Each processed ROS bag contains the following rostopics:
    • /Odometry: Ground-truth lidar-inertial odometry
    • /cloud_registered_body: Velocity-corrected point cloud frames
    • /os_node/segmented_point_cloud_no_destagger: Semantic inference cloud from our model based on RangeNet++
    • /tree_cloud and /tree_cloud_world: Inferred tree stem clouds in sensor and robot base frames, respectively
    • /ground_cloud: Inferred ground cloud
    • /ublox/fix and /ublox/velocity: GPS data
    • /tf and /tf_static: tf data

Inferred point cloud visualization of /os_node/segmented_point_cloud_no_destagger
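Before replaying a processed bag, it can help to confirm that the rostopics listed above are actually present. The following is a minimal sketch, assuming a ROS 1 environment where the `rosbag` Python package is installed; the helper names `missing_topics` and `bag_topics` are hypothetical, and the topic list is copied from the list above.

```python
# Topics that each processed bag is documented to contain.
EXPECTED_TOPICS = [
    "/Odometry",
    "/cloud_registered_body",
    "/os_node/segmented_point_cloud_no_destagger",
    "/tree_cloud",
    "/tree_cloud_world",
    "/ground_cloud",
    "/ublox/fix",
    "/ublox/velocity",
    "/tf",
    "/tf_static",
]


def missing_topics(available):
    """Return the expected topics that are absent from `available`, in list order."""
    present = set(available)
    return [topic for topic in EXPECTED_TOPICS if topic not in present]


def bag_topics(bag_path):
    """List the topic names stored in a ROS 1 bag file."""
    import rosbag  # imported lazily so missing_topics works without ROS

    with rosbag.Bag(bag_path, "r") as bag:
        return sorted(bag.get_type_and_topic_info().topics.keys())


if __name__ == "__main__":
    import sys

    # Usage: python check_topics.py <path to a processed bag>
    print("Missing topics:", missing_topics(bag_topics(sys.argv[1])))
```

Since GPS topics may not be recorded in every sequence, a non-empty result is a prompt to inspect the bag rather than proof that it is corrupt.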

Ground Truth


Manually annotated semantic labels for tree stems and ground points are provided for each data subset and model resolution platform. The labels are stored in HDF5 format; see our GitHub repo for instructions on converting them to NumPy and 2D range image formats.

Annotated semantic tree stem label visualization
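The internal layout of the label files is not described here, so a reasonable first step is to list what each HDF5 file contains before converting anything. The following is a minimal sketch, assuming the `h5py` package is installed; the helper names are hypothetical, and no dataset names inside the files are assumed. The conversion scripts in the GitHub repo remain the reference for the actual label format.

```python
def summarize_datasets(h5_group):
    """Return {dataset_path: (shape, dtype)} for every dataset under an open HDF5 group."""
    summary = {}

    def visit(name, obj):
        # Groups have no shape or dtype; only datasets are recorded.
        if hasattr(obj, "shape") and hasattr(obj, "dtype"):
            summary[name] = (tuple(obj.shape), str(obj.dtype))

    h5_group.visititems(visit)
    return summary


def summarize_label_file(path):
    """Open a label file read-only and summarize its datasets."""
    import h5py  # imported lazily so summarize_datasets has no hard dependency

    with h5py.File(path, "r") as h5_file:
        return summarize_datasets(h5_file)


if __name__ == "__main__":
    import sys

    # Usage: python inspect_labels.py <path to a .h5 label file>
    for name, (shape, dtype) in summarize_label_file(sys.argv[1]).items():
        print(name, shape, dtype)
```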


Field measurements (DBH for all trees; total tree heights and full diameter profiles for some trees) are stored in JSON files. The map PNG files show how the tree index names in the JSON files correspond to individual trees, and which ROS bags can be replayed to view them.

JSON and map PNG files for the VAT-0723M-01 ROS bag ground truth


For each tree with a ground-truth field measurement, accumulated individual tree point clouds are provided as extracted by our segmentation models.


Individual tree point clouds are provided for diameter estimation algorithm testing
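A quick way to sanity-check an individual tree cloud before running a diameter estimator is to read its PCD header. The following is a minimal sketch, assuming the files follow the standard PCD layout (a text header that ends with a `DATA` line); the function name `read_pcd_header` is hypothetical, and the point data itself is not parsed.

```python
def read_pcd_header(path):
    """Return the PCD header as {KEY: [values, ...]} without reading the point data."""
    header = {}
    with open(path, "rb") as pcd_file:
        for raw_line in pcd_file:
            line = raw_line.decode("ascii", errors="replace").strip()
            # Skip blank lines and comment lines such as "# .PCD v0.7".
            if not line or line.startswith("#"):
                continue
            key, _, value = line.partition(" ")
            header[key] = value.split()
            # DATA is the last header entry; the points follow it.
            if key == "DATA":
                break
    return header


if __name__ == "__main__":
    import sys

    # Usage: python pcd_header.py <path to a .pcd file>
    header = read_pcd_header(sys.argv[1])
    print("fields:", header.get("FIELDS"))
    print("points:", header.get("POINTS"))
    print("encoding:", header.get("DATA"))
```

Opening the file in binary mode and stopping at `DATA` means the same helper works for ASCII and binary PCD files alike.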


We share YAML files with baseline diameter estimation results from our two novel methods (DBCRE and SLOAM). See our GitHub repo for instructions on how to run the provided benchmark scripts to evaluate estimation results.

Diameter estimation