Skip to content

YAML Profiles

A YAML profile fully describes the structure of a synchronous dataset -- which directories to traverse, how files map to keys, and how to cast raw bytes into tensors. Profiles live in apairo/dataset/profiles/.

A profile has two top-level sections: layers and modalities.


layers

layers is a list that describes the directory hierarchy from the dataset root down to individual files. Each element is either a plain string (layer type with no value) or a single-key mapping (layer type with a value).

layers:
  - fixed: sequences   # always "sequences/" at this depth
  - sequence           # one directory per recording session
  - modality:          # one directory per channel
      lidar: velodyne  # key "lidar" maps to folder "velodyne"
      labels: labels

Layer types

fixed

A literal folder name that is always present at this depth in the tree.

- fixed: sequences       # string value
- fixed: Rellis-3D

split

A partition directory. When split= is passed to the dataset constructor, only files whose path contains this directory name are returned. Omitting split= loads all partitions.

- split: [train, val, test]

Note

GOOSE has split at two levels (before and after the modality directory) because both lidar/train/ and labels/train/ exist independently.

sequence

Marks the depth at which individual recording sessions live. The system uses this position to determine where to look for .apairo sidecar files and how to compute derived file paths.

- sequence

modality

Marks the depth at which per-channel subdirectories live. Has two forms:

Plain string -- the folder name equals the key name:

- modality
# key "lidar" -> folder "lidar/"
# key "labels" -> folder "labels/"

Mapping -- explicit key -> folder renames:

- modality:
    lidar: velodyne         # key "lidar" -> folder "velodyne/"
    labels: labels          # key "labels" -> folder "labels/"

modalities

modalities is a mapping from key name to field spec. Each entry describes one loadable channel.

modalities:
  lidar:
    ext: .bin
    dtype: float32
    reshape: [-1, 4]
  labels:
    ext: .label
    dtype: int32
    mask: 65535
    torch_dtype: int64

Fields

Field Required Description
ext yes* File extension, with or without leading dot (.bin, bin, .label, .npy, .png, .jpg, .pt, .txt). Required for per-frame modalities; can be omitted for sequence-file loaders that imply their own extension.
dtype no NumPy dtype string used by np.fromfile for binary formats (.bin, .label). Not used for structured formats or sequence-file loaders.
reshape no List of ints passed to ndarray.reshape() after loading. Use [-1, 4] for (N, 4) point clouds, [3, 4] for a 3x4 pose matrix.
mask no Integer bitmask applied as arr & mask before dtype conversion. Use 65535 (0xFFFF) to strip SemanticKITTI instance bits.
torch_dtype no NumPy dtype name for a final .astype() cast -- e.g., int64 to convert int32 labels.
loader no Override the default loader. Per-frame: bin, npy, img, pt. Sequence-file: txt_rows (one row per frame in a single .txt file).
subpath no Extra path components below the modality directory. Use when there is an additional sub-folder between the channel directory and the files.
optional no If true, the key is silently skipped when absent from disk instead of raising FileNotFoundError. Default: false.

Loader selection

For binary formats (.bin, .label), apairo uses np.fromfile directly -- dtype, reshape, mask, and torch_dtype all apply.

For structured formats, the appropriate loader is selected by extension:

Extension Loader Notes
.npy npy np.load -> torch.from_numpy
.png, .jpg img torchvision.io.read_image -> torch.Tensor (CHW, uint8)
.pt pt torch.load(weights_only=True)

reshape, mask, and torch_dtype are applied after loading for structured formats as well.


Annotated examples

layers:
  - fixed: sequences    # root/sequences/ always present
  - sequence            # root/sequences/00/, root/sequences/01/, ...
  - modality:
      lidar: velodyne   # "lidar" key -> velodyne/ subdirectory
      labels: labels    # "labels" key -> labels/ subdirectory

modalities:
  lidar:
    ext: .bin
    dtype: float32
    reshape: [-1, 4]      # (N*4,) -> (N, 4): x, y, z, intensity
  labels:
    ext: .label
    dtype: int32
    mask: 65535           # 0xFFFF: strip upper 16 bits (instance ID)
    torch_dtype: int64    # labels must be int64 for nn.CrossEntropyLoss
layers:
  - split: [train, val, test]   # root/lidar/train/
  - modality                    # root/lidar/train/seq/ -- folder = key name
  - split: [train, val, test]   # root/labels/train/
  - sequence                    # root/labels/train/seq_001/

modalities:
  lidar:
    ext: .bin
    dtype: float32
    reshape: [-1, 4]
  labels:
    ext: .label
    dtype: int32
    torch_dtype: int64

Why two split layers?

GOOSE separates lidar and label data under independent trees: lidar/train/seq/ and labels/train/seq/. The split layer appears at depth 0 (before the modality directory) and at depth 2 (after the modality directory), mirroring both branches.

layers:
  - fixed: Rellis-3D            # mandatory top-level folder
  - sequence                    # Rellis-3D/00000/, Rellis-3D/00001/, ...
  - modality:
      lidar: os1_cloud_node_kitti_bin
      labels: os1_cloud_node_semantickitti_label_id

modalities:
  lidar:
    ext: .bin
    dtype: float32
    reshape: [-1, 4]
  labels:
    ext: .label
    dtype: int32
    torch_dtype: int64
  poses:
    ext: .txt
    loader: txt_rows    # single file per sequence, one row = one frame
    reshape: [3, 4]     # 12 floats -> 3x4 transformation matrix
    optional: true