YAML Profiles

A YAML profile fully describes the structure of a synchronous dataset -- which directories to traverse, how files map to keys, and how to cast raw bytes into tensors. Profiles live in apairo/dataset/profiles/.

A profile has two top-level sections: layers and modalities.

`layers`

layers is a list that describes the directory hierarchy from the dataset root down to individual files. Each element is either a plain string (layer type with no value) or a single-key mapping (layer type with a value).

layers:
  - fixed: sequences   # always "sequences/" at this depth
  - sequence           # one directory per recording session
  - modality:          # one directory per channel
      lidar: velodyne  # key "lidar" maps to folder "velodyne"
      labels: labels

Layer types

`fixed`

A literal folder name that is always present at this depth in the tree.

- fixed: sequences       # string value
- fixed: Rellis-3D

`split`

A partition directory. When split= is passed to the dataset constructor, only files whose path contains this directory name are returned. Omitting split= loads all partitions.

- split: [train, val, test]

Note

GOOSE has split at two levels (before and after the modality directory) because both lidar/train/ and labels/train/ exist independently.

`sequence`

Marks the depth at which individual recording sessions live. The system uses this position to determine where to look for .apairo sidecar files and how to compute derived file paths.

- sequence

`modality`

Marks the depth at which per-channel subdirectories live. Has two forms:

Plain string -- the folder name equals the key name:

- modality
# key "lidar" -> folder "lidar/"
# key "labels" -> folder "labels/"

Mapping -- explicit key -> folder renames:

- modality:
    lidar: velodyne         # key "lidar" -> folder "velodyne/"
    labels: labels          # key "labels" -> folder "labels/"

`modalities`

modalities is a mapping from key name to field spec. Each entry describes one loadable channel.

modalities:
  lidar:
    ext: .bin
    dtype: float32
    reshape: [-1, 4]
  labels:
    ext: .label
    dtype: int32
    mask: 65535
    torch_dtype: int64

Fields

Field	Required	Description
`ext`	yes*	File extension, with or without leading dot (`.bin`, `bin`, `.label`, `.npy`, `.png`, `.jpg`, `.pt`, `.txt`). Required for per-frame modalities; can be omitted for sequence-file loaders that imply their own extension.
`dtype`	no	NumPy dtype string used by `np.fromfile` for binary formats (`.bin`, `.label`). Not used for structured formats or sequence-file loaders.
`reshape`	no	List of ints passed to `ndarray.reshape()` after loading. Use `[-1, 4]` for `(N, 4)` point clouds, `[3, 4]` for a 3x4 pose matrix.
`mask`	no	Integer bitmask applied as `arr & mask` before dtype conversion. Use `65535` (`0xFFFF`) to strip SemanticKITTI instance bits.
`torch_dtype`	no	NumPy dtype name for a final `.astype()` cast -- e.g., `int64` to convert `int32` labels.
`loader`	no	Override the default loader. Per-frame: `bin`, `npy`, `img`, `pt`. Sequence-file: `txt_rows` (one row per frame in a single `.txt` file).
`subpath`	no	Extra path components below the modality directory. Use when there is an additional sub-folder between the channel directory and the files.
`optional`	no	If `true`, the key is silently skipped when absent from disk instead of raising `FileNotFoundError`. Default: `false`.

Loader selection

For binary formats (.bin, .label), apairo uses np.fromfile directly -- dtype, reshape, mask, and torch_dtype all apply.

For structured formats, the appropriate loader is selected by extension:

Extension	Loader	Notes
`.npy`	`npy`	`np.load` -> `torch.from_numpy`
`.png`, `.jpg`	`img`	`torchvision.io.read_image` -> `torch.Tensor` (CHW, uint8)
`.pt`	`pt`	`torch.load(weights_only=True)`

reshape, mask, and torch_dtype are applied after loading for structured formats as well.

Annotated examples

SemanticKITTIGOOSERellis-3D

layers:
  - fixed: sequences    # root/sequences/ always present
  - sequence            # root/sequences/00/, root/sequences/01/, ...
  - modality:
      lidar: velodyne   # "lidar" key -> velodyne/ subdirectory
      labels: labels    # "labels" key -> labels/ subdirectory

modalities:
  lidar:
    ext: .bin
    dtype: float32
    reshape: [-1, 4]      # (N*4,) -> (N, 4): x, y, z, intensity
  labels:
    ext: .label
    dtype: int32
    mask: 65535           # 0xFFFF: strip upper 16 bits (instance ID)
    torch_dtype: int64    # labels must be int64 for nn.CrossEntropyLoss

layers:
  - split: [train, val, test]   # root/lidar/train/
  - modality                    # root/lidar/train/seq/ -- folder = key name
  - split: [train, val, test]   # root/labels/train/
  - sequence                    # root/labels/train/seq_001/

modalities:
  lidar:
    ext: .bin
    dtype: float32
    reshape: [-1, 4]
  labels:
    ext: .label
    dtype: int32
    torch_dtype: int64

Why two split layers?

GOOSE separates lidar and label data under independent trees: lidar/train/seq/ and labels/train/seq/. The split layer appears at depth 0 (before the modality directory) and at depth 2 (after the modality directory), mirroring both branches.

layers:
  - fixed: Rellis-3D            # mandatory top-level folder
  - sequence                    # Rellis-3D/00000/, Rellis-3D/00001/, ...
  - modality:
      lidar: os1_cloud_node_kitti_bin
      labels: os1_cloud_node_semantickitti_label_id

modalities:
  lidar:
    ext: .bin
    dtype: float32
    reshape: [-1, 4]
  labels:
    ext: .label
    dtype: int32
    torch_dtype: int64
  poses:
    ext: .txt
    loader: txt_rows    # single file per sequence, one row = one frame
    reshape: [3, 4]     # 12 floats -> 3x4 transformation matrix
    optional: true