Data format for single-cell representation learning

The training data for single-cell representation learning consists of images and tracking results. Specifically, viscy.data.triplet.TripletDataModule requires the data formats described below.

Images

The images should be stored in HCS OME-Zarr v0.4 format. See the iohub documentation for instructions on writing them. An example dataset can be found here.

Tracking

Tracking is performed per field of view (FOV) with Ultrack, which produces segmentation results as arrays and tracking results as tables.

VisCy expects an HCS OME-Zarr store with additional metadata. The arrays in this store are segmentation labels, with FOV names consistent with those of the image arrays. Each FOV should also have its tracking table in a CSV file at the same level as the FOV metadata. The directory tree should look like this:

tracks.zarr
├── 0
│   ├── 3
│   │   ├── 000002
│   │   │   ├── 0
│   │   │   ├── tracks_0_3_000002.csv
│   │   │   ├── .zattrs
│   │   │   └── .zgroup
│   │   ├── 001000
│   │   │   ├── 0
│   │   │   ├── tracks_0_3_001000.csv
│   │   │   ├── .zattrs
│   │   │   └── .zgroup
│   │   ├── .zattrs
│   │   └── .zgroup
│   ├── 6
│   │   ├── 000002
│   │   │   ├── 0
│   │   │   ├── tracks_0_6_000002.csv
│   │   │   ├── .zattrs
│   │   │   └── .zgroup
│   │   ├── 001000
│   │   │   ├── 0
│   │   │   ├── tracks_0_6_001000.csv
│   │   │   ├── .zattrs
│   │   │   └── .zgroup
│   │   ├── .zattrs
│   │   └── .zgroup
│   └── .zgroup
├── .zattrs
└── .zgroup

An example dataset can be found here.
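
The layout above can be checked before training with a short script. The following is a minimal sketch that uses only the Python standard library and is not part of VisCy. It assumes the CSV naming pattern tracks_<row>_<column>_<fov>.csv, which is inferred from the example tree rather than from a documented rule, and the function names expected_tracks_csv and find_missing_tracks are hypothetical.

```python
from pathlib import Path


def expected_tracks_csv(store: Path, fov_name: str) -> Path:
    """Return the CSV path implied by the example tree for a FOV like "0/3/000002".

    Assumption: the file is named tracks_<row>_<column>_<fov>.csv.
    """
    row, column, fov = fov_name.split("/")
    return store / row / column / fov / f"tracks_{row}_{column}_{fov}.csv"


def find_missing_tracks(store: Path) -> list[str]:
    """List FOV names (row/column/fov) that have no tracking CSV."""
    missing = []
    # FOV groups sit three levels below the store root: row/column/fov.
    for fov_dir in sorted(store.glob("*/*/*")):
        if not fov_dir.is_dir():
            # Skip metadata files such as .zattrs and .zgroup.
            continue
        fov_name = fov_dir.relative_to(store).as_posix()
        if not expected_tracks_csv(store, fov_name).is_file():
            missing.append(fov_name)
    return missing
```

Calling find_missing_tracks(Path("tracks.zarr")) returns an empty list when every FOV directory contains a CSV with the assumed name. The check only looks at file names and does not verify that the FOV names match those in the image store or that the CSV contents are valid.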
