FishTrack23: Underwater Tracking & Classification Dataset
- FishTrack23 is an underwater multi-object tracking and fish identification dataset featuring real-world videos with challenges like low lighting, color casts, and turbidity.
- It employs detailed annotation protocols with 2D bounding boxes and tracking IDs across diverse fish taxa, facilitating rigorous evaluation for detection, tracking, and classification methods.
- The dataset supports comprehensive metrics such as precision, recall, mAP, and MOTA, establishing a robust standard for aquatic perception in robotics and fisheries research.
FishTrack23 is an underwater multi-object tracking and fish identification dataset constructed from real-world marine video recordings. Designed to benchmark object detection, tracking, and fine-grained classification algorithms in challenging aquatic environments, FishTrack23 provides annotated video samples marked by severe visual degradation due to low lighting, significant color casts, and turbidity. The dataset is used extensively as an evaluation standard for machine learning-based aquatic perception, particularly in the context of robotics and automated fishery research applications (Silva et al., 14 Jan 2026).
1. Dataset Structure and Content
FishTrack23 is organized as an ensemble collection of underwater videos targeting multi-object tracking (MOT) in naturalistic, real-world settings. Video sources are rigorously filtered: sequences containing erroneous or duplicate annotations such as plants or fishing lures, or videos entirely lacking fish, are excluded a priori. From the curated set, annotated frames are subsampled for training and validation via a one-in-twenty sampling strategy, while all annotated frames are retained for testing, supplemented by a 5% subset of unannotated frames to facilitate continuous video reconstruction at inference time.
The dataset is enumerated as follows:
| Split | # Images | Notes |
|---|---|---|
| Training | 5,149 | 1 in 20 annotated frames |
| Validation | 1,098 | 1 in 20 annotated frames |
| Testing | 14,575 | All annotated + 5% unannotated |
Frame resolutions are non-uniform, inherited from original video sources. Annotation density differs by video, with original annotation sampling conducted at either 5 Hz or 10 Hz, yielding test footage durations between 23 and 46 minutes.
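The one-in-twenty subsampling described above can be sketched as follows; this is a minimal illustration, and the annotation rate, frame indices, and helper name are assumptions rather than the official tooling:

```python
# Hypothetical sketch of the one-in-twenty frame subsampling used for the
# training/validation splits; the 5 Hz annotation rate below is illustrative.
def subsample_annotated_frames(annotated_indices, step=20):
    """Keep every 20th annotated frame for training/validation."""
    return annotated_indices[::step]

# e.g. 200 annotated frames from a video annotated every 5th frame
train_pool = list(range(0, 1000, 5))
kept = subsample_annotated_frames(train_pool)
print(len(train_pool), len(kept))  # 200 10
```

For the test split, by contrast, all annotated frames would be retained and a 5% sample of unannotated frames interleaved to allow video reconstruction.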
2. Environmental Parameters and Visual Challenges
FishTrack23 represents a spectrum of underwater visual degradations:
- Lighting: Consistently low, with substantial red-channel attenuation yielding strong color casts.
- Turbidity: High particulate matter induces scattered light and further visual distortion.
- Variability: The dataset spans numerous environmental conditions, which are not quantified in detail (e.g., exact lux values or scattering coefficients are not reported).
This variety is intended to reflect operational challenges in marine robotics and fish monitoring. No quantitative measures of lighting or turbidity are provided in the public description (Silva et al., 14 Jan 2026).
3. Annotation Protocol and Classes
Object Detection & Tracking
Fish detection and tracking annotations consist of two components applied to every frame containing fish:
- 2D Bounding Boxes: Axis-aligned rectangles demarcate the spatial extent of each observed fish.
- Tracking IDs: Each fish receives a unique identifier, stable across subsequent frames, to enable multi-object tracking evaluations.
Annotation storage is inferred to follow the FishTrack23 official release: typically per-frame lists, in text or JSON, of (bounding box, tracking ID) tuples.
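A parser for such per-frame annotations might look like the following; the field names (`frame`, `track_id`, `bbox`) and the `(x, y, w, h)` box convention are assumptions for illustration, not the official schema:

```python
import json

# Illustrative parser for a per-frame JSON annotation layout; the exact
# field names and box convention are assumptions, not the official schema.
def load_frame_annotations(json_text):
    """Return {frame_index: [(track_id, (x, y, w, h)), ...]}."""
    per_frame = {}
    for record in json.loads(json_text):
        box = tuple(record["bbox"])  # axis-aligned (x, y, w, h)
        per_frame.setdefault(record["frame"], []).append((record["track_id"], box))
    return per_frame

sample = '[{"frame": 0, "track_id": 7, "bbox": [10, 20, 50, 30]}]'
print(load_frame_annotations(sample))  # {0: [(7, (10, 20, 50, 30))]}
```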
Species Classification
Original video annotations identify 73 distinct fish taxa, but to address class imbalance, fish are grouped into the following labels:
- Lutjanus campechanus
- Micropterus salmoides
- Pagrus pagrus
- "unspecified fish" (aggregating the remaining 70 taxa)
Each bounding box is cropped per frame to form a single-specimen image, used for standalone species classification tasks. Associated metadata includes a (class label, crop image ID) pair; no instance segmentation masks are available.
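The per-box cropping step can be sketched in a few lines; nested lists stand in for an image array (a real pipeline would use NumPy or OpenCV), and the `(x, y, w, h)` convention is an assumption:

```python
# Minimal sketch of converting a bounding box into a single-specimen crop.
# Images are modelled as nested lists (rows of pixels); the (x, y, w, h)
# box convention is an assumption for illustration.
def crop_box(image, box):
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]

# 6x8 toy "image" whose pixels record their own (row, col) coordinates
frame = [[(r, c) for c in range(8)] for r in range(6)]
crop = crop_box(frame, (2, 1, 3, 2))
print(len(crop), len(crop[0]))  # 2 3
```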
4. Split Protocols and Usage Scenarios
Detection and Tracking
- Training: 5,149 images (annotated frames, subsampled 1 in 20)
- Validation: 1,098 images (same protocol as training)
- Testing: 14,575 images (full set of annotated and 5% unannotated frames; inference reconstructs the corresponding test video)
- Benchmark Algorithms: Evaluations are reported on YOLOv8m and YOLOv10s for detection, and ByteTrack for multi-object tracking.
Classification
Cropped fish images from detection bounding boxes are divided:
- Training: 80% of cropped instances from the union of training and validation splits
- Validation: the remaining 20% of the same pool
- Testing: Crops generated from all test frames
- Classifier Backbone: YOLOv11s-cls
No cross-validation methodology is reported. Detection is evaluated using static images; tracking is assessed on reconstructed videos via the TrackEval toolkit.
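The 80/20 crop split can be sketched as follows; since no cross-validation methodology is reported, the shuffling strategy and fixed seed here are assumptions:

```python
import random

# Sketch of the reported 80/20 split of cropped fish instances; the
# shuffling strategy and seed are assumptions (none are reported).
def split_crops(crop_ids, train_frac=0.8, seed=0):
    ids = list(crop_ids)
    random.Random(seed).shuffle(ids)
    cut = int(len(ids) * train_frac)
    return ids[:cut], ids[cut:]

train, val = split_crops(range(100))
print(len(train), len(val))  # 80 20
```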
5. Evaluation Metrics
FishTrack23 employs both standard detection and MOT evaluation criteria:
Object Detection
- Precision: $\text{Precision} = \frac{TP}{TP + FP}$
- Recall: $\text{Recall} = \frac{TP}{TP + FN}$
- F1-Score: $F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$
- mAP@0.5: Mean Average Precision at IoU threshold 0.5
- mAP@0.5:0.95: Mean AP averaged over IoU thresholds in $\{0.5, 0.55, \ldots, 0.95\}$
Tracking
- MOTA: Multiple Object Tracking Accuracy, $\text{MOTA} = 1 - \frac{FN + FP + IDSW}{GT}$
- HOTA: Higher Order Tracking Accuracy (aggregates detection and association performance)
- IDF1: $\text{IDF1} = \frac{2 \cdot IDTP}{2 \cdot IDTP + IDFP + IDFN}$
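The standard definitions of these metrics can be checked with a small worked example; the counts below are illustrative only, not FishTrack23 results:

```python
# Worked example of standard detection and tracking metrics; the
# TP/FP/FN/IDSW counts are illustrative, not FishTrack23 results.
def precision_recall_f1(tp, fp, fn):
    p = tp / (tp + fp)          # fraction of predictions that are correct
    r = tp / (tp + fn)          # fraction of ground truth that is found
    return p, r, 2 * p * r / (p + r)

def mota(fn, fp, idsw, num_gt):
    # MOTA = 1 - (misses + false positives + identity switches) / ground truth
    return 1.0 - (fn + fp + idsw) / num_gt

p, r, f1 = precision_recall_f1(tp=80, fp=20, fn=20)
print(round(p, 2), round(r, 2), round(f1, 2))  # 0.8 0.8 0.8
print(mota(fn=20, fp=20, idsw=5, num_gt=200))  # 0.775
```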
FishTrack23 benchmarking aligns with established COCO and multi-object tracking conventions.
6. Preprocessing and Data Handling
FishTrack23 emphasizes feature fidelity over superficial image enhancement. Frame preprocessing incorporates the following:
- Color Correction: A non-trainable white-balance operation adjusts each RGB channel's mean toward the median channel mean, standardizing illumination and color balance.
- Frame Sampling: Annotation-driven, as detailed previously.
- Task-specific Cropping: Classification utilizes bounding box crops; detection/tracking models operate on native full-frame images.
- Augmentation: No geometric (flips, rotations) or photometric (jitter, noise) augmentations are reported for FishTrack23. Enhancement is executed in-network by the AquaFeat+ pipeline, comprising color correction and hierarchical feature enhancement modules.
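The white-balance step above can be sketched as follows, under the assumption that each channel is rescaled so its mean matches the median of the three channel means; pure-Python lists stand in for an image array:

```python
import statistics

# Sketch of a non-trainable white balance: scale each channel so its mean
# matches the median of the three channel means. The exact operation used
# by FishTrack23/AquaFeat+ preprocessing is assumed, not confirmed.
def white_balance(channels):
    """channels: {'R': [...], 'G': [...], 'B': [...]} pixel intensities."""
    means = {c: statistics.mean(v) for c, v in channels.items()}
    target = statistics.median(means.values())
    return {c: [p * target / means[c] for p in v] for c, v in channels.items()}

# Toy "image" with a strong blue cast and attenuated red channel
img = {"R": [40, 60], "G": [100, 120], "B": [140, 160]}
balanced = white_balance(img)
print({c: round(statistics.mean(v)) for c, v in balanced.items()})
# {'R': 110, 'G': 110, 'B': 110}
```

This mirrors the gray-world family of corrections, which counteracts the red-channel attenuation typical of underwater footage without any learned parameters.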
A plausible implication is that improvement in perception tasks on FishTrack23 is attributable not to external data augmentation but to deep end-to-end enhancement methods within the perception model itself.
7. Significance and Applications
FishTrack23 is positioned as a rigorous testbed for underwater vision systems operating under non-ideal, real-world visual regimes. Its use in the AquaFeat+ project demonstrates its centrality for evaluating algorithms intended for deployment in robotic inspection, fisheries monitoring, and marine biodiversity assessment scenarios. The dataset’s annotation density, challenging conditions, and structured protocols support comparative benchmarking across object detection, tracking, and classification domains, corresponding to practical requirements of aquatic perception modules in both research and applied contexts (Silva et al., 14 Jan 2026).