Process Datasets
With Supervision, you can load and manipulate classification, object detection, and segmentation datasets. This tutorial will walk you through how to load, split, merge, visualize, and augment datasets in Supervision.
Download Dataset
In this tutorial, we will use a dataset from Roboflow Universe, a public repository of thousands of computer vision datasets. If you already have your dataset in COCO, YOLO, or Pascal VOC format, you can skip this section.
Log into your Roboflow account and download the dataset of your choice in COCO, YOLO, or Pascal VOC format. To download it programmatically, you will need your workspace ID, project ID, and version number.
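The download step can be sketched as follows. This is a minimal sketch, assuming the `roboflow` Python package is installed (`pip install roboflow`) and that you have an API key; the `download_dataset` helper and all placeholder values are hypothetical and only for illustration.

```python
def download_dataset(api_key, workspace_id, project_id, version, export_format="coco"):
    """Download one version of a Roboflow project and return the dataset object."""
    # Imported lazily so the helper can be defined without the package installed.
    from roboflow import Roboflow  # assumption: `pip install roboflow`

    rf = Roboflow(api_key=api_key)
    project = rf.workspace(workspace_id).project(project_id)
    # `export_format` names the export type, e.g. "coco", "yolov8", or "voc".
    return project.version(version).download(export_format)


# Hypothetical placeholder values; replace them with your own.
# dataset = download_dataset("<API_KEY>", "<WORKSPACE_ID>", "<PROJECT_ID>", 1)
# dataset.location  # local directory that the loading examples read from
```

The returned object's `location` attribute is what the `dataset.location` paths in the loading examples refer to.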
Load Dataset
The Supervision library provides convenient functions to load datasets in various formats. If your dataset is already split into train, test, and valid subsets, you can load each of those as a separate sv.DetectionDataset instance.
To load annotations in COCO format, use sv.DetectionDataset.from_coco.
import supervision as sv

# `dataset` is the object returned by the Roboflow download step
ds_train = sv.DetectionDataset.from_coco(
    images_directory_path=f'{dataset.location}/train',
    annotations_path=f'{dataset.location}/train/_annotations.coco.json',
)
ds_valid = sv.DetectionDataset.from_coco(
    images_directory_path=f'{dataset.location}/valid',
    annotations_path=f'{dataset.location}/valid/_annotations.coco.json',
)
ds_test = sv.DetectionDataset.from_coco(
    images_directory_path=f'{dataset.location}/test',
    annotations_path=f'{dataset.location}/test/_annotations.coco.json',
)

ds_train.classes
# ['person', 'bicycle', 'car', ...]

len(ds_train), len(ds_valid), len(ds_test)
# 800, 100, 100
To load annotations in YOLO format, use sv.DetectionDataset.from_yolo.
import supervision as sv

ds_train = sv.DetectionDataset.from_yolo(
    images_directory_path=f'{dataset.location}/train/images',
    annotations_directory_path=f'{dataset.location}/train/labels',
    data_yaml_path=f'{dataset.location}/data.yaml'
)
ds_valid = sv.DetectionDataset.from_yolo(
    images_directory_path=f'{dataset.location}/valid/images',
    annotations_directory_path=f'{dataset.location}/valid/labels',
    data_yaml_path=f'{dataset.location}/data.yaml'
)
ds_test = sv.DetectionDataset.from_yolo(
    images_directory_path=f'{dataset.location}/test/images',
    annotations_directory_path=f'{dataset.location}/test/labels',
    data_yaml_path=f'{dataset.location}/data.yaml'
)

ds_train.classes
# ['person', 'bicycle', 'car', ...]

len(ds_train), len(ds_valid), len(ds_test)
# 800, 100, 100
To load annotations in Pascal VOC format, use sv.DetectionDataset.from_pascal_voc.
import supervision as sv

ds_train = sv.DetectionDataset.from_pascal_voc(
    images_directory_path=f'{dataset.location}/train/images',
    annotations_directory_path=f'{dataset.location}/train/labels'
)
ds_valid = sv.DetectionDataset.from_pascal_voc(
    images_directory_path=f'{dataset.location}/valid/images',
    annotations_directory_path=f'{dataset.location}/valid/labels'
)
ds_test = sv.DetectionDataset.from_pascal_voc(
    images_directory_path=f'{dataset.location}/test/images',
    annotations_directory_path=f'{dataset.location}/test/labels'
)

ds_train.classes
# ['person', 'bicycle', 'car', ...]

len(ds_train), len(ds_valid), len(ds_test)
# 800, 100, 100
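Split Dataset
If your dataset is not already split into train, test, and valid subsets, you can split it using the sv.DetectionDataset.split method. The example below shuffles the data and splits it in two steps: the first call keeps 80% for training, and the second call divides the remaining 20% evenly between validation and test.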
import supervision as sv
ds = sv.DetectionDataset(...)
len(ds)
# 1000
ds_train, ds = ds.split(split_ratio=0.8, shuffle=True)
ds_valid, ds_test = ds.split(split_ratio=0.5, shuffle=True)
len(ds_train), len(ds_valid), len(ds_test)
# 800, 100, 100
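The arithmetic behind the two-step split can be checked on a plain list. This is a minimal sketch of the idea, not Supervision's implementation; `split_items` and its `seed` parameter are hypothetical names used only for illustration.

```python
import random


def split_items(items, split_ratio, seed=None, shuffle=True):
    """Split a sequence into two lists, with the first holding `split_ratio` of the items."""
    items = list(items)
    if shuffle:
        # A seeded generator makes the shuffle reproducible.
        random.Random(seed).shuffle(items)
    cut = int(len(items) * split_ratio)
    return items[:cut], items[cut:]


items = list(range(1000))
train, rest = split_items(items, split_ratio=0.8, seed=0)  # 80% / 20%
valid, test = split_items(rest, split_ratio=0.5, seed=0)   # 20% -> 10% / 10%
print(len(train), len(valid), len(test))  # 800 100 100
```

Because the second split operates only on the remainder of the first, the three subsets never share an item.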
Merge Dataset
If you have multiple datasets that you would like to merge, you can do so using the sv.DetectionDataset.merge method.
# Merging datasets loaded from COCO annotations
import supervision as sv

ds_train = sv.DetectionDataset.from_coco(
    images_directory_path=f'{dataset.location}/train',
    annotations_path=f'{dataset.location}/train/_annotations.coco.json',
)
ds_valid = sv.DetectionDataset.from_coco(
    images_directory_path=f'{dataset.location}/valid',
    annotations_path=f'{dataset.location}/valid/_annotations.coco.json',
)
ds_test = sv.DetectionDataset.from_coco(
    images_directory_path=f'{dataset.location}/test',
    annotations_path=f'{dataset.location}/test/_annotations.coco.json',
)

ds_train.classes
# ['person', 'bicycle', 'car', ...]

len(ds_train), len(ds_valid), len(ds_test)
# 800, 100, 100

ds = sv.DetectionDataset.merge([ds_train, ds_valid, ds_test])

ds.classes
# ['person', 'bicycle', 'car', ...]

len(ds)
# 1000
# Merging datasets loaded from YOLO annotations
import supervision as sv

ds_train = sv.DetectionDataset.from_yolo(
    images_directory_path=f'{dataset.location}/train/images',
    annotations_directory_path=f'{dataset.location}/train/labels',
    data_yaml_path=f'{dataset.location}/data.yaml'
)
ds_valid = sv.DetectionDataset.from_yolo(
    images_directory_path=f'{dataset.location}/valid/images',
    annotations_directory_path=f'{dataset.location}/valid/labels',
    data_yaml_path=f'{dataset.location}/data.yaml'
)
ds_test = sv.DetectionDataset.from_yolo(
    images_directory_path=f'{dataset.location}/test/images',
    annotations_directory_path=f'{dataset.location}/test/labels',
    data_yaml_path=f'{dataset.location}/data.yaml'
)

ds_train.classes
# ['person', 'bicycle', 'car', ...]

len(ds_train), len(ds_valid), len(ds_test)
# 800, 100, 100

ds = sv.DetectionDataset.merge([ds_train, ds_valid, ds_test])

ds.classes
# ['person', 'bicycle', 'car', ...]

len(ds)
# 1000
# Merging datasets loaded from Pascal VOC annotations
import supervision as sv

ds_train = sv.DetectionDataset.from_pascal_voc(
    images_directory_path=f'{dataset.location}/train/images',
    annotations_directory_path=f'{dataset.location}/train/labels'
)
ds_valid = sv.DetectionDataset.from_pascal_voc(
    images_directory_path=f'{dataset.location}/valid/images',
    annotations_directory_path=f'{dataset.location}/valid/labels'
)
ds_test = sv.DetectionDataset.from_pascal_voc(
    images_directory_path=f'{dataset.location}/test/images',
    annotations_directory_path=f'{dataset.location}/test/labels'
)

ds_train.classes
# ['person', 'bicycle', 'car', ...]

len(ds_train), len(ds_valid), len(ds_test)
# 800, 100, 100

ds = sv.DetectionDataset.merge([ds_train, ds_valid, ds_test])

ds.classes
# ['person', 'bicycle', 'car', ...]

len(ds)
# 1000
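Class IDs are indices into a dataset's `classes` list, so combining datasets whose class lists differ requires a shared list and a remapping of IDs. The plain-Python sketch below illustrates that idea only; `union_classes` and `remap_table` are hypothetical helpers, and this tutorial does not describe how Supervision handles the case internally.

```python
def union_classes(class_lists):
    """Combine several class lists into one, keeping first-seen order."""
    merged = []
    for classes in class_lists:
        for name in classes:
            if name not in merged:
                merged.append(name)
    return merged


def remap_table(source_classes, merged_classes):
    """Map each class ID of one dataset to its ID in the merged class list."""
    return {i: merged_classes.index(name) for i, name in enumerate(source_classes)}


classes_a = ["person", "car"]
classes_b = ["car", "bicycle"]

merged = union_classes([classes_a, classes_b])
print(merged)                          # ['person', 'car', 'bicycle']
print(remap_table(classes_b, merged))  # {0: 1, 1: 2}
```

After merging, it is worth comparing `ds.classes` against the class lists of the inputs to confirm that the labels still mean what you expect.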
Iterate over Dataset
There are two ways to loop over a sv.DetectionDataset: iterating over the instance directly with a for loop, or loading entries by index.
import supervision as sv

ds = sv.DetectionDataset(...)

# Option 1: iterate over the dataset directly
for image_path, image, annotations in ds:
    ...  # Process each image and its annotations

# Option 2: load entries by index
for idx in range(len(ds)):
    image_path, image, annotations = ds[idx]
    ...  # Process the image and annotations at index `idx`
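As one example of what the loop body might do, the sketch below counts annotations per class name. It is a minimal sketch, assuming each entry yields `(image_path, image, annotations)` and that `annotations.class_id` holds indices into `ds.classes`, as in the examples in this tutorial; `count_annotations_per_class` is a hypothetical helper.

```python
from collections import Counter


def count_annotations_per_class(ds):
    """Return a Counter mapping class name to the number of annotations."""
    counts = Counter()
    for _, _, annotations in ds:
        # Skip entries that carry no class IDs at all.
        if annotations.class_id is None:
            continue
        for class_id in annotations.class_id:
            counts[ds.classes[int(class_id)]] += 1
    return counts
```

Calling `count_annotations_per_class(ds).most_common()` gives a quick view of class imbalance before training.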
Visualize Dataset
The Supervision library provides tools for easily visualizing your detection dataset. You can create a grid of annotated images to quickly inspect your data and labels.
First, initialize the sv.BoxAnnotator and sv.LabelAnnotator. Then, iterate through a subset of the dataset (e.g., the first 16 images), drawing bounding boxes and class labels on each image. Finally, combine the annotated images into a grid for display.
import supervision as sv

ds = sv.DetectionDataset(...)

box_annotator = sv.BoxAnnotator()
label_annotator = sv.LabelAnnotator()

annotated_images = []
for i in range(16):
    _, image, annotations = ds[i]

    # Look up the class name for every annotation
    labels = [ds.classes[class_id] for class_id in annotations.class_id]

    # Draw on a copy so the original image stays untouched
    annotated_image = image.copy()
    annotated_image = box_annotator.annotate(annotated_image, annotations)
    annotated_image = label_annotator.annotate(annotated_image, annotations, labels)
    annotated_images.append(annotated_image)

# Arrange the 16 annotated images in a 4x4 grid
grid = sv.create_tiles(
    annotated_images,
    grid_size=(4, 4),
    single_tile_size=(400, 400),
    tile_padding_color=sv.Color.WHITE,
    tile_margin_color=sv.Color.WHITE
)
Save Dataset
To save a dataset, call the method that matches the target format: sv.DetectionDataset.as_coco for COCO, sv.DetectionDataset.as_yolo for YOLO, or sv.DetectionDataset.as_pascal_voc for Pascal VOC.
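The three save methods can be wrapped in a small dispatcher. This is a sketch under stated assumptions: the keyword arguments are assumed to mirror those of the corresponding `from_coco`, `from_yolo`, and `from_pascal_voc` loaders, and the `save_dataset` helper and the output layout (`images`, `labels`, `annotations.json`, `data.yaml`) are hypothetical choices, not a convention required by Supervision.

```python
from pathlib import Path


def save_dataset(ds, output_dir, fmt):
    """Save `ds` under `output_dir` in the "coco", "yolo", or "pascal_voc" format."""
    out = Path(output_dir)
    if fmt == "coco":
        ds.as_coco(
            images_directory_path=str(out / "images"),
            annotations_path=str(out / "annotations.json"),
        )
    elif fmt == "yolo":
        ds.as_yolo(
            images_directory_path=str(out / "images"),
            annotations_directory_path=str(out / "labels"),
            data_yaml_path=str(out / "data.yaml"),
        )
    elif fmt == "pascal_voc":
        ds.as_pascal_voc(
            images_directory_path=str(out / "images"),
            annotations_directory_path=str(out / "labels"),
        )
    else:
        raise ValueError(f"Unsupported format: {fmt!r}")
```

For example, `save_dataset(ds, "exported", "yolo")` would write images, label files, and a `data.yaml` under the `exported` directory.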
Augment Dataset
In this section, we'll explore using Supervision in combination with Albumentations to augment our dataset. Data augmentation is a common technique in computer vision to increase the size and diversity of training datasets, which can improve model performance and generalization.
Albumentations provides a flexible and powerful API for image augmentation. The core of the library is the Compose class, which allows you to chain multiple image transformations together. Each transformation is defined using a dedicated class, such as HorizontalFlip, RandomBrightnessContrast, or Perspective.
import albumentations as A

augmentation = A.Compose(
    transforms=[
        A.Perspective(p=0.1),
        A.HorizontalFlip(p=0.5),
        A.RandomBrightnessContrast(p=0.5)
    ],
    bbox_params=A.BboxParams(
        format='pascal_voc',
        label_fields=['category']
    ),
)
The key is to set format='pascal_voc', which corresponds to the [x_min, y_min, x_max, y_max] bounding box format used in Supervision.
import numpy as np
import supervision as sv
from dataclasses import replace

ds = sv.DetectionDataset(...)

_, original_image, original_annotations = ds[0]

# Transform the image together with its boxes and class IDs
output = augmentation(
    image=original_image,
    bboxes=original_annotations.xyxy,
    category=original_annotations.class_id
)

augmented_image = output['image']
# reshape(-1, 4) keeps a valid (0, 4) array if every box was dropped
augmented_annotations = replace(
    original_annotations,
    xyxy=np.array(output['bboxes']).reshape(-1, 4),
    class_id=np.array(output['category'], dtype=int)
)
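To see why the boxes are passed through the augmentation together with the image, consider what a horizontal flip does to a single [x_min, y_min, x_max, y_max] box. The sketch below is a simplified, hypothetical illustration in plain Python; it treats coordinates as continuous pixel positions and is not how Albumentations is implemented.

```python
def hflip_xyxy(box, image_width):
    """Mirror an [x_min, y_min, x_max, y_max] box across the vertical center line."""
    x_min, y_min, x_max, y_max = box
    # The old right edge becomes the new left edge, and vice versa.
    return [image_width - x_max, y_min, image_width - x_min, y_max]


print(hflip_xyxy([10, 20, 50, 80], image_width=100))  # [50, 20, 90, 80]
```

A box left at its original coordinates would no longer cover the object in the flipped image, which is why the geometric transforms need to update the annotations as well as the pixels.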