Supervision Quickstart¶
We write your reusable computer vision tools. Whether you need to load your dataset from your hard drive, draw detections on an image or video, or count how many detections are in a zone, you can count on us! 🤝
We hope that the resources in this notebook will help you get the most out of Supervision. Please browse the Supervision docs for details, raise an issue on GitHub for support, and join our discussions section for questions!
Table of contents¶
- Before you start
- Install
- Detection API
- Plug in your model
  - YOLOv8 (pip install ultralytics)
  - Inference (pip install inference)
  - YOLO-NAS (pip install super-gradients)
- Annotate
  - BoxAnnotator
  - MaskAnnotator
  - LabelAnnotator
- Filter
  - By index, index list and index slice
  - By class_id
  - By confidence
  - By advanced logical condition
- Video API
  - VideoInfo
  - get_video_frames_generator
  - VideoSink
- Dataset API
  - DetectionDataset.from_yolo
  - Visualize annotations
  - split
  - DetectionDataset.as_pascal_voc
⚡ Before you start¶
NOTE: In this notebook, we aim to show - among other things - how simple it is to integrate supervision with popular object detection and instance segmentation libraries and frameworks. GPU access is optional but will certainly make the ride smoother.
Let's make sure that we have access to a GPU. We can use the nvidia-smi command to do that. In case of any problems, navigate to Edit -> Notebook settings -> Hardware accelerator, set it to GPU, and then click Save.
!nvidia-smi
Wed Jul 17 14:51:30 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA L4                      Off | 00000000:00:03.0 Off |                    0 |
| N/A   63C    P8              14W /  72W |      1MiB / 23034MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                             |
|  GPU   GI   CI        PID   Type   Process name                             GPU Memory |
|        ID   ID                                                              Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
NOTE: To make it easier for us to manage datasets, images and models, we create a HOME constant.
import os
HOME = os.getcwd()
print(HOME)
/content
NOTE: During our demo, we will need some example images.
!mkdir {HOME}/images
NOTE: Feel free to use your own images. Just make sure to put them into the images directory that we just created. ☝️
%cd {HOME}/images
!wget -q https://media.roboflow.com/notebooks/examples/dog.jpeg
!wget -q https://media.roboflow.com/notebooks/examples/dog-2.jpeg
!wget -q https://media.roboflow.com/notebooks/examples/dog-3.jpeg
!wget -q https://media.roboflow.com/notebooks/examples/dog-4.jpeg
!wget -q https://media.roboflow.com/notebooks/examples/dog-5.jpeg
!wget -q https://media.roboflow.com/notebooks/examples/dog-6.jpeg
!wget -q https://media.roboflow.com/notebooks/examples/dog-7.jpeg
!wget -q https://media.roboflow.com/notebooks/examples/dog-8.jpeg
/content/images
💻 Install¶
!pip install -q supervision
import supervision as sv
print(sv.__version__)
0.22.0
👁️ Detection API¶
sv.Detections is the common representation of model results across supervision. It stores the following fields:
- xyxy (np.ndarray): An array of shape (n, 4) containing the bounding box coordinates in the format [x1, y1, x2, y2].
- mask (Optional[np.ndarray]): An array of shape (n, H, W) containing the segmentation masks.
- confidence (Optional[np.ndarray]): An array of shape (n,) containing the confidence scores of the detections.
- class_id (Optional[np.ndarray]): An array of shape (n,) containing the class ids of the detections.
- tracker_id (Optional[np.ndarray]): An array of shape (n,) containing the tracker ids of the detections.
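NOTE: You can also build an sv.Detections object by hand from plain NumPy arrays. Below is a minimal sketch with made-up boxes, scores and class ids, just to illustrate the fields listed above.
import numpy as np
detections = sv.Detections(
    xyxy=np.array([[10, 20, 110, 220], [15, 25, 115, 225]], dtype=np.float32),  # two made-up boxes
    confidence=np.array([0.92, 0.57], dtype=np.float32),  # one score per box
    class_id=np.array([0, 16]),  # arbitrary class ids
)
"detections", len(detections)  # -> ('detections', 2)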
🔌 Plug in your model¶
NOTE: In our example, we will focus only on integration with YOLO-NAS and YOLOv8. However, keep in mind that supervision allows seamless integration with many other models like SAM, Transformers, and YOLOv5. You can learn more from our documentation.
import cv2
IMAGE_PATH = f"{HOME}/images/dog.jpeg"
image = cv2.imread(IMAGE_PATH)
!pip install -q "ultralytics<=8.3.40"
from ultralytics import YOLO
model = YOLO("yolov8s.pt")
result = model(image, verbose=False)[0]
detections = sv.Detections.from_ultralytics(result)
"detections", len(detections)
('detections', 4)
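NOTE: The fields described in the Detection API section are now available directly on the detections object (the printed values depend on your model run):
print(detections.xyxy.shape)  # (n, 4) bounding boxes
print(detections.confidence)  # confidence score per detection
print(detections.class_id)    # class id per detection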
!pip install -q inference
from inference import get_model
model = get_model(model_id="yolov8s-640")
result = model.infer(image)[0]
detections = sv.Detections.from_inference(result)
"detections", len(detections)
('detections', 4)
!pip install -q super-gradients
!pip install --upgrade urllib3
from super_gradients.training import models
model = models.get("yolo_nas_s", pretrained_weights="coco")
result = model.predict(image)
detections = sv.Detections.from_yolo_nas(result)
"detections", len(detections)
('detections', 7)
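NOTE: Because every connector returns the same sv.Detections structure, multiple sv.Detections objects can be combined with sv.Detections.merge. A minimal sketch, splitting the YOLO-NAS result in two and stitching it back together:
first_half = detections[:3]   # first three detections
second_half = detections[3:]  # remaining detections
merged = sv.Detections.merge([first_half, second_half])
"merged", len(merged)  # -> ('merged', 7)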
👩‍🎨 Annotate¶
from ultralytics import YOLO
model = YOLO("yolov8x.pt")
result = model(image, verbose=False)[0]
detections = sv.Detections.from_ultralytics(result)
box_annotator = sv.BoxAnnotator()
label_annotator = sv.LabelAnnotator()
annotated_image = image.copy()
annotated_image = box_annotator.annotate(annotated_image, detections=detections)
annotated_image = label_annotator.annotate(annotated_image, detections=detections)
sv.plot_image(image=annotated_image, size=(8, 8))
NOTE: By default, sv.LabelAnnotator uses the corresponding class_id as the label; however, labels can have an arbitrary format.
box_annotator = sv.BoxAnnotator()
label_annotator = sv.LabelAnnotator()
labels = [
f"{model.model.names[class_id]} {confidence:.2f}"
for class_id, confidence in zip(detections.class_id, detections.confidence)
]
annotated_image = image.copy()
annotated_image = box_annotator.annotate(annotated_image, detections=detections)
annotated_image = label_annotator.annotate(
annotated_image, detections=detections, labels=labels)
sv.plot_image(image=annotated_image, size=(8, 8))
from ultralytics import YOLO
model = YOLO("yolov8x-seg.pt")
result = model(image, verbose=False)[0]
detections = sv.Detections.from_ultralytics(result)
mask_annotator = sv.MaskAnnotator()
annotated_image = image.copy()
annotated_image = mask_annotator.annotate(annotated_image, detections=detections)
sv.plot_image(image=annotated_image, size=(8, 8))
By index, index list and index slice¶
NOTE: The sv.Detections filter API allows you to access detections by index, index list or index slice.
detections_index = detections[0]
detections_index_list = detections[[0, 1, 3]]
detections_index_slice = detections[:2]
box_annotator = sv.BoxAnnotator()
label_annotator = sv.LabelAnnotator()
images = []
for d in [detections_index, detections_index_list, detections_index_slice]:
annotated_image = box_annotator.annotate(image.copy(), detections=d)
annotated_image = label_annotator.annotate(annotated_image, detections=d)
images.append(annotated_image)
titles = [
"by index - detections[0]",
"by index list - detections[[0, 1, 3]]",
"by index slice - detections[:2]",
]
sv.plot_images_grid(images=images, titles=titles, grid_size=(1, 3))
By class_id¶
NOTE: Let's use the sv.Detections filter API to display only objects with class_id == 0.
detections_filtered = detections[detections.class_id == 0]
box_annotator = sv.BoxAnnotator()
label_annotator = sv.LabelAnnotator()
annotated_image = box_annotator.annotate(image.copy(), detections=detections_filtered)
annotated_image = label_annotator.annotate(
annotated_image, detections=detections_filtered
)
sv.plot_image(image=annotated_image, size=(8, 8))
By confidence¶
NOTE: Let's use the sv.Detections filter API to display only objects with confidence > 0.7.
detections_filtered = detections[detections.confidence > 0.7]
box_annotator = sv.BoxAnnotator()
label_annotator = sv.LabelAnnotator()
labels = []
for class_id, confidence in zip(
detections_filtered.class_id, detections_filtered.confidence
):
labels.append(f"{model.model.names[class_id]} {confidence:.2f}")
annotated_image = box_annotator.annotate(
image.copy(),
detections=detections_filtered,
)
annotated_image = label_annotator.annotate(
annotated_image, detections=detections_filtered, labels=labels
)
sv.plot_image(image=annotated_image, size=(8, 8))
By advanced logical condition¶
NOTE: The sv.Detections filter API allows you to build advanced logical conditions. Let's select only detections with class_id != 0 and confidence > 0.7.
detections_filtered = detections[
(detections.class_id != 0) & (detections.confidence > 0.7)
]
box_annotator = sv.BoxAnnotator()
label_annotator = sv.LabelAnnotator()
labels = [
f"{class_id} {confidence:.2f}"
for class_id, confidence in zip(
detections_filtered.class_id, detections_filtered.confidence
)
]
annotated_image = box_annotator.annotate(
image.copy(),
detections=detections_filtered,
)
annotated_image = label_annotator.annotate(
annotated_image, detections=detections_filtered, labels=labels
)
sv.plot_image(image=annotated_image, size=(8, 8))
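NOTE: sv.Detections also offers utilities that pair well with filtering, such as non-max suppression via with_nms. A minimal sketch (the 0.5 IoU threshold is arbitrary):
detections_nms = detections.with_nms(threshold=0.5)
"before", len(detections), "after", len(detections_nms)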
🎬 Video API¶
NOTE: supervision offers a lot of utils to make working with videos easier. Let's take a look at some of them.
NOTE: During our demo, we will need some example videos.
!pip install -q supervision[assets]
!mkdir {HOME}/videos
NOTE: Feel free to use your own videos. Just make sure to put them into the videos directory that we just created. ☝️
%cd {HOME}/videos
from supervision.assets import download_assets, VideoAssets
download_assets(VideoAssets.VEHICLES)
VIDEO_PATH = VideoAssets.VEHICLES.value
sv.VideoInfo.from_video_path(video_path=VIDEO_PATH)
VideoInfo(width=3840, height=2160, fps=25, total_frames=538)
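NOTE: VideoInfo is a plain dataclass, so we can derive quantities such as the clip duration directly from its fields:
video_info = sv.VideoInfo.from_video_path(video_path=VIDEO_PATH)
duration_seconds = video_info.total_frames / video_info.fps
print(f"{duration_seconds:.1f}s")  # 538 frames / 25 fps ≈ 21.5s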
frame_generator = sv.get_video_frames_generator(source_path=VIDEO_PATH)
frame = next(iter(frame_generator))
sv.plot_image(image=frame, size=(8, 8))
RESULT_VIDEO_PATH = f"{HOME}/videos/vehicle-counting-result.mp4"
NOTE: This time we set the stride parameter to 2. As a result, get_video_frames_generator will return every second video frame.
video_info = sv.VideoInfo.from_video_path(video_path=VIDEO_PATH)
with sv.VideoSink(target_path=RESULT_VIDEO_PATH, video_info=video_info) as sink:
for frame in sv.get_video_frames_generator(source_path=VIDEO_PATH, stride=2):
sink.write_frame(frame=frame)
NOTE: If we once again use VideoInfo, we will notice that the final video has half as many frames.
sv.VideoInfo.from_video_path(video_path=RESULT_VIDEO_PATH)
VideoInfo(width=3840, height=2160, fps=25, total_frames=269)
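NOTE: The Video API composes naturally with the Detection API. Below is a minimal sketch (reusing the yolov8s.pt model from earlier; the output path is arbitrary) that runs detection on every frame and writes an annotated copy of the video:
from ultralytics import YOLO

model = YOLO("yolov8s.pt")
box_annotator = sv.BoxAnnotator()

ANNOTATED_VIDEO_PATH = f"{HOME}/videos/vehicles-annotated.mp4"  # hypothetical output path
video_info = sv.VideoInfo.from_video_path(video_path=VIDEO_PATH)
with sv.VideoSink(target_path=ANNOTATED_VIDEO_PATH, video_info=video_info) as sink:
    for frame in sv.get_video_frames_generator(source_path=VIDEO_PATH):
        # run the detector on each frame and convert the result to sv.Detections
        result = model(frame, verbose=False)[0]
        detections = sv.Detections.from_ultralytics(result)
        # draw the boxes and write the annotated frame to the output video
        annotated_frame = box_annotator.annotate(frame.copy(), detections=detections)
        sink.write_frame(frame=annotated_frame)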
🖼️ Dataset API¶
NOTE: To demonstrate the capabilities of the Dataset API, we need a dataset. Let's download one from Roboflow Universe. To do this, we first need to install the roboflow pip package.
!pip install -q roboflow
!mkdir {HOME}/datasets
%cd {HOME}/datasets
import roboflow
from roboflow import Roboflow
roboflow.login()
rf = Roboflow()
project = rf.workspace("roboflow-jvuqo").project("fashion-assistant-segmentation")
dataset = project.version(5).download("yolov8")
/content/datasets
visit https://app.roboflow.com/auth-cli to get your authentication token. Paste the authentication token here: ··········
loading Roboflow workspace...
loading Roboflow project...
Dependency ultralytics==8.0.196 is required but found version=8.2.54, to fix: pip install ultralytics==8.0.196
Downloading Dataset Version Zip in fashion-assistant-segmentation-5 to yolov8: 100%
Extracting Dataset Version Zip to fashion-assistant-segmentation-5 in yolov8: 100%
ds = sv.DetectionDataset.from_yolo(
images_directory_path=f"{dataset.location}/train/images",
annotations_directory_path=f"{dataset.location}/train/labels",
data_yaml_path=f"{dataset.location}/data.yaml",
)
len(ds)
573
ds.classes
['baseball cap', 'hoodie', 'jacket', 'pants', 'shirt', 'shorts', 'sneaker', 'sunglasses', 'sweatshirt', 't-shirt']
🏷️ Visualize annotations¶
IMAGE_NAME = list(ds.images.keys())[0]
image = ds.images[IMAGE_NAME]
annotations = ds.annotations[IMAGE_NAME]
box_annotator = sv.BoxAnnotator()
label_annotator = sv.LabelAnnotator()
mask_annotator = sv.MaskAnnotator()
labels = [f"{ds.classes[class_id]}" for class_id in annotations.class_id]
annotated_image = mask_annotator.annotate(image.copy(), detections=annotations)
annotated_image = box_annotator.annotate(annotated_image, detections=annotations)
annotated_image = label_annotator.annotate(
annotated_image, detections=annotations, labels=labels
)
sv.plot_image(image=annotated_image, size=(8, 8))
ds_train, ds_test = ds.split(split_ratio=0.8)
"ds_train", len(ds_train), "ds_test", len(ds_test)
('ds_train', 458, 'ds_test', 115)
ds_train.as_pascal_voc(
images_directory_path=f"{HOME}/datasets/result/images",
annotations_directory_path=f"{HOME}/datasets/result/labels",
)
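NOTE: DetectionDataset can export to other formats as well, for example COCO. A minimal sketch (the output paths are arbitrary):
ds_test.as_coco(
    images_directory_path=f"{HOME}/datasets/result-coco/images",
    annotations_path=f"{HOME}/datasets/result-coco/annotations.json",
)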