Serialise Detections to a JSON File¶
This cookbook introduces the sv.JSONSink tool, designed to write object detection data captured from video files or streams to a JSON file.
Click the Open in Colab button to run the cookbook on Google Colab.
!pip install -q inference requests tqdm supervision==0.21.0
import json
from typing import List
from collections import defaultdict
import numpy as np
import pandas as pd
import supervision as sv
from supervision.assets import download_assets, VideoAssets
from inference import InferencePipeline
from inference.core.interfaces.camera.entities import VideoFrame
The parameters defined below are:

- SOURCE_VIDEO_PATH - the path to the input video
- CONFIDENCE_THRESHOLD - do not include detections below this confidence level
- IOU_THRESHOLD - discard detections that overlap with others by more than this IOU ratio
- FILE_NAME - write the JSON output to this file
- INFERENCE_MODEL - model ID. This cookbook uses a model alias, but it can also be a fine-tuned model or a model from the Universe.
SOURCE_VIDEO_PATH = download_assets(VideoAssets.PEOPLE_WALKING)
CONFIDENCE_THRESHOLD = 0.3
IOU_THRESHOLD = 0.7
FILE_NAME = "detections.json"
INFERENCE_MODEL = "yolov8n-640"
people-walking.mp4 asset download complete.
Executing download_assets(VideoAssets.PEOPLE_WALKING) downloads the video file and saves it at SOURCE_VIDEO_PATH. Keep in mind that the video preview below works only in the web version of the cookbooks, not in Google Colab.
Read single frame from video¶
The get_video_frames_generator enables us to easily iterate over video frames. Let's create a video generator for our sample input file and display its first frame on the screen.
generator = sv.get_video_frames_generator(SOURCE_VIDEO_PATH)
frame = next(generator)
sv.plot_image(frame, (12, 12))
We can also use VideoInfo.from_video_path to learn basic information about our video, such as duration, resolution, or FPS.
sv.VideoInfo.from_video_path(SOURCE_VIDEO_PATH)
VideoInfo(width=1920, height=1080, fps=25, total_frames=341)
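The fields above can be combined for quick checks; for example, here is a small sketch computing the clip duration from the values VideoInfo reports:

video_info = sv.VideoInfo.from_video_path(SOURCE_VIDEO_PATH)
# 341 frames at 25 FPS is roughly 13.6 seconds of footage
duration_seconds = video_info.total_frames / video_info.fps
print(f"{video_info.resolution_wh} @ {video_info.fps} FPS, ~{duration_seconds:.1f} s")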
We also initialize sv.ByteTrack so that each person keeps a consistent tracker_id across frames. Setting minimum_consecutive_frames=3 requires an object to be detected in at least three consecutive frames before it is treated as a valid track, and reset() clears any previous tracker state before a new video is processed.
byte_track = sv.ByteTrack(minimum_consecutive_frames=3)
byte_track.reset()
Initialize sv.JSONSink¶
To save detections to a JSON file, open our sv.JSONSink and then pass the sv.Detections objects resulting from the inference to it. Note that empty detections will be skipped.
json_sink = sv.JSONSink(FILE_NAME)
json_sink.open()
Process video and save detections to JSON file¶
The InferencePipeline interface is made for streaming and is likely the best route for real-time use cases. It is an asynchronous interface that can consume many different video sources, including local devices (like webcams), RTSP video streams, and video files. With this interface, you define the source of a video stream and sinks.
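As an illustration, the same pipeline could be pointed at other sources. The device index and RTSP URL below are hypothetical placeholders, not part of this cookbook, so the calls are left commented out:

# webcam_pipeline = InferencePipeline.init(
#     model_id=INFERENCE_MODEL,
#     video_reference=0,  # local webcam, addressed by device index
#     on_prediction=callback,
# )
# rtsp_pipeline = InferencePipeline.init(
#     model_id=INFERENCE_MODEL,
#     video_reference="rtsp://user:pass@192.168.0.10:554/stream",  # placeholder URL
#     on_prediction=callback,
# )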
All the operations we plan to perform for each frame of our video - detection, tracking, and writing to JSON - are encapsulated in a function named callback.
def callback(predictions: dict, frame: VideoFrame) -> None:
    detections = sv.Detections.from_inference(predictions)
    # Only keep person detections
    detections = detections[detections.class_id == 0]
    detections.data["class_name"] = np.array(["person"] * len(detections))
    detections = byte_track.update_with_detections(detections)
    # custom_data adds a frame_number field to every record written by the sink
    json_sink.append(detections, custom_data={"frame_number": frame.frame_id})
pipeline = InferencePipeline.init(
model_id=INFERENCE_MODEL,
video_reference=SOURCE_VIDEO_PATH,
on_prediction=callback,
iou_threshold=IOU_THRESHOLD,
confidence=CONFIDENCE_THRESHOLD,
)
pipeline.start()
pipeline.join()
json_sink.write_and_close()
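Before loading the file with Pandas, we can peek at the raw output. This sketch assumes, consistent with the parsing code later in this cookbook, that sv.JSONSink writes a single JSON array of per-detection records:

with open(FILE_NAME, "r") as f:
    records = json.load(f)
print(f"Saved {len(records)} detection records")
print(records[0])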
Visualize results of the detections JSON file with Pandas¶
Let's take a look at the resulting data by using Pandas. The detections.json file will also have been created in your current directory.
df = pd.read_json(FILE_NAME)
df
|      | x_min | y_min | x_max | y_max | class_id | confidence | tracker_id | class_name | frame_number |
|------|-------|-------|-------|-------|----------|------------|------------|------------|--------------|
| 0    | 1142  | 950   | 1245  | 1080  | 0        | 0.767850   | 365        | person     | 1            |
| 1    | 750   | 450   | 823   | 620   | 0        | 0.748268   | 348        | person     | 1            |
| 2    | 1419  | 702   | 1526  | 887   | 0        | 0.550286   | 359        | person     | 1            |
| 3    | 1674  | 12    | 1727  | 141   | 0        | 0.546864   | 354        | person     | 1            |
| 4    | 1616  | 18    | 1655  | 127   | 0        | 0.520566   | 361        | person     | 1            |
| ...  | ...   | ...   | ...   | ...   | ...      | ...        | ...        | ...        | ...          |
| 8757 | 394   | 471   | 460   | 633   | 0        | 0.466400   | 424        | person     | 341          |
| 8758 | 115   | 256   | 199   | 399   | 0        | 0.431374   | 403        | person     | 341          |
| 8759 | 77    | 700   | 169   | 892   | 0        | 0.416318   | 420        | person     | 341          |
| 8760 | 1010  | 59    | 1055  | 174   | 0        | 0.403422   | 426        | person     | 341          |
| 8761 | 1758  | 0     | 1809  | 96    | 0        | 0.313793   | 416        | person     | 341          |
8762 rows × 9 columns
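With the records in a DataFrame, quick aggregations follow naturally. A small sketch using only the columns shown above:

# Detections per frame and the number of distinct tracked people
per_frame = df.groupby("frame_number").size()
print(f"Average detections per frame: {per_frame.mean():.1f}")
print(f"Unique tracker IDs: {df['tracker_id'].nunique()}")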
Convert JSON data to sv.Detections¶
The helper below reads the saved JSON file, groups the rows by frame_number, and rebuilds one sv.Detections object per frame, restoring the remaining fields (such as class_name and frame_number) as custom data.
def json_to_detections(json_file: str) -> List[sv.Detections]:
    rows_by_frame_number = defaultdict(list)
    with open(json_file, "r") as f:
        data = json.load(f)

    for row in data:
        frame_number = int(row["frame_number"])
        rows_by_frame_number[frame_number].append(row)

    detections_list = []
    for frame_number, rows in rows_by_frame_number.items():
        xyxy = []
        class_id = []
        confidence = []
        tracker_id = []
        custom_data = defaultdict(list)
        for row in rows:
            xyxy.append([row[key] for key in ["x_min", "y_min", "x_max", "y_max"]])
            class_id.append(row["class_id"])
            confidence.append(row["confidence"])
            tracker_id.append(row["tracker_id"])
            # Collect any remaining keys (e.g. class_name, frame_number) as custom data
            for custom_key in row.keys():
                if custom_key in ["x_min", "y_min", "x_max", "y_max", "class_id", "confidence", "tracker_id"]:
                    continue
                custom_data[custom_key].append(row[custom_key])
        # The sink writes empty strings for missing fields; map those to None
        if all([val == "" for val in class_id]):
            class_id = None
        if all([val == "" for val in confidence]):
            confidence = None
        if all([val == "" for val in tracker_id]):
            tracker_id = None
        detections_list.append(
            sv.Detections(
                xyxy=np.array(xyxy, dtype=np.float32),
                class_id=np.array(class_id, dtype=int) if class_id is not None else None,
                confidence=np.array(confidence, dtype=np.float32) if confidence is not None else None,
                tracker_id=np.array(tracker_id, dtype=int) if tracker_id is not None else None,
                data=dict(custom_data)
            )
        )
    return detections_list
detections_list = json_to_detections(FILE_NAME)
detections_list
print(f"Detections: {len(detections_list)}")
print(detections_list[0])
Detections: 341 Detections(xyxy=array([[1142., 950., 1245., 1080.], [ 750., 450., 823., 620.], [1419., 702., 1526., 887.], [1674., 12., 1727., 141.], [1616., 18., 1655., 127.], [1388., 2., 1437., 184.]], dtype=float32), mask=None, confidence=array([0.7678498 , 0.7482683 , 0.5502863 , 0.5468636 , 0.5205659 , 0.31684005], dtype=float32), class_id=array([0, 0, 0, 0, 0, 0]), tracker_id=array([365, 348, 359, 354, 361, 28]), data={'class_name': ['person', 'person', 'person', 'person', 'person', 'person'], 'frame_number': [1, 1, 1, 1, 1, 1]})
Annotate First Frame¶
Visualize a frame of the video alongside the detections obtained by parsing the JSON data back into sv.Detections objects. The annotated image shows the original video frame marked with the bounding boxes recovered from the parsed data, providing a visual representation of the identified objects in the scene.
Get back sv.Detections¶
FRAME_NUMBER = 100
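# Note: detections_list is 0-indexed while frame_number starts at 1, so index 100
# holds the detections for frame 101. That is the same frame that start=100 yields
# from the 0-indexed generator (true here because every frame has detections).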
detections = detections_list[FRAME_NUMBER]
frame_number = detections.data["frame_number"][0]
generator = sv.get_video_frames_generator(SOURCE_VIDEO_PATH, start=FRAME_NUMBER)
frame = next(generator)
Frame from video (before annotation)¶
Annotate Image with Detections¶
Finally, we can annotate the frame with the detections. Since we are working with an object detection model, we will use the sv.BoundingBoxAnnotator and sv.LabelAnnotator classes.
bounding_box_annotator = sv.BoundingBoxAnnotator()
label_annotator = sv.LabelAnnotator()
annotated_frame = frame.copy()
annotated_frame = bounding_box_annotator.annotate(scene=annotated_frame, detections=detections)
annotated_frame = label_annotator.annotate(scene=annotated_frame, detections=detections)
sv.plot_image(annotated_frame, (12, 12))
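As a final extension, the recovered detections can be rendered onto every frame and written out as a new video. This is a sketch rather than part of the original cookbook; OUTPUT_VIDEO_PATH is an assumed file name, and the zip relies on every frame having detections (true for this video):

OUTPUT_VIDEO_PATH = "people-walking-annotated.mp4"  # assumed output name
video_info = sv.VideoInfo.from_video_path(SOURCE_VIDEO_PATH)
with sv.VideoSink(OUTPUT_VIDEO_PATH, video_info=video_info) as sink:
    frames = sv.get_video_frames_generator(SOURCE_VIDEO_PATH)
    for frame, detections in zip(frames, detections_list):
        annotated = bounding_box_annotator.annotate(scene=frame.copy(), detections=detections)
        annotated = label_annotator.annotate(scene=annotated, detections=detections)
        sink.write_frame(annotated)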
References 📚¶
- Supervision: https://supervision.roboflow.com
- sv.Detections: https://supervision.roboflow.com/develop/detection/core/#detections
- Save Detections to JSON: https://supervision.roboflow.com/develop/how_to/save_detections/#save-detections-as-json
- Custom fields: https://supervision.roboflow.com/develop/how_to/save_detections/#custom-fields
- ByteTrack: https://supervision.roboflow.com/trackers/#supervision.tracker.byte_tracker.core.ByteTrack
- Inference: https://inference.roboflow.com/
- Inference Pipeline: https://inference.roboflow.com/using_inference/inference_pipeline/
- Inference Aliases: https://inference.roboflow.com/quickstart/aliases/