Datasets Utils

rle_to_mask(rle, resolution_wh)

Converts a run-length encoding (RLE) to a binary mask.


Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| rle | Union[NDArray[int_], List[int]] | The 1D RLE array in the format used by the COCO dataset. The encoding is column-wise: values at even indices give the number of background pixels, and values at odd indices give the number of foreground pixels. | required |
| resolution_wh | Tuple[int, int] | The width (w) and height (h) of the desired binary mask. | required |



Returns:

| Type | Description |
|---|---|
| NDArray[bool_] | The generated 2D Boolean mask of shape (h, w), where foreground pixels are True and all other pixels are False. |


Raises:

| Type | Description |
|---|---|
| AssertionError | If the number of pixels encoded in the RLE differs from the number of pixels in the expected mask (width * height, taken from resolution_wh). |


import supervision as sv

sv.rle_to_mask([5, 2, 2, 2, 5], (4, 4))
# array([
#     [False, False, False, False],
#     [False, True,  True,  False],
#     [False, True,  True,  False],
#     [False, False, False, False],
# ])
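The example above is symmetric, so it does not show the effect of the column-wise ordering or of passing the resolution as (width, height). The following is a minimal, dependency-free sketch of the decoding convention described in the parameter table. `decode_rle_column_major` is a hypothetical helper written for illustration, not part of supervision, and it raises `ValueError` where the library uses an assertion.

```python
from typing import List, Sequence, Tuple


def decode_rle_column_major(
    rle: Sequence[int], resolution_wh: Tuple[int, int]
) -> List[List[bool]]:
    """Hypothetical helper: decode a column-wise RLE into rows of booleans."""
    width, height = resolution_wh
    if sum(rle) != width * height:
        raise ValueError("run lengths must sum to width * height")

    # Runs alternate background, foreground, background, ...
    # Even positions in `rle` are background, odd positions are foreground.
    flat: List[bool] = []
    for index, run_length in enumerate(rle):
        flat.extend([index % 2 == 1] * run_length)

    # Column-wise layout: pixel (row, col) sits at position col * height + row.
    return [
        [flat[col * height + row] for col in range(width)]
        for row in range(height)
    ]


# The 4x4 example from above.
square = decode_rle_column_major([5, 2, 2, 2, 5], (4, 4))

# A 3-wide, 2-high mask: the result has 2 rows and 3 columns.
wide = decode_rle_column_major([2, 1, 1, 2], (3, 2))
# wide == [[False, True, True],
#          [False, False, True]]
```

Note that the first two background pixels of the second example fill the first column from top to bottom, not the first row.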
Source code in supervision/dataset/
# Imports used by the source excerpts on this page
from typing import List, Tuple, Union

import numpy as np
import numpy.typing as npt

def rle_to_mask(
    rle: Union[npt.NDArray[np.int_], List[int]], resolution_wh: Tuple[int, int]
) -> npt.NDArray[np.bool_]:
    """
    Converts run-length encoding (RLE) to a binary mask.

    Args:
        rle (Union[npt.NDArray[np.int_], List[int]]): The 1D RLE array, the format
            used in the COCO dataset (column-wise encoding, values of an array with
            even indices represent the number of pixels assigned as background,
            values of an array with odd indices represent the number of pixels
            assigned as foreground object).
        resolution_wh (Tuple[int, int]): The width (w) and height (h)
            of the desired binary mask.

    Returns:
        The generated 2D Boolean mask of shape `(h, w)`, where the foreground object is
            marked with `True`'s and the rest is filled with `False`'s.

    Raises:
        AssertionError: If the sum of pixels encoded in RLE differs from the
            number of pixels in the expected mask (computed based on resolution_wh).

    Examples:
        import supervision as sv

        sv.rle_to_mask([5, 2, 2, 2, 5], (4, 4))
        # array([
        #     [False, False, False, False],
        #     [False, True,  True,  False],
        #     [False, True,  True,  False],
        #     [False, False, False, False],
        # ])
    """
    if isinstance(rle, list):
        rle = np.array(rle, dtype=int)

    width, height = resolution_wh

    assert width * height == np.sum(rle), (
        "the sum of the number of pixels in the RLE must be the same "
        "as the number of pixels in the expected mask"
    )

    zero_one_values = np.zeros(shape=(rle.size, 1), dtype=np.uint8)
    zero_one_values[1::2] = 1

    decoded_rle = np.repeat(zero_one_values, rle, axis=0)
    decoded_rle = np.append(
        decoded_rle, np.zeros(width * height - len(decoded_rle), dtype=np.uint8)
    )
    return decoded_rle.reshape((height, width), order="F")

mask_to_rle(mask)

Converts a binary mask into a run-length encoding (RLE).


Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| mask | NDArray[bool_] | 2D binary mask where True marks foreground pixels and False marks background pixels. | required |



Returns:

| Type | Description |
|---|---|
| List[int] | The run-length encoded mask. Values at even indices give the number of background (False) pixels, and values at odd indices give the number of foreground (True) pixels. |


Raises:

| Type | Description |
|---|---|
| AssertionError | If the input mask is not 2D or is empty. |


import numpy as np
import supervision as sv

mask = np.array([
    [True, True, True, True],
    [True, True, True, True],
    [True, True, True, True],
    [True, True, True, True],
])
sv.mask_to_rle(mask)
# [0, 16]

mask = np.array([
    [False, False, False, False],
    [False, True,  True,  False],
    [False, True,  True,  False],
    [False, False, False, False],
])
sv.mask_to_rle(mask)
# [5, 2, 2, 2, 5]
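Both masks above are symmetric, so they would encode identically row by row or column by column. As a rough illustration of why the column-wise order matters, the sketch below encodes a small asymmetric mask both ways. It uses `itertools.groupby` on plain lists, which is a swapped-in technique rather than the library's NumPy implementation, and `encode_mask_column_major` is a hypothetical name. It assumes a non-empty rectangular mask.

```python
from itertools import groupby
from typing import List, Sequence


def encode_mask_column_major(mask: Sequence[Sequence[bool]]) -> List[int]:
    """Hypothetical helper: column-wise RLE of a non-empty 2D mask."""
    height, width = len(mask), len(mask[0])

    # Flatten column by column: walk down each column, left to right.
    flat = [bool(mask[row][col]) for col in range(width) for row in range(height)]

    # groupby yields one group per run of equal consecutive values.
    runs = [len(list(group)) for _, group in groupby(flat)]

    # The encoding always starts with a background run, so a mask whose
    # first pixel is foreground gets an explicit background run of length 0.
    if flat[0]:
        runs.insert(0, 0)
    return runs


mask = [
    [False, True, True],
    [False, False, True],
]

column_wise = encode_mask_column_major(mask)

# For contrast only: run lengths of the same mask flattened row by row.
row_wise = [len(list(g)) for _, g in groupby(v for row in mask for v in row)]

print(column_wise)  # [2, 1, 1, 2]
print(row_wise)     # [1, 2, 2, 1]
```

Only the column-wise result matches the convention documented above.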


Source code in supervision/dataset/
def mask_to_rle(mask: npt.NDArray[np.bool_]) -> List[int]:
    """
    Converts a binary mask into a run-length encoding (RLE).

    Args:
        mask (npt.NDArray[np.bool_]): 2D binary mask where `True` indicates foreground
            object and `False` indicates background.

    Returns:
        The run-length encoded mask. Values of a list with even indices
            represent the number of pixels assigned as background (`False`), values
            of a list with odd indices represent the number of pixels assigned
            as foreground object (`True`).

    Raises:
        AssertionError: If input mask is not 2D or is empty.

    Examples:
        import numpy as np
        import supervision as sv

        mask = np.array([
            [True, True, True, True],
            [True, True, True, True],
            [True, True, True, True],
            [True, True, True, True],
        ])
        sv.mask_to_rle(mask)
        # [0, 16]

        mask = np.array([
            [False, False, False, False],
            [False, True,  True,  False],
            [False, True,  True,  False],
            [False, False, False, False],
        ])
        sv.mask_to_rle(mask)
        # [5, 2, 2, 2, 5]

    ![mask_to_rle]({ align=center width="800" }
    """  # noqa E501 // docs
    assert mask.ndim == 2, "Input mask must be 2D"
    assert mask.size != 0, "Input mask cannot be empty"

    on_value_change_indices = np.where(
        mask.ravel(order="F") != np.roll(mask.ravel(order="F"), 1)
    )[0]

    on_value_change_indices = np.append(on_value_change_indices, mask.size)
    # np.roll wraps around, so index 0 is flagged as a change only when the first
    # and last elements of the flattened mask differ; otherwise prepend 0 manually
    if on_value_change_indices[0] != 0:
        on_value_change_indices = np.insert(on_value_change_indices, 0, 0)

    rle = np.diff(on_value_change_indices)

    if mask[0][0] == 1:
        rle = np.insert(rle, 0, 0)

    return list(rle)
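
The change-index steps in the source above can be hard to follow, in particular why `mask.size` is appended and why a 0 is sometimes inserted. The sketch below mirrors those steps on a plain, already flattened list so the intermediate values are easy to inspect. `encode_via_change_indices` is a hypothetical name, the list operations are stand-ins for the NumPy calls named in the comments, and the input is assumed to be non-empty.

```python
from typing import List, Sequence


def encode_via_change_indices(flat: Sequence[bool]) -> List[int]:
    """Hypothetical list-based mirror of the NumPy steps in mask_to_rle."""
    values = [bool(v) for v in flat]
    size = len(values)

    # Stand-in for np.roll(values, 1): shift right by one,
    # the last element wraps around to the front.
    rolled = values[-1:] + values[:-1]

    # Stand-in for np.where(values != rolled)[0]: positions where a run starts.
    starts = [i for i in range(size) if values[i] != rolled[i]]

    # Stand-in for np.append(..., mask.size): marks the end of the last run.
    starts.append(size)

    # Because of the wrap-around, index 0 is only flagged when the first and
    # last pixels differ, so it is added by hand when missing.
    if starts[0] != 0:
        starts.insert(0, 0)

    # Stand-in for np.diff: the gap between consecutive starts is a run length.
    rle = [b - a for a, b in zip(starts, starts[1:])]

    # Runs must begin with background, so a leading foreground run
    # is preceded by a background run of length 0.
    if values[0]:
        rle.insert(0, 0)
    return rle


# Column-wise flattening of the 4x4 example mask from the docstring.
flat_example = [False] * 5 + [True] * 2 + [False] * 2 + [True] * 2 + [False] * 5

print(encode_via_change_indices(flat_example))    # [5, 2, 2, 2, 5]
print(encode_via_change_indices([True] * 16))     # [0, 16]
print(encode_via_change_indices([True, False]))   # [0, 1, 1]
```

For the all-True mask no change is detected at all, so the list of starts consists only of the appended 16 and the inserted 0, which yields a single run of 16 before the leading 0 is added.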
