Yolov5 notes--adaptive image scaling with letterbox

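1--Principle and purpose

For a detailed analysis, see blog 1.

In brief: the letterbox() function scales an image to a specified size. Resizing directly to the target size changes the aspect ratio, which distorts the content and loses information, whereas scaling both sides by the same ratio preserves the image information better.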

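Yolov5 uses adaptive scaling so that neither the width nor the height exceeds 640 (the default new_shape), and pads with a constant pixel value (default (114, 114, 114)) so that both are divisible by 32 (the default stride). Because only the minimum padding needed to reach a multiple of 32 is added, rather than padding out to a full 640×640 square, fewer border pixels are fed to the network and the receptive field is put to the best use.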
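
The arithmetic behind this can be sketched without OpenCV. The helper below is a hypothetical illustration (the name letterbox_shape and its parameters are assumptions, not part of Yolov5) that only computes the output size for the default auto mode with a square target:

```python
def letterbox_shape(height, width, target=640, stride=32):
    """Return (out_h, out_w, pad_h, pad_w) for a minimal-padding letterbox.

    Hypothetical helper: mirrors only the size arithmetic, not the resizing.
    """
    # One ratio for both sides, so the aspect ratio is preserved.
    r = min(target / height, target / width)
    new_h, new_w = int(round(height * r)), int(round(width * r))

    # Pad only up to the next multiple of stride, not up to the full target.
    pad_h = (target - new_h) % stride
    pad_w = (target - new_w) % stride
    return new_h + pad_h, new_w + pad_w, pad_h, pad_w


print(letterbox_shape(1080, 1920))  # (384, 640, 24, 0)
print(letterbox_shape(630, 470))    # (640, 480, 0, 3)
```

For a 1080×1920 frame the scaled image is 360×640, and 24 rows of padding bring the height to 384 = 12 × 32, well short of a full 640×640 square.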

2--Test code

import cv2
import numpy as np

def letterbox(im, new_shape=(640, 640), color=(114, 114, 114), auto=True, scaleFill=False, scaleup=True, stride=32):

    shape = im.shape[:2]  # current shape as (height, width)

    # new_shape may be a single int such as 640, meaning a 640×640 target
    if isinstance(new_shape, int):
        new_shape = (new_shape, new_shape)

    r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])  # scale ratio (new / old)

    if not scaleup:  # only shrink, never enlarge: cap the ratio at 1.0
        r = min(r, 1.0)

    ratio = r, r  # (width, height) scale ratios
    new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))  # (width, height) after scaling, before padding
    dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]  # total padding needed to reach new_shape
    if auto:  # adaptive scaling: pad only up to the next multiple of stride
        dw, dh = np.mod(dw, stride), np.mod(dh, stride)  # remaining padding; output stays divisible by stride when new_shape is
    elif scaleFill:  # no adaptive scaling: stretch straight to new_shape, no padding
        dw, dh = 0.0, 0.0
        new_unpad = (new_shape[1], new_shape[0])
        ratio = new_shape[1] / shape[1], new_shape[0] / shape[0]  # (width, height) scale ratios

    # split the padding evenly between top/bottom and left/right
    dw /= 2
    dh /= 2

    if shape[::-1] != new_unpad:  # resize only if the size actually changes
        im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR)

    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))  # top/bottom padding; ±0.1 splits an odd total as n and n + 1
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))  # left/right padding
    im = cv2.copyMakeBorder(im, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # pad with the constant pixel value color
    return im, ratio, (dw, dh)


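if __name__ == "__main__":
    img1 = cv2.imread("./test1.jpg")
    if img1 is None:  # cv2.imread returns None instead of raising when the file cannot be read
        raise FileNotFoundError("./test1.jpg")
    print("img1.shape: ", img1.shape)

    new_img1 = letterbox(img1)[0]  # take the first return value (the padded image)
    print("new_img1.shape: ", new_img1.shape)

    # high is exclusive in np.random.randint, so 256 covers the full uint8 range
    img2 = np.random.randint(low=0, high=256, size=(630, 470, 3), dtype=np.uint8)
    print("img2.shape: ", img2.shape)
    new_img2 = letterbox(img2)[0]  # take the first return value
    print("new_img2.shape: ", new_img2.shape)  # (640, 480, 3)

    img3 = np.random.randint(low=0, high=256, size=(470, 470, 3), dtype=np.uint8)
    print("img3.shape: ", img3.shape)
    new_img3 = letterbox(img3)[0]  # take the first return value
    print("new_img3.shape: ", new_img3.shape)  # (640, 640, 3)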
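
Besides the image, letterbox() returns ratio and (dw, dh), which are typically what is needed to map coordinates on the padded image back to the original one. The following is a minimal sketch of that mapping, assuming boxes in (x1, y1, x2, y2) pixel format. The helper name unletterbox_xyxy is hypothetical and not part of Yolov5, and because the real left/top padding is rounded to an integer, the result can be off by up to half a pixel:

```python
def unletterbox_xyxy(box, ratio, pad):
    """Map an (x1, y1, x2, y2) box from the letterboxed image to the original.

    Hypothetical helper. ratio is (width_ratio, height_ratio) and pad is
    (dw, dh), the per-side padding, as returned by letterbox().
    """
    x1, y1, x2, y2 = box
    rw, rh = ratio
    dw, dh = pad
    # Undo the padding offset first, then undo the scaling.
    return ((x1 - dw) / rw, (y1 - dh) / rh, (x2 - dw) / rw, (y2 - dh) / rh)


# A 700×1280 (height×width) image is scaled by 0.5 to 350×640 and padded
# by 1 row on the top and bottom, so ratio = (0.5, 0.5), pad = (0.0, 1.0).
print(unletterbox_xyxy((100, 51, 200, 101), (0.5, 0.5), (0.0, 1.0)))
# (200.0, 100.0, 400.0, 200.0)
```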
