Jetson AGX Xavier: Real-Time Detection with TensorRT-Accelerated YOLOv5




I. Preface

Because YOLOv5 runs too slowly on live camera frames on the Xavier, TensorRT is needed to speed up its inference. What follows is a record of my implementation process.

II. Environment Preparation

If you have not yet set up the Python environment for YOLOv5, follow the steps below; otherwise, skip step 1 and go straight to step 2.

1. Build the YOLOv5 Python environment by following "Jetson AGX Xavier: configuring a yolov5 virtual environment", and import OpenCV into the environment by following "Jetson AGX Xavier: installing the Archiconda virtual environment manager and calling opencv inside a virtual environment". This article uses OpenCV 3.4.3.

2. Import the TensorRT libraries into the environment, in the same way as OpenCV: copy the TensorRT-related folders under /usr/lib/python3.6/dist-packages/ into the site-packages folder of the environment you created, for example into /home/jetson/archiconda3/envs/yolov5env/lib/python3.6/site-packages/.
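A minimal sketch of that copy (the destination matches the example path above; the exact folder names under dist-packages vary with the JetPack version, so list them first):

ls /usr/lib/python3.6/dist-packages/    # confirm the TensorRT folder names first
cp -r /usr/lib/python3.6/dist-packages/tensorrt* /home/jetson/archiconda3/envs/yolov5env/lib/python3.6/site-packages/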


3. Install pycuda in the environment. If the pip install fails, plenty of fixes can be found online; one common one is sketched after the commands below.


    
    
conda activate yolov5env
pip install pycuda
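A frequent cause of failure on Jetson is that pip builds pycuda from source and cannot find the CUDA compiler. A sketch of the usual fix, assuming the default JetPack CUDA install location:

# make the CUDA toolchain visible to the pycuda build (default JetPack paths)
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
pip install pycuda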

III. Acceleration Steps

Taking the YOLOv5s model as the example, two versions (v4.0 and v5.0) are covered below; pick either one, but note that the yolov5 branch and the tensorrtx branch must match.

1. Clone the projects

①v4.0


    
    
git clone -b v4.0 https://github.com/ultralytics/yolov5.git
git clone -b yolov5-v4.0 https://github.com/wang-xinyu/tensorrtx.git

②v5.0


    
    
git clone -b v5.0 https://github.com/ultralytics/yolov5.git
git clone -b yolov5-v5.0 https://github.com/wang-xinyu/tensorrtx.git

2. Generate the engine file

① Download yolov5s.pt into the weights folder of the yolov5 project; a download sketch follows below.
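A sketch of the download (an assumption on my part: the weights are fetched from the official ultralytics release matching the branch you cloned; /xxx is the same placeholder used later in this article):

cd /xxx/yolov5/weights
# use the release tag that matches your branch, e.g. v4.0 instead of v5.0
wget https://github.com/ultralytics/yolov5/releases/download/v5.0/yolov5s.pt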

② Copy the gen_wts.py file from the tensorrtx/yolov5 folder into the yolov5 project.

③ Generate the yolov5s.wts file.


    
    
conda activate yolov5env
cd /xxx/yolov5
# choose the line matching the version you cloned
# v4.0
python gen_wts.py
# v5.0
python gen_wts.py -w yolov5s.pt -o yolov5s.wts

④ Generate the engine file

Go into the tensorrtx/yolov5 folder.

mkdir build
    
    

Copy the yolov5s.wts file generated in the yolov5 project into the tensorrtx/yolov5/build folder, then open a terminal in build, for example:
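A sketch using the same /xxx placeholder (adjust to where the two repositories actually live):

cp /xxx/yolov5/yolov5s.wts /xxx/tensorrtx/yolov5/build/
cd /xxx/tensorrtx/yolov5/build

Then, inside build: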


    
    
cmake ..
make
# v4.0: sudo ./yolov5 -s [.wts] [.engine] [s/m/l/x]
# v5.0: sudo ./yolov5 -s [.wts] [.engine] [s/m/l/x/s6/m6/l6/x6 or c/c6 gd gw]
sudo ./yolov5 -s yolov5s.wts yolov5s.engine s

This produces the yolov5s.engine file.
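Optionally, the engine can be sanity-checked with TensorRT's bundled trtexec tool, which loads and times an engine independently of any Python code; the path below assumes a standard JetPack install:

/usr/src/tensorrt/bin/trtexec --loadEngine=yolov5s.engine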

IV. Running the Accelerated Detection

1. Accelerated image detection


    
    
# C++ demo on the images in tensorrtx/yolov5/samples
sudo ./yolov5 -d yolov5s.engine ../samples
# or run the Python demo inside the conda environment
conda activate yolov5env
python yolov5_trt.py

2. Accelerated real-time camera detection

Since I have never learned C++, I had to grit my teeth and modify the yolov5_trt.py script. The script's formatting is rough, but it does deliver the speed-up; feel free to use it as a reference.

Create a new file called yolo_trt_test.py in the tensorrtx project and copy the v4.0 or v5.0 code below into it. Mind the path to yolov5s.engine; change it to your own.

① v4.0 code


    
    
  1. """
  2. An example that uses TensorRT's Python api to make inferences.
  3. """
  4. import ctypes
  5. import os
  6. import random
  7. import sys
  8. import threading
  9. import time
  10. import cv2
  11. import numpy as np
  12. import pycuda.autoinit
  13. import pycuda.driver as cuda
  14. import tensorrt as trt
  15. import torch
  16. import torchvision
  17. INPUT_W = 608
  18. INPUT_H = 608
  19. CONF_THRESH = 0.15
  20. IOU_THRESHOLD = 0.45
  21. int_box=[ 0, 0, 0, 0]
  22. int_box1=[ 0, 0, 0, 0]
  23. fps1= 0.0
  24. def plot_one_box( x, img, color=None, label=None, line_thickness=None):
  25. """
  26. description: Plots one bounding box on image img,
  27. this function comes from YoLov5 project.
  28. param:
  29. x: a box likes [x1,y1,x2,y2]
  30. img: a opencv image object
  31. color: color to draw rectangle, such as (0,255,0)
  32. label: str
  33. line_thickness: int
  34. return:
  35. no return
  36. """
  37. tl = (
  38. line_thickness or round( 0.002 * (img.shape[ 0] + img.shape[ 1]) / 2) + 1
  39. ) # line/font thickness
  40. color = color or [random.randint( 0, 255) for _ in range( 3)]
  41. c1, c2 = ( int(x[ 0]), int(x[ 1])), ( int(x[ 2]), int(x[ 3]))
  42. C2 = c2
  43. cv2.rectangle(img, c1, c2, color, thickness=tl, lineType=cv2.LINE_AA)
  44. if label:
  45. tf = max(tl - 1, 1) # font thickness
  46. t_size = cv2.getTextSize(label, 0, fontScale=tl / 3, thickness=tf)[ 0]
  47. c2 = c1[ 0] + t_size[ 0], c1[ 1] + t_size[ 1] + 8
  48. cv2.rectangle(img, c1, c2, color, - 1, cv2.LINE_AA) # filled
  49. cv2.putText(
  50. img,
  51. label,
  52. (c1[ 0], c1[ 1]+t_size[ 1] + 5),
  53. 0,
  54. tl / 3,
  55. [ 255, 255, 255],
  56. thickness=tf,
  57. lineType=cv2.LINE_AA,
  58. )
  59. class YoLov5TRT( object):
  60. """
  61. description: A YOLOv5 class that warps TensorRT ops, preprocess and postprocess ops.
  62. """
  63. def __init__( self, engine_file_path):
  64. # Create a Context on this device,
  65. self.cfx = cuda.Device( 0).make_context()
  66. stream = cuda.Stream()
  67. TRT_LOGGER = trt.Logger(trt.Logger.INFO)
  68. runtime = trt.Runtime(TRT_LOGGER)
  69. # Deserialize the engine from file
  70. with open(engine_file_path, "rb") as f:
  71. engine = runtime.deserialize_cuda_engine(f.read())
  72. context = engine.create_execution_context()
  73. host_inputs = []
  74. cuda_inputs = []
  75. host_outputs = []
  76. cuda_outputs = []
  77. bindings = []
  78. for binding in engine:
  79. size = trt.volume(engine.get_binding_shape(binding)) * engine.max_batch_size
  80. dtype = trt.nptype(engine.get_binding_dtype(binding))
  81. # Allocate host and device buffers
  82. host_mem = cuda.pagelocked_empty(size, dtype)
  83. cuda_mem = cuda.mem_alloc(host_mem.nbytes)
  84. # Append the device buffer to device bindings.
  85. bindings.append( int(cuda_mem))
  86. # Append to the appropriate list.
  87. if engine.binding_is_input(binding):
  88. host_inputs.append(host_mem)
  89. cuda_inputs.append(cuda_mem)
  90. else:
  91. host_outputs.append(host_mem)
  92. cuda_outputs.append(cuda_mem)
  93. # Store
  94. self.stream = stream
  95. self.context = context
  96. self.engine = engine
  97. self.host_inputs = host_inputs
  98. self.cuda_inputs = cuda_inputs
  99. self.host_outputs = host_outputs
  100. self.cuda_outputs = cuda_outputs
  101. self.bindings = bindings
  102. def infer( self, input_image_path):
  103. global int_box,int_box1,fps1
  104. # threading.Thread.__init__(self)
  105. # Make self the active context, pushing it on top of the context stack.
  106. self.cfx.push()
  107. # Restore
  108. stream = self.stream
  109. context = self.context
  110. engine = self.engine
  111. host_inputs = self.host_inputs
  112. cuda_inputs = self.cuda_inputs
  113. host_outputs = self.host_outputs
  114. cuda_outputs = self.cuda_outputs
  115. bindings = self.bindings
  116. # Do image preprocess
  117. input_image, image_raw, origin_h, origin_w = self.preprocess_image(
  118. input_image_path
  119. )
  120. # Copy input image to host buffer
  121. np.copyto(host_inputs[ 0], input_image.ravel())
  122. # Transfer input data to the GPU.
  123. cuda.memcpy_htod_async(cuda_inputs[ 0], host_inputs[ 0], stream)
  124. # Run inference.
  125. context.execute_async(bindings=bindings, stream_handle=stream.handle)
  126. # Transfer predictions back from the GPU.
  127. cuda.memcpy_dtoh_async(host_outputs[ 0], cuda_outputs[ 0], stream)
  128. # Synchronize the stream
  129. stream.synchronize()
  130. # Remove any context from the top of the context stack, deactivating it.
  131. self.cfx.pop()
  132. # Here we use the first row of output in that batch_size = 1
  133. output = host_outputs[ 0]
  134. # Do postprocess
  135. result_boxes, result_scores, result_classid = self.post_process(
  136. output, origin_h, origin_w
  137. )
  138. # Draw rectangles and labels on the original image
  139. for i in range( len(result_boxes)):
  140. box1 = result_boxes[i]
  141. plot_one_box(
  142. box1,
  143. image_raw,
  144. label= "{}:{:.2f}". format(
  145. categories[ int(result_classid[i])], result_scores[i]
  146. ),
  147. )
  148. return image_raw
  149. # parent, filename = os.path.split(input_image_path)
  150. # save_name = os.path.join(parent, "output_" + filename)
  151. # #  Save image
  152. # cv2.imwrite(save_name, image_raw)
  153. def destroy( self):
  154. # Remove any context from the top of the context stack, deactivating it.
  155. self.cfx.pop()
  156. def preprocess_image( self, input_image_path):
  157. """
  158. description: Read an image from image path, convert it to RGB,
  159. resize and pad it to target size, normalize to [0,1],
  160. transform to NCHW format.
  161. param:
  162. input_image_path: str, image path
  163. return:
  164. image: the processed image
  165. image_raw: the original image
  166. h: original height
  167. w: original width
  168. """
  169. image_raw = input_image_path
  170. h, w, c = image_raw.shape
  171. image = cv2.cvtColor(image_raw, cv2.COLOR_BGR2RGB)
  172. # Calculate widht and height and paddings
  173. r_w = INPUT_W / w
  174. r_h = INPUT_H / h
  175. if r_h > r_w:
  176. tw = INPUT_W
  177. th = int(r_w * h)
  178. tx1 = tx2 = 0
  179. ty1 = int((INPUT_H - th) / 2)
  180. ty2 = INPUT_H - th - ty1
  181. else:
  182. tw = int(r_h * w)
  183. th = INPUT_H
  184. tx1 = int((INPUT_W - tw) / 2)
  185. tx2 = INPUT_W - tw - tx1
  186. ty1 = ty2 = 0
  187. # Resize the image with long side while maintaining ratio
  188. image = cv2.resize(image, (tw, th))
  189. # Pad the short side with (128,128,128)
  190. image = cv2.copyMakeBorder(
  191. image, ty1, ty2, tx1, tx2, cv2.BORDER_CONSTANT, ( 128, 128, 128)
  192. )
  193. image = image.astype(np.float32)
  194. # Normalize to [0,1]
  195. image /= 255.0
  196. # HWC to CHW format:
  197. image = np.transpose(image, [ 2, 0, 1])
  198. # CHW to NCHW format
  199. image = np.expand_dims(image, axis= 0)
  200. # Convert the image to row-major order, also known as "C order":
  201. image = np.ascontiguousarray(image)
  202. return image, image_raw, h, w
  203. def xywh2xyxy( self, origin_h, origin_w, x):
  204. """
  205. description: Convert nx4 boxes from [x, y, w, h] to [x1, y1, x2, y2] where xy1=top-left, xy2=bottom-right
  206. param:
  207. origin_h: height of original image
  208. origin_w: width of original image
  209. x: A boxes tensor, each row is a box [center_x, center_y, w, h]
  210. return:
  211. y: A boxes tensor, each row is a box [x1, y1, x2, y2]
  212. """
  213. y = torch.zeros_like(x) if isinstance(x, torch.Tensor) else np.zeros_like(x)
  214. r_w = INPUT_W / origin_w
  215. r_h = INPUT_H / origin_h
  216. if r_h > r_w:
  217. y[:, 0] = x[:, 0] - x[:, 2] / 2
  218. y[:, 2] = x[:, 0] + x[:, 2] / 2
  219. y[:, 1] = x[:, 1] - x[:, 3] / 2 - (INPUT_H - r_w * origin_h) / 2
  220. y[:, 3] = x[:, 1] + x[:, 3] / 2 - (INPUT_H - r_w * origin_h) / 2
  221. y /= r_w
  222. else:
  223. y[:, 0] = x[:, 0] - x[:, 2] / 2 - (INPUT_W - r_h * origin_w) / 2
  224. y[:, 2] = x[:, 0] + x[:, 2] / 2 - (INPUT_W - r_h * origin_w) / 2
  225. y[:, 1] = x[:, 1] - x[:, 3] / 2
  226. y[:, 3] = x[:, 1] + x[:, 3] / 2
  227. y /= r_h
  228. return y
  229. def post_process( self, output, origin_h, origin_w):
  230. """
  231. description: postprocess the prediction
  232. param:
  233. output: A tensor likes [num_boxes,cx,cy,w,h,conf,cls_id, cx,cy,w,h,conf,cls_id, ...]
  234. origin_h: height of original image
  235. origin_w: width of original image
  236. return:
  237. result_boxes: finally boxes, a boxes tensor, each row is a box [x1, y1, x2, y2]
  238. result_scores: finally scores, a tensor, each element is the score correspoing to box
  239. result_classid: finally classid, a tensor, each element is the classid correspoing to box
  240. """
  241. # Get the num of boxes detected
  242. num = int(output[ 0])
  243. # Reshape to a two dimentional ndarray
  244. pred = np.reshape(output[ 1:], (- 1, 6))[:num, :]
  245. # to a torch Tensor
  246. pred = torch.Tensor(pred).cuda()
  247. # Get the boxes
  248. boxes = pred[:, : 4]
  249. # Get the scores
  250. scores = pred[:, 4]
  251. # Get the classid
  252. classid = pred[:, 5]
  253. # Choose those boxes that score > CONF_THRESH
  254. si = scores > CONF_THRESH
  255. boxes = boxes[si, :]
  256. scores = scores[si]
  257. classid = classid[si]
  258. # Trandform bbox from [center_x, center_y, w, h] to [x1, y1, x2, y2]
  259. boxes = self.xywh2xyxy(origin_h, origin_w, boxes)
  260. # Do nms
  261. indices = torchvision.ops.nms(boxes, scores, iou_threshold=IOU_THRESHOLD).cpu()
  262. result_boxes = boxes[indices, :].cpu()
  263. result_scores = scores[indices].cpu()
  264. result_classid = classid[indices].cpu()
  265. return result_boxes, result_scores, result_classid
  266. class myThread(threading.Thread):
  267. def __init__( self, func, args):
  268. threading.Thread.__init__(self)
  269. self.func = func
  270. self.args = args
  271. def run( self):
  272. self.func(*self.args)
  273. if __name__ == "__main__":
  274. # load custom plugins
  275. PLUGIN_LIBRARY = "build/libmyplugins.so"
  276. ctypes.CDLL(PLUGIN_LIBRARY)
  277. engine_file_path = "yolov5s.engine"
  278. # load coco labels
  279. categories = [ "person", "bicycle", "car", "motorcycle", "airplane", "bus", "train", "truck", "boat", "traffic light",
  280. "fire hydrant", "stop sign", "parking meter", "bench", "bird", "cat", "dog", "horse", "sheep", "cow",
  281. "elephant", "bear", "zebra", "giraffe", "backpack", "umbrella", "handbag", "tie", "suitcase", "frisbee",
  282. "skis", "snowboard", "sports ball", "kite", "baseball bat", "baseball glove", "skateboard", "surfboard",
  283. "tennis racket", "bottle", "wine glass", "cup", "fork", "knife", "spoon", "bowl", "banana", "apple",
  284. "sandwich", "orange", "broccoli", "carrot", "hot dog", "pizza", "donut", "cake", "chair", "couch",
  285. "potted plant", "bed", "dining table", "toilet", "tv", "laptop", "mouse", "remote", "keyboard", "cell phone",
  286. "microwave", "oven", "toaster", "sink", "refrigerator", "book", "clock", "vase", "scissors", "teddy bear",
  287. "hair drier", "toothbrush"]
  288. # a YoLov5TRT instance
  289. yolov5_wrapper = YoLov5TRT(engine_file_path)
  290. cap = cv2.VideoCapture( 0)
  291. while 1:
  292. _,image =cap.read()
  293. img=yolov5_wrapper.infer(image)
  294. cv2.imshow( "result", img)
  295. if cv2.waitKey( 1) & 0XFF == ord( 'q'): # 1 millisecond
  296. break
  297. cap.release()
  298. cv2.destroyAllWindows()
  299. yolov5_wrapper.destroy()

② v5.0 code


    
    
  1. """
  2. An example that uses TensorRT's Python api to make inferences.
  3. """
  4. import ctypes
  5. import os
  6. import shutil
  7. import random
  8. import sys
  9. import threading
  10. import time
  11. import cv2
  12. import numpy as np
  13. import pycuda.autoinit
  14. import pycuda.driver as cuda
  15. import tensorrt as trt
  16. import torch
  17. import torchvision
  18. import argparse
  19. CONF_THRESH = 0.5
  20. IOU_THRESHOLD = 0.4
  21. def get_img_path_batches( batch_size, img_dir):
  22. ret = []
  23. batch = []
  24. for root, dirs, files in os.walk(img_dir):
  25. for name in files:
  26. if len(batch) == batch_size:
  27. ret.append(batch)
  28. batch = []
  29. batch.append(os.path.join(root, name))
  30. if len(batch) > 0:
  31. ret.append(batch)
  32. return ret
  33. def plot_one_box( x, img, color=None, label=None, line_thickness=None):
  34. """
  35. description: Plots one bounding box on image img,
  36. this function comes from YoLov5 project.
  37. param:
  38. x: a box likes [x1,y1,x2,y2]
  39. img: a opencv image object
  40. color: color to draw rectangle, such as (0,255,0)
  41. label: str
  42. line_thickness: int
  43. return:
  44. no return
  45. """
  46. tl = (
  47. line_thickness or round( 0.002 * (img.shape[ 0] + img.shape[ 1]) / 2) + 1
  48. ) # line/font thickness
  49. color = color or [random.randint( 0, 255) for _ in range( 3)]
  50. c1, c2 = ( int(x[ 0]), int(x[ 1])), ( int(x[ 2]), int(x[ 3]))
  51. cv2.rectangle(img, c1, c2, color, thickness=tl, lineType=cv2.LINE_AA)
  52. if label:
  53. tf = max(tl - 1, 1) # font thickness
  54. t_size = cv2.getTextSize(label, 0, fontScale=tl / 3, thickness=tf)[ 0]
  55. c2 = c1[ 0] + t_size[ 0], c1[ 1] - t_size[ 1] - 3
  56. cv2.rectangle(img, c1, c2, color, - 1, cv2.LINE_AA) # filled
  57. cv2.putText(
  58. img,
  59. label,
  60. (c1[ 0], c1[ 1] - 2),
  61. 0,
  62. tl / 3,
  63. [ 225, 255, 255],
  64. thickness=tf,
  65. lineType=cv2.LINE_AA,
  66. )
  67. class YoLov5TRT( object):
  68. """
  69. description: A YOLOv5 class that warps TensorRT ops, preprocess and postprocess ops.
  70. """
  71. def __init__( self, engine_file_path):
  72. # Create a Context on this device,
  73. self.ctx = cuda.Device( 0).make_context()
  74. stream = cuda.Stream()
  75. TRT_LOGGER = trt.Logger(trt.Logger.INFO)
  76. runtime = trt.Runtime(TRT_LOGGER)
  77. # Deserialize the engine from file
  78. with open(engine_file_path, "rb") as f:
  79. engine = runtime.deserialize_cuda_engine(f.read())
  80. context = engine.create_execution_context()
  81. host_inputs = []
  82. cuda_inputs = []
  83. host_outputs = []
  84. cuda_outputs = []
  85. bindings = []
  86. for binding in engine:
  87. print( 'bingding:', binding, engine.get_binding_shape(binding))
  88. size = trt.volume(engine.get_binding_shape(binding)) * engine.max_batch_size
  89. dtype = trt.nptype(engine.get_binding_dtype(binding))
  90. # Allocate host and device buffers
  91. host_mem = cuda.pagelocked_empty(size, dtype)
  92. cuda_mem = cuda.mem_alloc(host_mem.nbytes)
  93. # Append the device buffer to device bindings.
  94. bindings.append( int(cuda_mem))
  95. # Append to the appropriate list.
  96. if engine.binding_is_input(binding):
  97. self.input_w = engine.get_binding_shape(binding)[- 1]
  98. self.input_h = engine.get_binding_shape(binding)[- 2]
  99. host_inputs.append(host_mem)
  100. cuda_inputs.append(cuda_mem)
  101. else:
  102. host_outputs.append(host_mem)
  103. cuda_outputs.append(cuda_mem)
  104. # Store
  105. self.stream = stream
  106. self.context = context
  107. self.engine = engine
  108. self.host_inputs = host_inputs
  109. self.cuda_inputs = cuda_inputs
  110. self.host_outputs = host_outputs
  111. self.cuda_outputs = cuda_outputs
  112. self.bindings = bindings
  113. self.batch_size = engine.max_batch_size
  114. def infer( self, input_image_path):
  115. threading.Thread.__init__(self)
  116. # Make self the active context, pushing it on top of the context stack.
  117. self.ctx.push()
  118. self.input_image_path = input_image_path
  119. # Restore
  120. stream = self.stream
  121. context = self.context
  122. engine = self.engine
  123. host_inputs = self.host_inputs
  124. cuda_inputs = self.cuda_inputs
  125. host_outputs = self.host_outputs
  126. cuda_outputs = self.cuda_outputs
  127. bindings = self.bindings
  128. # Do image preprocess
  129. batch_image_raw = []
  130. batch_origin_h = []
  131. batch_origin_w = []
  132. batch_input_image = np.empty(shape=[self.batch_size, 3, self.input_h, self.input_w])
  133. input_image, image_raw, origin_h, origin_w = self.preprocess_image(input_image_path
  134. )
  135. batch_origin_h.append(origin_h)
  136. batch_origin_w.append(origin_w)
  137. np.copyto(batch_input_image, input_image)
  138. batch_input_image = np.ascontiguousarray(batch_input_image)
  139. # Copy input image to host buffer
  140. np.copyto(host_inputs[ 0], batch_input_image.ravel())
  141. start = time.time()
  142. # Transfer input data to the GPU.
  143. cuda.memcpy_htod_async(cuda_inputs[ 0], host_inputs[ 0], stream)
  144. # Run inference.
  145. context.execute_async(batch_size=self.batch_size, bindings=bindings, stream_handle=stream.handle)
  146. # Transfer predictions back from the GPU.
  147. cuda.memcpy_dtoh_async(host_outputs[ 0], cuda_outputs[ 0], stream)
  148. # Synchronize the stream
  149. stream.synchronize()
  150. end = time.time()
  151. # Remove any context from the top of the context stack, deactivating it.
  152. self.ctx.pop()
  153. # Here we use the first row of output in that batch_size = 1
  154. output = host_outputs[ 0]
  155. # Do postprocess
  156. result_boxes, result_scores, result_classid = self.post_process(
  157. output, origin_h, origin_w)
  158. # Draw rectangles and labels on the original image
  159. for j in range( len(result_boxes)):
  160. box = result_boxes[j]
  161. plot_one_box(
  162. box,
  163. image_raw,
  164. label= "{}:{:.2f}". format(
  165. categories[ int(result_classid[j])], result_scores[j]
  166. ),
  167. )
  168. return image_raw, end - start
  169. def destroy( self):
  170. # Remove any context from the top of the context stack, deactivating it.
  171. self.ctx.pop()
  172. def get_raw_image( self, image_path_batch):
  173. """
  174. description: Read an image from image path
  175. """
  176. for img_path in image_path_batch:
  177. yield cv2.imread(img_path)
  178. def get_raw_image_zeros( self, image_path_batch=None):
  179. """
  180. description: Ready data for warmup
  181. """
  182. for _ in range(self.batch_size):
  183. yield np.zeros([self.input_h, self.input_w, 3], dtype=np.uint8)
  184. def preprocess_image( self, input_image_path):
  185. """
  186. description: Convert BGR image to RGB,
  187. resize and pad it to target size, normalize to [0,1],
  188. transform to NCHW format.
  189. param:
  190. input_image_path: str, image path
  191. return:
  192. image: the processed image
  193. image_raw: the original image
  194. h: original height
  195. w: original width
  196. """
  197. image_raw = input_image_path
  198. h, w, c = image_raw.shape
  199. image = cv2.cvtColor(image_raw, cv2.COLOR_BGR2RGB)
  200. # Calculate widht and height and paddings
  201. r_w = self.input_w / w
  202. r_h = self.input_h / h
  203. if r_h > r_w:
  204. tw = self.input_w
  205. th = int(r_w * h)
  206. tx1 = tx2 = 0
  207. ty1 = int((self.input_h - th) / 2)
  208. ty2 = self.input_h - th - ty1
  209. else:
  210. tw = int(r_h * w)
  211. th = self.input_h
  212. tx1 = int((self.input_w - tw) / 2)
  213. tx2 = self.input_w - tw - tx1
  214. ty1 = ty2 = 0
  215. # Resize the image with long side while maintaining ratio
  216. image = cv2.resize(image, (tw, th))
  217. # Pad the short side with (128,128,128)
  218. image = cv2.copyMakeBorder(
  219. image, ty1, ty2, tx1, tx2, cv2.BORDER_CONSTANT, ( 128, 128, 128)
  220. )
  221. image = image.astype(np.float32)
  222. # Normalize to [0,1]
  223. image /= 255.0
  224. # HWC to CHW format:
  225. image = np.transpose(image, [ 2, 0, 1])
  226. # CHW to NCHW format
  227. image = np.expand_dims(image, axis= 0)
  228. # Convert the image to row-major order, also known as "C order":
  229. image = np.ascontiguousarray(image)
  230. return image, image_raw, h, w
  231. def xywh2xyxy( self, origin_h, origin_w, x):
  232. """
  233. description: Convert nx4 boxes from [x, y, w, h] to [x1, y1, x2, y2] where xy1=top-left, xy2=bottom-right
  234. param:
  235. origin_h: height of original image
  236. origin_w: width of original image
  237. x: A boxes tensor, each row is a box [center_x, center_y, w, h]
  238. return:
  239. y: A boxes tensor, each row is a box [x1, y1, x2, y2]
  240. """
  241. y = torch.zeros_like(x) if isinstance(x, torch.Tensor) else np.zeros_like(x)
  242. r_w = self.input_w / origin_w
  243. r_h = self.input_h / origin_h
  244. if r_h > r_w:
  245. y[:, 0] = x[:, 0] - x[:, 2] / 2
  246. y[:, 2] = x[:, 0] + x[:, 2] / 2
  247. y[:, 1] = x[:, 1] - x[:, 3] / 2 - (self.input_h - r_w * origin_h) / 2
  248. y[:, 3] = x[:, 1] + x[:, 3] / 2 - (self.input_h - r_w * origin_h) / 2
  249. y /= r_w
  250. else:
  251. y[:, 0] = x[:, 0] - x[:, 2] / 2 - (self.input_w - r_h * origin_w) / 2
  252. y[:, 2] = x[:, 0] + x[:, 2] / 2 - (self.input_w - r_h * origin_w) / 2
  253. y[:, 1] = x[:, 1] - x[:, 3] / 2
  254. y[:, 3] = x[:, 1] + x[:, 3] / 2
  255. y /= r_h
  256. return y
  257. def post_process( self, output, origin_h, origin_w):
  258. """
  259. description: postprocess the prediction
  260. param:
  261. output: A tensor likes [num_boxes,cx,cy,w,h,conf,cls_id, cx,cy,w,h,conf,cls_id, ...]
  262. origin_h: height of original image
  263. origin_w: width of original image
  264. return:
  265. result_boxes: finally boxes, a boxes tensor, each row is a box [x1, y1, x2, y2]
  266. result_scores: finally scores, a tensor, each element is the score correspoing to box
  267. result_classid: finally classid, a tensor, each element is the classid correspoing to box
  268. """
  269. # Get the num of boxes detected
  270. num = int(output[ 0])
  271. # Reshape to a two dimentional ndarray
  272. pred = np.reshape(output[ 1:], (- 1, 6))[:num, :]
  273. # to a torch Tensor
  274. pred = torch.Tensor(pred).cuda()
  275. # Get the boxes
  276. boxes = pred[:, : 4]
  277. # Get the scores
  278. scores = pred[:, 4]
  279. # Get the classid
  280. classid = pred[:, 5]
  281. # Choose those boxes that score > CONF_THRESH
  282. si = scores > CONF_THRESH
  283. boxes = boxes[si, :]
  284. scores = scores[si]
  285. classid = classid[si]
  286. # Trandform bbox from [center_x, center_y, w, h] to [x1, y1, x2, y2]
  287. boxes = self.xywh2xyxy(origin_h, origin_w, boxes)
  288. # Do nms
  289. indices = torchvision.ops.nms(boxes, scores, iou_threshold=IOU_THRESHOLD).cpu()
  290. result_boxes = boxes[indices, :].cpu()
  291. result_scores = scores[indices].cpu()
  292. result_classid = classid[indices].cpu()
  293. return result_boxes, result_scores, result_classid
  294. class inferThread(threading.Thread):
  295. def __init__( self, yolov5_wrapper):
  296. threading.Thread.__init__(self)
  297. self.yolov5_wrapper = yolov5_wrapper
  298. def infer( self , frame):
  299. batch_image_raw, use_time = self.yolov5_wrapper.infer(frame)
  300. # for i, img_path in enumerate(self.image_path_batch):
  301. # parent, filename = os.path.split(img_path)
  302. # save_name = os.path.join('output', filename)
  303. # # Save image
  304. # cv2.imwrite(save_name, batch_image_raw[i])
  305. # print('input->{}, time->{:.2f}ms, saving into output/'.format(self.image_path_batch, use_time * 1000))
  306. return batch_image_raw,use_time
  307. class warmUpThread(threading.Thread):
  308. def __init__( self, yolov5_wrapper):
  309. threading.Thread.__init__(self)
  310. self.yolov5_wrapper = yolov5_wrapper
  311. def run( self):
  312. batch_image_raw, use_time = self.yolov5_wrapper.infer(self.yolov5_wrapper.get_raw_image_zeros())
  313. print( 'warm_up->{}, time->{:.2f}ms'. format(batch_image_raw[ 0].shape, use_time * 1000))
  314. if __name__ == "__main__":
  315. # load custom plugins
  316. parser = argparse.ArgumentParser()
  317. parser.add_argument( '--engine', nargs= '+', type= str, default= "build/yolov5s.engine", help= '.engine path(s)')
  318. parser.add_argument( '--save', type= int, default= 0, help= 'save?')
  319. opt = parser.parse_args()
  320. PLUGIN_LIBRARY = "build/libmyplugins.so"
  321. engine_file_path = opt.engine
  322. ctypes.CDLL(PLUGIN_LIBRARY)
  323. # load coco labels
  324. categories = [ "person", "bicycle", "car", "motorcycle", "airplane", "bus", "train", "truck", "boat", "traffic light",
  325. "fire hydrant", "stop sign", "parking meter", "bench", "bird", "cat", "dog", "horse", "sheep", "cow",
  326. "elephant", "bear", "zebra", "giraffe", "backpack", "umbrella", "handbag", "tie", "suitcase", "frisbee",
  327. "skis", "snowboard", "sports ball", "kite", "baseball bat", "baseball glove", "skateboard", "surfboard",
  328. "tennis racket", "bottle", "wine glass", "cup", "fork", "knife", "spoon", "bowl", "banana", "apple",
  329. "sandwich", "orange", "broccoli", "carrot", "hot dog", "pizza", "donut", "cake", "chair", "couch",
  330. "potted plant", "bed", "dining table", "toilet", "tv", "laptop", "mouse", "remote", "keyboard", "cell phone",
  331. "microwave", "oven", "toaster", "sink", "refrigerator", "book", "clock", "vase", "scissors", "teddy bear",
  332. "hair drier", "toothbrush"]
  333. # a YoLov5TRT instance
  334. yolov5_wrapper = YoLov5TRT(engine_file_path)
  335. cap = cv2.VideoCapture( 0)
  336. try:
  337. thread1 = inferThread(yolov5_wrapper)
  338. thread1.start()
  339. thread1.join()
  340. while 1:
  341. _,frame = cap.read()
  342. img,t=thread1.infer(frame)
  343. cv2.imshow( "result", img)
  344. if cv2.waitKey( 1) & 0XFF == ord( 'q'): # 1 millisecond
  345. break
  346. finally:
  347. # destroy the instance
  348. cap.release()
  349. cv2.destroyAllWindows()
  350. yolov5_wrapper.destroy()
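Because the v5.0 infer() returns the measured inference time, an easy way to visualize the speed-up is to overlay an FPS figure on each frame. A minimal sketch of how the camera loop in __main__ could be extended (cap and thread1 are the objects already created there; font, position, and color are arbitrary choices of mine):

# inside the while loop, replacing the plain imshow sequence
_, frame = cap.read()
img, t = thread1.infer(frame)
fps = 1.0 / t if t > 0 else 0.0  # t is the measured inference time in seconds
cv2.putText(img, "{:.1f} FPS".format(fps), (20, 40),
            cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2)
cv2.imshow("result", img)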

Finally, run the yolo_trt_test.py script in the yolov5env environment.


    
    
conda activate yolov5env
python yolo_trt_test.py

3. Results

[Figures: live detection results, (a) without acceleration and (b) with acceleration]

V. Summary

TensorRT acceleration matters a great deal when deploying deep-learning models on mobile and embedded devices: it addresses the situation where low-compute embedded hardware either cannot run a deep-learning algorithm at all or runs it poorly. Personally, I think the v5.0 route is the best choice, since it supports acceleration for the newer YOLOv5 model variants. That concludes my TensorRT-accelerated yolov5 workflow; feel free to ask if you run into problems, and likes and follows are appreciated. Over one mountain lies another; see you at the next peak.

VI. References

https://github.com/wang-xinyu/tensorrtx
