Real-Time Detection on the Jetson AGX Xavier with TensorRT-Accelerated YOLOv5


I. Preface

        YOLOv5 runs too slowly on live video on the Xavier, so TensorRT is needed to speed up its inference. Below is a record of my implementation process.

II. Environment Preparation

If you have not yet set up a Python environment for YOLOv5, follow step 1 below; otherwise, skip straight to step 2.

1. Following my earlier article "Setting up a yolov5 virtual environment on the Jetson AGX Xavier", create a Python environment for YOLOv5, and following "Installing the Archiconda environment manager on the Jetson AGX Xavier and using OpenCV inside a virtual environment", make OpenCV available in the environment. This article uses OpenCV 3.4.3.

2. Import the TensorRT libraries into the environment, the same way as OpenCV: copy the TensorRT-related folders from /usr/lib/python3.6/dist-packages/ into the site-packages folder of the environment you created, e.g. into /home/jetson/archiconda3/envs/yolov5env/lib/python3.6/site-packages/.


3. Install pycuda in the environment. If the pip install fails, there are plenty of workarounds online.


    
    
conda activate yolov5env
pip install pycuda
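
Before going further, a quick sanity check of my own (not part of the original steps) confirms that both libraries are visible inside yolov5env:

# Run inside yolov5env: verify the copied TensorRT bindings and pycuda
import tensorrt as trt
import pycuda.autoinit  # creates and activates a CUDA context on device 0
import pycuda.driver as cuda

print("TensorRT:", trt.__version__)
print("GPU:", cuda.Device(0).name())  # should report the Xavier's integrated GPU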

III. Acceleration Steps

         Taking the YOLOv5s model as the example, both the v4.0 and v5.0 releases are covered below; pick whichever one you use.

1. Clone the projects

①v4.0


    
    
git clone -b v4.0 https://github.com/ultralytics/yolov5.git
git clone -b yolov5-v4.0 https://github.com/wang-xinyu/tensorrtx.git

②v5.0


    
    
git clone -b v5.0 https://github.com/ultralytics/yolov5.git
git clone -b yolov5-v5.0 https://github.com/wang-xinyu/tensorrtx.git

2. Generate the engine file

① Download yolov5s.pt into the weights folder of the yolov5 project.

② Copy gen_wts.py from the tensorrtx/yolov5 folder into the yolov5 project.

③ Generate the yolov5s.wts file.


    
    
conda activate yolov5env
cd /xxx/yolov5
# pick the command for the version you cloned
# v4.0
python gen_wts.py
# v5.0
python gen_wts.py -w yolov5s.pt -o yolov5s.wts
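
For the curious: gen_wts.py does nothing TensorRT-specific. It simply dumps every tensor of the PyTorch checkpoint as hex-encoded floats into a plain-text .wts file, which the C++ builder then parses without needing PyTorch at all. Roughly, it boils down to this (a simplified sketch of the tensorrtx script, not a drop-in replacement):

import struct
import torch

# Load the checkpoint and take the fp32 weights
model = torch.load("yolov5s.pt", map_location="cpu")["model"].float()
state = model.state_dict()
with open("yolov5s.wts", "w") as f:
    f.write("{}\n".format(len(state)))            # number of tensors
    for name, tensor in state.items():
        flat = tensor.reshape(-1).cpu().numpy()
        f.write("{} {}".format(name, len(flat)))  # tensor name and element count
        for value in flat:
            # each weight as big-endian IEEE-754 hex
            f.write(" " + struct.pack(">f", float(value)).hex())
        f.write("\n")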

④ Generate the engine file

        Go into the tensorrtx/yolov5 folder.

mkdir build
    
    

        Copy the yolov5s.wts file generated in the yolov5 project into the tensorrtx/yolov5/build folder, then open a terminal in build:


    
    
cmake ..
make
# v4.0 usage: sudo ./yolov5 -s [.wts] [.engine] [s/m/l/x]
# v5.0 usage: sudo ./yolov5 -s [.wts] [.engine] [s/m/l/x/s6/m6/l6/x6 or c/c6 gd gw]
sudo ./yolov5 -s yolov5s.wts yolov5s.engine s

This serializes the model into a yolov5s.engine file.
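
Before wiring it into a detection script, it is worth checking that the engine deserializes cleanly from Python. A quick check of my own (it assumes you run it from tensorrtx/yolov5, so that the plugin library and the engine are under build/):

import ctypes
import tensorrt as trt

# The custom YoloLayer plugin must be loaded before deserializing the engine
ctypes.CDLL("build/libmyplugins.so")
logger = trt.Logger(trt.Logger.INFO)
with open("build/yolov5s.engine", "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
print("bindings:", [(b, tuple(engine.get_binding_shape(b))) for b in engine])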

IV. Running the Accelerated Detection

1. Accelerated detection on images


    
    
sudo ./yolov5 -d yolov5s.engine ../samples
# or, from Python:
conda activate yolov5env
python yolov5_trt.py

2. Accelerated real-time detection from a camera

        Since I have never learned C++, I had no choice but to modify the yolov5_trt.py script myself. The code style is rough, but it does deliver the speedup; take it as a reference if you need it.

        Create a new file named yolo_trt_test.py in the tensorrtx project and copy the v4.0 or v5.0 code below into it. Mind the path to yolov5s.engine and change it to your own.

  ① v4.0 code


    
    
"""
An example that uses TensorRT's Python API to make inferences.
"""
import ctypes
import os
import random
import sys
import threading
import time

import cv2
import numpy as np
import pycuda.autoinit
import pycuda.driver as cuda
import tensorrt as trt
import torch
import torchvision

INPUT_W = 608
INPUT_H = 608
CONF_THRESH = 0.15
IOU_THRESHOLD = 0.45
int_box = [0, 0, 0, 0]
int_box1 = [0, 0, 0, 0]
fps1 = 0.0


def plot_one_box(x, img, color=None, label=None, line_thickness=None):
    """
    description: Plots one bounding box on image img,
                 this function comes from the YOLOv5 project.
    param:
        x:      a box like [x1, y1, x2, y2]
        img:    an OpenCV image object
        color:  color to draw the rectangle, such as (0, 255, 0)
        label:  str
        line_thickness: int
    return:
        no return
    """
    tl = (
        line_thickness or round(0.002 * (img.shape[0] + img.shape[1]) / 2) + 1
    )  # line/font thickness
    color = color or [random.randint(0, 255) for _ in range(3)]
    c1, c2 = (int(x[0]), int(x[1])), (int(x[2]), int(x[3]))
    cv2.rectangle(img, c1, c2, color, thickness=tl, lineType=cv2.LINE_AA)
    if label:
        tf = max(tl - 1, 1)  # font thickness
        t_size = cv2.getTextSize(label, 0, fontScale=tl / 3, thickness=tf)[0]
        c2 = c1[0] + t_size[0], c1[1] + t_size[1] + 8
        cv2.rectangle(img, c1, c2, color, -1, cv2.LINE_AA)  # filled
        cv2.putText(
            img,
            label,
            (c1[0], c1[1] + t_size[1] + 5),
            0,
            tl / 3,
            [255, 255, 255],
            thickness=tf,
            lineType=cv2.LINE_AA,
        )


class YoLov5TRT(object):
    """
    description: A YOLOv5 class that wraps TensorRT ops, preprocess and postprocess ops.
    """

    def __init__(self, engine_file_path):
        # Create a Context on this device.
        self.cfx = cuda.Device(0).make_context()
        stream = cuda.Stream()
        TRT_LOGGER = trt.Logger(trt.Logger.INFO)
        runtime = trt.Runtime(TRT_LOGGER)

        # Deserialize the engine from file
        with open(engine_file_path, "rb") as f:
            engine = runtime.deserialize_cuda_engine(f.read())
        context = engine.create_execution_context()

        host_inputs = []
        cuda_inputs = []
        host_outputs = []
        cuda_outputs = []
        bindings = []

        for binding in engine:
            size = trt.volume(engine.get_binding_shape(binding)) * engine.max_batch_size
            dtype = trt.nptype(engine.get_binding_dtype(binding))
            # Allocate host and device buffers
            host_mem = cuda.pagelocked_empty(size, dtype)
            cuda_mem = cuda.mem_alloc(host_mem.nbytes)
            # Append the device buffer to device bindings.
            bindings.append(int(cuda_mem))
            # Append to the appropriate list.
            if engine.binding_is_input(binding):
                host_inputs.append(host_mem)
                cuda_inputs.append(cuda_mem)
            else:
                host_outputs.append(host_mem)
                cuda_outputs.append(cuda_mem)

        # Store
        self.stream = stream
        self.context = context
        self.engine = engine
        self.host_inputs = host_inputs
        self.cuda_inputs = cuda_inputs
        self.host_outputs = host_outputs
        self.cuda_outputs = cuda_outputs
        self.bindings = bindings

    def infer(self, input_image_path):
        global int_box, int_box1, fps1
        # Make self the active context, pushing it on top of the context stack.
        self.cfx.push()
        # Restore
        stream = self.stream
        context = self.context
        engine = self.engine
        host_inputs = self.host_inputs
        cuda_inputs = self.cuda_inputs
        host_outputs = self.host_outputs
        cuda_outputs = self.cuda_outputs
        bindings = self.bindings
        # Do image preprocess
        input_image, image_raw, origin_h, origin_w = self.preprocess_image(
            input_image_path
        )
        # Copy input image to host buffer
        np.copyto(host_inputs[0], input_image.ravel())
        # Transfer input data to the GPU.
        cuda.memcpy_htod_async(cuda_inputs[0], host_inputs[0], stream)
        # Run inference.
        context.execute_async(bindings=bindings, stream_handle=stream.handle)
        # Transfer predictions back from the GPU.
        cuda.memcpy_dtoh_async(host_outputs[0], cuda_outputs[0], stream)
        # Synchronize the stream
        stream.synchronize()
        # Remove any context from the top of the context stack, deactivating it.
        self.cfx.pop()
        # Here we use the first row of output in that batch_size = 1
        output = host_outputs[0]
        # Do postprocess
        result_boxes, result_scores, result_classid = self.post_process(
            output, origin_h, origin_w
        )
        # Draw rectangles and labels on the original image
        for i in range(len(result_boxes)):
            box1 = result_boxes[i]
            plot_one_box(
                box1,
                image_raw,
                label="{}:{:.2f}".format(
                    categories[int(result_classid[i])], result_scores[i]
                ),
            )
        return image_raw
        # parent, filename = os.path.split(input_image_path)
        # save_name = os.path.join(parent, "output_" + filename)
        # # Save image
        # cv2.imwrite(save_name, image_raw)

    def destroy(self):
        # Remove any context from the top of the context stack, deactivating it.
        self.cfx.pop()

    def preprocess_image(self, input_image_path):
        """
        description: Convert a BGR frame to RGB,
                     resize and pad it to target size, normalize to [0,1],
                     transform to NCHW format.
        param:
            input_image_path: the input image (an OpenCV BGR frame)
        return:
            image: the processed image
            image_raw: the original image
            h: original height
            w: original width
        """
        image_raw = input_image_path
        h, w, c = image_raw.shape
        image = cv2.cvtColor(image_raw, cv2.COLOR_BGR2RGB)
        # Calculate width and height and paddings
        r_w = INPUT_W / w
        r_h = INPUT_H / h
        if r_h > r_w:
            tw = INPUT_W
            th = int(r_w * h)
            tx1 = tx2 = 0
            ty1 = int((INPUT_H - th) / 2)
            ty2 = INPUT_H - th - ty1
        else:
            tw = int(r_h * w)
            th = INPUT_H
            tx1 = int((INPUT_W - tw) / 2)
            tx2 = INPUT_W - tw - tx1
            ty1 = ty2 = 0
        # Resize the long side while maintaining the aspect ratio
        image = cv2.resize(image, (tw, th))
        # Pad the short side with (128, 128, 128)
        image = cv2.copyMakeBorder(
            image, ty1, ty2, tx1, tx2, cv2.BORDER_CONSTANT, (128, 128, 128)
        )
        image = image.astype(np.float32)
        # Normalize to [0,1]
        image /= 255.0
        # HWC to CHW format:
        image = np.transpose(image, [2, 0, 1])
        # CHW to NCHW format
        image = np.expand_dims(image, axis=0)
        # Convert the image to row-major order, also known as "C order":
        image = np.ascontiguousarray(image)
        return image, image_raw, h, w

    def xywh2xyxy(self, origin_h, origin_w, x):
        """
        description: Convert nx4 boxes from [x, y, w, h] to [x1, y1, x2, y2]
                     where xy1 = top-left, xy2 = bottom-right
        param:
            origin_h: height of original image
            origin_w: width of original image
            x: a boxes tensor, each row is a box [center_x, center_y, w, h]
        return:
            y: a boxes tensor, each row is a box [x1, y1, x2, y2]
        """
        y = torch.zeros_like(x) if isinstance(x, torch.Tensor) else np.zeros_like(x)
        r_w = INPUT_W / origin_w
        r_h = INPUT_H / origin_h
        if r_h > r_w:
            y[:, 0] = x[:, 0] - x[:, 2] / 2
            y[:, 2] = x[:, 0] + x[:, 2] / 2
            y[:, 1] = x[:, 1] - x[:, 3] / 2 - (INPUT_H - r_w * origin_h) / 2
            y[:, 3] = x[:, 1] + x[:, 3] / 2 - (INPUT_H - r_w * origin_h) / 2
            y /= r_w
        else:
            y[:, 0] = x[:, 0] - x[:, 2] / 2 - (INPUT_W - r_h * origin_w) / 2
            y[:, 2] = x[:, 0] + x[:, 2] / 2 - (INPUT_W - r_h * origin_w) / 2
            y[:, 1] = x[:, 1] - x[:, 3] / 2
            y[:, 3] = x[:, 1] + x[:, 3] / 2
            y /= r_h
        return y

    def post_process(self, output, origin_h, origin_w):
        """
        description: postprocess the prediction
        param:
            output: a tensor like [num_boxes, cx, cy, w, h, conf, cls_id, cx, cy, w, h, conf, cls_id, ...]
            origin_h: height of original image
            origin_w: width of original image
        return:
            result_boxes: final boxes, a boxes tensor, each row is a box [x1, y1, x2, y2]
            result_scores: final scores, a tensor, each element is the score corresponding to a box
            result_classid: final classid, a tensor, each element is the classid corresponding to a box
        """
        # Get the number of boxes detected
        num = int(output[0])
        # Reshape to a two-dimensional ndarray
        pred = np.reshape(output[1:], (-1, 6))[:num, :]
        # To a torch Tensor
        pred = torch.Tensor(pred).cuda()
        # Get the boxes
        boxes = pred[:, :4]
        # Get the scores
        scores = pred[:, 4]
        # Get the classid
        classid = pred[:, 5]
        # Choose those boxes whose score > CONF_THRESH
        si = scores > CONF_THRESH
        boxes = boxes[si, :]
        scores = scores[si]
        classid = classid[si]
        # Transform bbox from [center_x, center_y, w, h] to [x1, y1, x2, y2]
        boxes = self.xywh2xyxy(origin_h, origin_w, boxes)
        # Do nms
        indices = torchvision.ops.nms(boxes, scores, iou_threshold=IOU_THRESHOLD).cpu()
        result_boxes = boxes[indices, :].cpu()
        result_scores = scores[indices].cpu()
        result_classid = classid[indices].cpu()
        return result_boxes, result_scores, result_classid


class myThread(threading.Thread):
    def __init__(self, func, args):
        threading.Thread.__init__(self)
        self.func = func
        self.args = args

    def run(self):
        self.func(*self.args)


if __name__ == "__main__":
    # Load custom plugins
    PLUGIN_LIBRARY = "build/libmyplugins.so"
    ctypes.CDLL(PLUGIN_LIBRARY)
    engine_file_path = "yolov5s.engine"
    # Load COCO labels
    categories = ["person", "bicycle", "car", "motorcycle", "airplane", "bus", "train", "truck", "boat", "traffic light",
                  "fire hydrant", "stop sign", "parking meter", "bench", "bird", "cat", "dog", "horse", "sheep", "cow",
                  "elephant", "bear", "zebra", "giraffe", "backpack", "umbrella", "handbag", "tie", "suitcase", "frisbee",
                  "skis", "snowboard", "sports ball", "kite", "baseball bat", "baseball glove", "skateboard", "surfboard",
                  "tennis racket", "bottle", "wine glass", "cup", "fork", "knife", "spoon", "bowl", "banana", "apple",
                  "sandwich", "orange", "broccoli", "carrot", "hot dog", "pizza", "donut", "cake", "chair", "couch",
                  "potted plant", "bed", "dining table", "toilet", "tv", "laptop", "mouse", "remote", "keyboard", "cell phone",
                  "microwave", "oven", "toaster", "sink", "refrigerator", "book", "clock", "vase", "scissors", "teddy bear",
                  "hair drier", "toothbrush"]
    # A YoLov5TRT instance
    yolov5_wrapper = YoLov5TRT(engine_file_path)
    cap = cv2.VideoCapture(0)
    while True:
        _, image = cap.read()
        img = yolov5_wrapper.infer(image)
        cv2.imshow("result", img)
        if cv2.waitKey(1) & 0xFF == ord("q"):  # 1 millisecond
            break
    cap.release()
    cv2.destroyAllWindows()
    yolov5_wrapper.destroy()
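As an aside, the v4.0 script declares an fps1 global that it never actually uses. If you want an on-screen frame-rate counter, the main loop can be extended along these lines (my own variation, not part of the original script):

import time

fps1 = 0.0
prev_t = time.time()
while True:
    ret, image = cap.read()
    if not ret:
        break
    img = yolov5_wrapper.infer(image)
    # exponential moving average of the instantaneous frame rate
    now = time.time()
    fps1 = 0.9 * fps1 + 0.1 / max(now - prev_t, 1e-6)
    prev_t = now
    cv2.putText(img, "FPS: {:.1f}".format(fps1), (10, 30),
                cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2)
    cv2.imshow("result", img)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break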

  ② v5.0 code


    
    
"""
An example that uses TensorRT's Python API to make inferences.
"""
import argparse
import ctypes
import os
import random
import shutil
import sys
import threading
import time

import cv2
import numpy as np
import pycuda.autoinit
import pycuda.driver as cuda
import tensorrt as trt
import torch
import torchvision

CONF_THRESH = 0.5
IOU_THRESHOLD = 0.4


def get_img_path_batches(batch_size, img_dir):
    ret = []
    batch = []
    for root, dirs, files in os.walk(img_dir):
        for name in files:
            if len(batch) == batch_size:
                ret.append(batch)
                batch = []
            batch.append(os.path.join(root, name))
    if len(batch) > 0:
        ret.append(batch)
    return ret


def plot_one_box(x, img, color=None, label=None, line_thickness=None):
    """
    description: Plots one bounding box on image img,
                 this function comes from the YOLOv5 project.
    param:
        x:      a box like [x1, y1, x2, y2]
        img:    an OpenCV image object
        color:  color to draw the rectangle, such as (0, 255, 0)
        label:  str
        line_thickness: int
    return:
        no return
    """
    tl = (
        line_thickness or round(0.002 * (img.shape[0] + img.shape[1]) / 2) + 1
    )  # line/font thickness
    color = color or [random.randint(0, 255) for _ in range(3)]
    c1, c2 = (int(x[0]), int(x[1])), (int(x[2]), int(x[3]))
    cv2.rectangle(img, c1, c2, color, thickness=tl, lineType=cv2.LINE_AA)
    if label:
        tf = max(tl - 1, 1)  # font thickness
        t_size = cv2.getTextSize(label, 0, fontScale=tl / 3, thickness=tf)[0]
        c2 = c1[0] + t_size[0], c1[1] - t_size[1] - 3
        cv2.rectangle(img, c1, c2, color, -1, cv2.LINE_AA)  # filled
        cv2.putText(
            img,
            label,
            (c1[0], c1[1] - 2),
            0,
            tl / 3,
            [225, 255, 255],
            thickness=tf,
            lineType=cv2.LINE_AA,
        )


class YoLov5TRT(object):
    """
    description: A YOLOv5 class that wraps TensorRT ops, preprocess and postprocess ops.
    """

    def __init__(self, engine_file_path):
        # Create a Context on this device.
        self.ctx = cuda.Device(0).make_context()
        stream = cuda.Stream()
        TRT_LOGGER = trt.Logger(trt.Logger.INFO)
        runtime = trt.Runtime(TRT_LOGGER)

        # Deserialize the engine from file
        with open(engine_file_path, "rb") as f:
            engine = runtime.deserialize_cuda_engine(f.read())
        context = engine.create_execution_context()

        host_inputs = []
        cuda_inputs = []
        host_outputs = []
        cuda_outputs = []
        bindings = []

        for binding in engine:
            print("binding:", binding, engine.get_binding_shape(binding))
            size = trt.volume(engine.get_binding_shape(binding)) * engine.max_batch_size
            dtype = trt.nptype(engine.get_binding_dtype(binding))
            # Allocate host and device buffers
            host_mem = cuda.pagelocked_empty(size, dtype)
            cuda_mem = cuda.mem_alloc(host_mem.nbytes)
            # Append the device buffer to device bindings.
            bindings.append(int(cuda_mem))
            # Append to the appropriate list.
            if engine.binding_is_input(binding):
                self.input_w = engine.get_binding_shape(binding)[-1]
                self.input_h = engine.get_binding_shape(binding)[-2]
                host_inputs.append(host_mem)
                cuda_inputs.append(cuda_mem)
            else:
                host_outputs.append(host_mem)
                cuda_outputs.append(cuda_mem)

        # Store
        self.stream = stream
        self.context = context
        self.engine = engine
        self.host_inputs = host_inputs
        self.cuda_inputs = cuda_inputs
        self.host_outputs = host_outputs
        self.cuda_outputs = cuda_outputs
        self.bindings = bindings
        self.batch_size = engine.max_batch_size

    def infer(self, input_image_path):
        # Make self the active context, pushing it on top of the context stack.
        self.ctx.push()
        self.input_image_path = input_image_path
        # Restore
        stream = self.stream
        context = self.context
        engine = self.engine
        host_inputs = self.host_inputs
        cuda_inputs = self.cuda_inputs
        host_outputs = self.host_outputs
        cuda_outputs = self.cuda_outputs
        bindings = self.bindings
        # Do image preprocess
        batch_image_raw = []
        batch_origin_h = []
        batch_origin_w = []
        batch_input_image = np.empty(shape=[self.batch_size, 3, self.input_h, self.input_w])
        input_image, image_raw, origin_h, origin_w = self.preprocess_image(
            input_image_path
        )
        batch_origin_h.append(origin_h)
        batch_origin_w.append(origin_w)
        np.copyto(batch_input_image, input_image)
        batch_input_image = np.ascontiguousarray(batch_input_image)
        # Copy input image to host buffer
        np.copyto(host_inputs[0], batch_input_image.ravel())
        start = time.time()
        # Transfer input data to the GPU.
        cuda.memcpy_htod_async(cuda_inputs[0], host_inputs[0], stream)
        # Run inference.
        context.execute_async(batch_size=self.batch_size, bindings=bindings, stream_handle=stream.handle)
        # Transfer predictions back from the GPU.
        cuda.memcpy_dtoh_async(host_outputs[0], cuda_outputs[0], stream)
        # Synchronize the stream
        stream.synchronize()
        end = time.time()
        # Remove any context from the top of the context stack, deactivating it.
        self.ctx.pop()
        # Here we use the first row of output in that batch_size = 1
        output = host_outputs[0]
        # Do postprocess
        result_boxes, result_scores, result_classid = self.post_process(
            output, origin_h, origin_w
        )
        # Draw rectangles and labels on the original image
        for j in range(len(result_boxes)):
            box = result_boxes[j]
            plot_one_box(
                box,
                image_raw,
                label="{}:{:.2f}".format(
                    categories[int(result_classid[j])], result_scores[j]
                ),
            )
        return image_raw, end - start

    def destroy(self):
        # Remove any context from the top of the context stack, deactivating it.
        self.ctx.pop()

    def get_raw_image(self, image_path_batch):
        """
        description: Read an image from image path
        """
        for img_path in image_path_batch:
            yield cv2.imread(img_path)

    def get_raw_image_zeros(self, image_path_batch=None):
        """
        description: Ready data for warmup
        """
        for _ in range(self.batch_size):
            yield np.zeros([self.input_h, self.input_w, 3], dtype=np.uint8)

    def preprocess_image(self, input_image_path):
        """
        description: Convert BGR image to RGB,
                     resize and pad it to target size, normalize to [0,1],
                     transform to NCHW format.
        param:
            input_image_path: the input image (an OpenCV BGR frame)
        return:
            image: the processed image
            image_raw: the original image
            h: original height
            w: original width
        """
        image_raw = input_image_path
        h, w, c = image_raw.shape
        image = cv2.cvtColor(image_raw, cv2.COLOR_BGR2RGB)
        # Calculate width and height and paddings
        r_w = self.input_w / w
        r_h = self.input_h / h
        if r_h > r_w:
            tw = self.input_w
            th = int(r_w * h)
            tx1 = tx2 = 0
            ty1 = int((self.input_h - th) / 2)
            ty2 = self.input_h - th - ty1
        else:
            tw = int(r_h * w)
            th = self.input_h
            tx1 = int((self.input_w - tw) / 2)
            tx2 = self.input_w - tw - tx1
            ty1 = ty2 = 0
        # Resize the long side while maintaining the aspect ratio
        image = cv2.resize(image, (tw, th))
        # Pad the short side with (128, 128, 128)
        image = cv2.copyMakeBorder(
            image, ty1, ty2, tx1, tx2, cv2.BORDER_CONSTANT, (128, 128, 128)
        )
        image = image.astype(np.float32)
        # Normalize to [0,1]
        image /= 255.0
        # HWC to CHW format:
        image = np.transpose(image, [2, 0, 1])
        # CHW to NCHW format
        image = np.expand_dims(image, axis=0)
        # Convert the image to row-major order, also known as "C order":
        image = np.ascontiguousarray(image)
        return image, image_raw, h, w

    def xywh2xyxy(self, origin_h, origin_w, x):
        """
        description: Convert nx4 boxes from [x, y, w, h] to [x1, y1, x2, y2]
                     where xy1 = top-left, xy2 = bottom-right
        param:
            origin_h: height of original image
            origin_w: width of original image
            x: a boxes tensor, each row is a box [center_x, center_y, w, h]
        return:
            y: a boxes tensor, each row is a box [x1, y1, x2, y2]
        """
        y = torch.zeros_like(x) if isinstance(x, torch.Tensor) else np.zeros_like(x)
        r_w = self.input_w / origin_w
        r_h = self.input_h / origin_h
        if r_h > r_w:
            y[:, 0] = x[:, 0] - x[:, 2] / 2
            y[:, 2] = x[:, 0] + x[:, 2] / 2
            y[:, 1] = x[:, 1] - x[:, 3] / 2 - (self.input_h - r_w * origin_h) / 2
            y[:, 3] = x[:, 1] + x[:, 3] / 2 - (self.input_h - r_w * origin_h) / 2
            y /= r_w
        else:
            y[:, 0] = x[:, 0] - x[:, 2] / 2 - (self.input_w - r_h * origin_w) / 2
            y[:, 2] = x[:, 0] + x[:, 2] / 2 - (self.input_w - r_h * origin_w) / 2
            y[:, 1] = x[:, 1] - x[:, 3] / 2
            y[:, 3] = x[:, 1] + x[:, 3] / 2
            y /= r_h
        return y

    def post_process(self, output, origin_h, origin_w):
        """
        description: postprocess the prediction
        param:
            output: a tensor like [num_boxes, cx, cy, w, h, conf, cls_id, cx, cy, w, h, conf, cls_id, ...]
            origin_h: height of original image
            origin_w: width of original image
        return:
            result_boxes: final boxes, a boxes tensor, each row is a box [x1, y1, x2, y2]
            result_scores: final scores, a tensor, each element is the score corresponding to a box
            result_classid: final classid, a tensor, each element is the classid corresponding to a box
        """
        # Get the number of boxes detected
        num = int(output[0])
        # Reshape to a two-dimensional ndarray
        pred = np.reshape(output[1:], (-1, 6))[:num, :]
        # To a torch Tensor
        pred = torch.Tensor(pred).cuda()
        # Get the boxes
        boxes = pred[:, :4]
        # Get the scores
        scores = pred[:, 4]
        # Get the classid
        classid = pred[:, 5]
        # Choose those boxes whose score > CONF_THRESH
        si = scores > CONF_THRESH
        boxes = boxes[si, :]
        scores = scores[si]
        classid = classid[si]
        # Transform bbox from [center_x, center_y, w, h] to [x1, y1, x2, y2]
        boxes = self.xywh2xyxy(origin_h, origin_w, boxes)
        # Do nms
        indices = torchvision.ops.nms(boxes, scores, iou_threshold=IOU_THRESHOLD).cpu()
        result_boxes = boxes[indices, :].cpu()
        result_scores = scores[indices].cpu()
        result_classid = classid[indices].cpu()
        return result_boxes, result_scores, result_classid


class inferThread(threading.Thread):
    def __init__(self, yolov5_wrapper):
        threading.Thread.__init__(self)
        self.yolov5_wrapper = yolov5_wrapper

    def infer(self, frame):
        batch_image_raw, use_time = self.yolov5_wrapper.infer(frame)
        # for i, img_path in enumerate(self.image_path_batch):
        #     parent, filename = os.path.split(img_path)
        #     save_name = os.path.join('output', filename)
        #     # Save image
        #     cv2.imwrite(save_name, batch_image_raw[i])
        # print('input->{}, time->{:.2f}ms, saving into output/'.format(self.image_path_batch, use_time * 1000))
        return batch_image_raw, use_time


class warmUpThread(threading.Thread):
    # (unused in the camera demo below; kept from the original script)
    def __init__(self, yolov5_wrapper):
        threading.Thread.__init__(self)
        self.yolov5_wrapper = yolov5_wrapper

    def run(self):
        batch_image_raw, use_time = self.yolov5_wrapper.infer(self.yolov5_wrapper.get_raw_image_zeros())
        print('warm_up->{}, time->{:.2f}ms'.format(batch_image_raw[0].shape, use_time * 1000))


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument('--engine', type=str, default="build/yolov5s.engine", help='.engine path')
    parser.add_argument('--save', type=int, default=0, help='save?')
    opt = parser.parse_args()
    # Load the custom YoloLayer plugin before deserializing the engine
    PLUGIN_LIBRARY = "build/libmyplugins.so"
    engine_file_path = opt.engine
    ctypes.CDLL(PLUGIN_LIBRARY)
    # Load COCO labels
    categories = ["person", "bicycle", "car", "motorcycle", "airplane", "bus", "train", "truck", "boat", "traffic light",
                  "fire hydrant", "stop sign", "parking meter", "bench", "bird", "cat", "dog", "horse", "sheep", "cow",
                  "elephant", "bear", "zebra", "giraffe", "backpack", "umbrella", "handbag", "tie", "suitcase", "frisbee",
                  "skis", "snowboard", "sports ball", "kite", "baseball bat", "baseball glove", "skateboard", "surfboard",
                  "tennis racket", "bottle", "wine glass", "cup", "fork", "knife", "spoon", "bowl", "banana", "apple",
                  "sandwich", "orange", "broccoli", "carrot", "hot dog", "pizza", "donut", "cake", "chair", "couch",
                  "potted plant", "bed", "dining table", "toilet", "tv", "laptop", "mouse", "remote", "keyboard", "cell phone",
                  "microwave", "oven", "toaster", "sink", "refrigerator", "book", "clock", "vase", "scissors", "teddy bear",
                  "hair drier", "toothbrush"]
    # A YoLov5TRT instance
    yolov5_wrapper = YoLov5TRT(engine_file_path)
    cap = cv2.VideoCapture(0)
    try:
        thread1 = inferThread(yolov5_wrapper)
        # start()/join() are effectively no-ops here (inferThread does not
        # override run()); infer() is called directly in the loop below
        thread1.start()
        thread1.join()
        while True:
            _, frame = cap.read()
            img, t = thread1.infer(frame)
            cv2.imshow("result", img)
            if cv2.waitKey(1) & 0xFF == ord("q"):  # 1 millisecond
                break
    finally:
        # Destroy the instance
        cap.release()
        cv2.destroyAllWindows()
        yolov5_wrapper.destroy()

          Finally, run the yolo_trt_test.py script inside the yolov5env environment.


    
    
conda activate yolov5env
python yolo_trt_test.py
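
One caveat: cv2.VideoCapture(0) in the scripts above assumes a USB webcam. If you are using the Xavier's CSI camera, you would normally open it through a GStreamer pipeline instead. This is a commonly used pattern on Jetson (the caps below are an assumption; adjust resolution and frame rate to your sensor, and note it requires an OpenCV build with GStreamer support):

import cv2

def gstreamer_pipeline(width=1280, height=720, fps=30):
    # nvarguscamerasrc is the Jetson camera source element for CSI sensors
    return (
        "nvarguscamerasrc ! "
        "video/x-raw(memory:NVMM), width={}, height={}, framerate={}/1 ! "
        "nvvidconv ! video/x-raw, format=BGRx ! "
        "videoconvert ! video/x-raw, format=BGR ! appsink"
    ).format(width, height, fps)

cap = cv2.VideoCapture(gstreamer_pipeline(), cv2.CAP_GSTREAMER)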

3. Results

    (Screenshots: (a) without TensorRT acceleration; (b) with TensorRT acceleration)
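
To put a number on the difference rather than eyeballing the two windows, you can average the per-frame latency over a stretch of frames. A quick benchmark sketch of my own, reusing the cap and yolov5_wrapper objects from the v4.0 script:

import time

N = 100
start = time.time()
for _ in range(N):
    ret, frame = cap.read()
    if not ret:
        break
    yolov5_wrapper.infer(frame)  # detection only; drawing is done inside infer()
elapsed = (time.time() - start) / N
print("avg {:.1f} ms/frame, {:.1f} FPS".format(elapsed * 1000, 1.0 / elapsed))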

V. Summary

       TensorRT acceleration matters a great deal when deploying deep-learning models on mobile and embedded platforms: it rescues low-compute devices that otherwise could not run these algorithms at all, or ran them poorly. Personally I would go with the v5.0 route, since it also supports accelerating the newer YOLOv5 model variants. That wraps up my process of accelerating YOLOv5 with TensorRT; feel free to ask if you run into problems, and likes and follows are appreciated. Beyond every mountain lies another; see you at the next peak.

VI. References

https://github.com/wang-xinyu/tensorrtx
