YOLO，VOC数据集标注格式

这篇具有很好参考价值的文章主要介绍了YOLO，VOC数据集标注格式。希望对大家有所帮助。如果存在错误或未考虑完全的地方，请大家不吝赐教，您也可以点击"举报违法"按钮提交疑问。

YOLO，VOC数据集标注格式

一、YOLO数据集标注格式

YOLO数据集txt标注格式：

每个标签有五个数据，依次代表：

所标注内容的类别，数字与类别一一对应
归一化后中心点的x坐标
归一化后中心点的y坐标
归一化后目标框的宽度w
归一化后目标框的高度h

这里归一化是指除以原始图片的宽和高

二、VOC数据集标注格式

VOC数据集xml标注格式

<annotation>
	<folder>VOC</folder>
	<filename>bird_1.jpg</filename>    #图片名称以及图片格式
	<size>                             #图片的大小以及是否是rgb图片
		<width>1024</width>
		<height>688</height>
		<depth>3</depth>
	</size>
	<segmented>0</segmented>
	<object>                             #多个目标可以有多个object
		<name>bird</name>                #图片中一个目标的类别
		<pose>Unspecified</pose>         #物体的姿态
		<truncated>0</truncated>         #物体是否被部分遮挡（>15%）
		<difficult>0</difficult>         #是否为难以辨识的物体， 主要指要结体背景才	
		                                 #能判断出类别的物体。虽有标注， 但一般忽略这类物体
		<bndbox>                         #xmin和ymin是含目标框左上角的坐标
			<xmin>506</xmin>             #xmax和ymax是含目标框右下角的坐标
			<ymin>170</ymin>
			<xmax>509</xmax>
			<ymax>176</ymax>
		</bndbox>
	</object>
</annotation>

三、数据集格式转换

转换公式：

VOC->YOLO

	norm_x=(xmin + xmax)/2/width
	norm_y=(ymin + ymax)/2/height
	norm_w=(xmax - xmin)/width
	norm_h=(ymax - ymin)/height

YOLO->VOC

	xmin=width * (norm_x - 0.5 * norm_w)
	ymin=height * (norm_y - 0.5 * norm_h)

	xmax=width * (norm_x + 0.5 * norm_w)
	ymax=height * (norm_y + 0.5 * norm_h)

YOLO v5官方开源代码文章来源地址https://www.toymoban.com/news/detail-400586.html

import cv2
import numpy as np
import pandas as pd
import pkg_resources as pkg
import torch
import torchvision
import yaml


def clip_coords(boxes, shape):
    # Clip bounding xyxy bounding boxes to image shape (height, width)
    '''
    将boxes的坐标(x1y1x2y2左上角右下角)限定在图像的尺寸(img_shapehw)内，防止出界。
    '''
    if isinstance(boxes, torch.Tensor):  # faster individually
        boxes[:, 0].clamp_(0, shape[1])  # x1
        boxes[:, 1].clamp_(0, shape[0])  # y1
        boxes[:, 2].clamp_(0, shape[1])  # x2
        boxes[:, 3].clamp_(0, shape[0])  # y2
    else:  # np.array (faster grouped)
        boxes[:, [0, 2]] = boxes[:, [0, 2]].clip(0, shape[1])  # x1, x2
        boxes[:, [1, 3]] = boxes[:, [1, 3]].clip(0, shape[0])  # y1, y2


def xyxy2xywh(x):
    # Convert nx4 boxes from [x1, y1, x2, y2] to [x, y, w, h] where xy1=top-left, xy2=bottom-right
    y = x.clone() if isinstance(x, torch.Tensor) else np.copy(x)
    y[:, 0] = (x[:, 0] + x[:, 2]) / 2  # x center
    y[:, 1] = (x[:, 1] + x[:, 3]) / 2  # y center
    y[:, 2] = x[:, 2] - x[:, 0]  # width
    y[:, 3] = x[:, 3] - x[:, 1]  # height
    return y


def xywh2xyxy(x):
    # Convert nx4 boxes from [x, y, w, h] to [x1, y1, x2, y2] where xy1=top-left, xy2=bottom-right
    y = x.clone() if isinstance(x, torch.Tensor) else np.copy(x)
    y[:, 0] = x[:, 0] - x[:, 2] / 2  # top left x
    y[:, 1] = x[:, 1] - x[:, 3] / 2  # top left y
    y[:, 2] = x[:, 0] + x[:, 2] / 2  # bottom right x
    y[:, 3] = x[:, 1] + x[:, 3] / 2  # bottom right y
    return y


def xywhn2xyxy(x, w=640, h=640, padw=0, padh=0):
    # Convert nx4 boxes from [x, y, w, h] normalized to [x1, y1, x2, y2] where xy1=top-left, xy2=bottom-right
    y = x.clone() if isinstance(x, torch.Tensor) else np.copy(x)
    y[:, 0] = w * (x[:, 0] - x[:, 2] / 2) + padw  # top left x
    y[:, 1] = h * (x[:, 1] - x[:, 3] / 2) + padh  # top left y
    y[:, 2] = w * (x[:, 0] + x[:, 2] / 2) + padw  # bottom right x
    y[:, 3] = h * (x[:, 1] + x[:, 3] / 2) + padh  # bottom right y
    return y


def xyxy2xywhn(x, w=640, h=640, clip=False, eps=0.0):
    # Convert nx4 boxes from [x1, y1, x2, y2] to [x, y, w, h] normalized where xy1=top-left, xy2=bottom-right
    if clip:
        clip_coords(x, (h - eps, w - eps))  # warning: inplace clip
    y = x.clone() if isinstance(x, torch.Tensor) else np.copy(x)
    y[:, 0] = ((x[:, 0] + x[:, 2]) / 2) / w  # x center
    y[:, 1] = ((x[:, 1] + x[:, 3]) / 2) / h  # y center
    y[:, 2] = (x[:, 2] - x[:, 0]) / w  # width
    y[:, 3] = (x[:, 3] - x[:, 1]) / h  # height
    return y


def xyn2xy(x, w=640, h=640, padw=0, padh=0):
    # Convert normalized segments into pixel segments, shape (n,2)
    y = x.clone() if isinstance(x, torch.Tensor) else np.copy(x)
    y[:, 0] = w * x[:, 0] + padw  # top left x
    y[:, 1] = h * x[:, 1] + padh  # top left y
    return y