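1 Introduction to MobileNetV3
1.1 MobileNet Overview
MobileNetV3 is a lightweight network model proposed by a Google team in 2019. Conventional convolutional neural networks need so much memory and computation that they cannot run on mobile or embedded devices, and the MobileNet family was created to solve this problem. MobileNetV3 performs well on mobile image classification, object detection, and semantic segmentation. It adopts several newer techniques, including the Squeeze-and-Excitation (SE) module for channel attention and neural architecture search (NAS), which further improve the network's performance.
MobileNetV3 paper: Howard et al., "Searching for MobileNetV3" (2019).
The overall architecture of MobileNetV3 largely follows the design of MobileNetV2. It keeps lightweight depthwise separable convolutions and residual blocks and is still built from a stack of modules, but each module has been optimized and upgraded, including the bottleneck structure, the SE module, and the nonlinearity (NL, the activation function). On ImageNet classification, MobileNetV3-Large is 3.2% more accurate than MobileNetV2 while reducing latency by 20%.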
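To make the saving from depthwise separable convolution concrete, the sketch below counts the weights of a standard convolution and of its depthwise separable counterpart. The function names and the example layer (3×3 kernel, 128 input channels, 256 output channels) are illustrative choices rather than values from the article, and biases and BatchNorm parameters are ignored.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a k x k standard convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 3 x 3 kernel, 128 input channels, 256 output channels.
standard = standard_conv_params(3, 128, 256)
separable = depthwise_separable_params(3, 128, 256)
print(standard, separable, round(standard / separable, 1))  # 294912 33920 8.7
```

The ratio of separable to standard weights is 1/c_out + 1/k², so for 3×3 kernels and wide layers the saving approaches a factor of 9.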
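Overall, MobileNetV3 has two main innovations:
- Complementary search techniques: resource-constrained (platform-aware) NAS performs the block-level search, and NetAdapt performs the local, layer-level search.
- Network structure improvements: the final average-pooling layer is moved earlier, the last convolutional layer is removed, and the h-swish activation function is introduced.
MobileNetV3 comes in two versions: MobileNetV3-Small targets low compute and storage budgets, and MobileNetV3-Large targets higher budgets.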
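The h-swish activation mentioned above replaces the sigmoid in swish with a piecewise-linear hard sigmoid, ReLU6(x + 3) / 6. The following pure-Python sketch compares the two on scalar inputs; the function names are chosen here for illustration and are independent of the PyTorch classes used later in the article.

```python
import math


def relu6(x: float) -> float:
    """Clamp x to the range [0, 6]."""
    return min(max(x, 0.0), 6.0)


def h_sigmoid(x: float) -> float:
    """Piecewise-linear approximation of the sigmoid."""
    return relu6(x + 3.0) / 6.0


def h_swish(x: float) -> float:
    """Hard swish: x * ReLU6(x + 3) / 6."""
    return x * h_sigmoid(x)


def swish(x: float) -> float:
    """Original swish: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(f"x={x:+.1f}  swish={swish(x):+.4f}  h_swish={h_swish(x):+.4f}")
```

For x ≤ -3 the hard version is exactly 0, and for x ≥ 3 it equals x, so it needs only comparisons, an addition, and a multiplication instead of an exponential.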
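1.2 Network Structure of MobileNetV3
The network structures below are taken from the paper. Note that the first convolutional layer has 16 filters and uses the HS activation function. In the tables, exp_size is the channel count after the expansion step in the first part of a bneck, SE indicates whether the block uses an SE module, NL is the type of activation function (HS for hard-swish, RE for ReLU), and s is the stride.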
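One way to read a row of these tables is as a small configuration record. The sketch below is only an illustration: the `BneckRow` class and its field names are hypothetical, and the three sample rows are illustrative values that should be checked against the tables in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BneckRow:
    kernel: int     # kernel size of the depthwise convolution
    exp_size: int   # channels after the 1x1 expansion
    out: int        # output channels of the block
    se: bool        # whether the block uses an SE module
    nl: str         # "RE" for ReLU, "HS" for hard-swish
    stride: int     # s in the table


# Illustrative rows only; verify against the paper before relying on them.
rows = [
    BneckRow(kernel=3, exp_size=16, out=16, se=False, nl="RE", stride=1),
    BneckRow(kernel=3, exp_size=64, out=24, se=False, nl="RE", stride=2),
    BneckRow(kernel=5, exp_size=72, out=40, se=True, nl="RE", stride=2),
]

in_channels = 16  # the first convolution outputs 16 channels
for row in rows:
    print(f"{in_channels:>3} -> expand to {row.exp_size:>3} -> {row.out:>3}, "
          f"k={row.kernel}, SE={row.se}, NL={row.nl}, s={row.stride}")
    in_channels = row.out  # the next block consumes this block's output
```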
1) Network structure of MobileNetV3-Large:
2) Network structure of MobileNetV3-Small:
The bneck structure specific to MobileNetV3:
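As described in the paper, a bneck is an inverted residual block: a 1×1 expansion convolution, a depthwise convolution, an optional SE gate, and a 1×1 linear projection, with a residual connection when the stride is 1 and the input and output channel counts match. The class below is a minimal PyTorch sketch under those assumptions. The class name `Bneck`, its argument names, and the SE reduction ratio of 4 are choices made here, not code from the article.

```python
import torch
import torch.nn as nn


class Bneck(nn.Module):
    """Sketch of an inverted-residual bneck: expand -> depthwise -> (SE) -> project."""

    def __init__(self, in_ch, exp_ch, out_ch, kernel, stride, use_se, use_hs):
        super().__init__()
        act = nn.Hardswish if use_hs else nn.ReLU
        # The residual connection is only valid when shapes are unchanged.
        self.use_res = stride == 1 and in_ch == out_ch
        self.expand = nn.Sequential(
            nn.Conv2d(in_ch, exp_ch, 1, bias=False), nn.BatchNorm2d(exp_ch), act())
        self.depthwise = nn.Sequential(
            nn.Conv2d(exp_ch, exp_ch, kernel, stride, kernel // 2,
                      groups=exp_ch, bias=False),  # groups=exp_ch makes it depthwise
            nn.BatchNorm2d(exp_ch), act())
        # SE gate: global pool -> reduce -> ReLU -> expand -> hard sigmoid
        self.se = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(exp_ch, exp_ch // 4, 1), nn.ReLU(),
            nn.Conv2d(exp_ch // 4, exp_ch, 1), nn.Hardsigmoid()) if use_se else None
        self.project = nn.Sequential(
            nn.Conv2d(exp_ch, out_ch, 1, bias=False), nn.BatchNorm2d(out_ch))

    def forward(self, x):
        out = self.depthwise(self.expand(x))
        if self.se is not None:
            out = out * self.se(out)   # rescale each channel by its gate
        out = self.project(out)        # linear bottleneck: no activation here
        return x + out if self.use_res else out


block = Bneck(in_ch=16, exp_ch=64, out_ch=24, kernel=3, stride=2,
              use_se=False, use_hs=False)
y = block(torch.randn(2, 16, 56, 56))
print(y.shape)  # torch.Size([2, 24, 28, 28])
```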
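1.3 Comparison of MobileNet Models
2 Implementing MobileNetV3 on the CIFAR10 Dataset
2.1 The CIFAR-10 Dataset
CIFAR-10 is a computer vision dataset for general object recognition, collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton. It contains 60,000 RGB color images of size 32 × 32 in 10 classes, of which 50,000 form the training set and 10,000 form the test set.
The 10 classes are airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck.
CIFAR-10 is a color image dataset that is closer to general real-world objects. Compared with the MNIST dataset, CIFAR-10 differs in the following ways:
- CIFAR-10 images are 3-channel RGB color images, while MNIST images are grayscale.
- CIFAR-10 images are 32 × 32, slightly larger than the 28 × 28 MNIST images.
Unlike handwritten characters, CIFAR-10 contains real-world objects. The images are noisy, and the scale and features of the objects vary widely, which makes recognition much harder. A plain linear model such as softmax regression performs poorly on CIFAR-10.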
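The size difference between the two datasets is easy to quantify: each image contributes height × width × channels input values. The variable names below are chosen for illustration.

```python
# Input values per image = height * width * channels.
cifar10_values = 32 * 32 * 3   # RGB color image
mnist_values = 28 * 28 * 1     # grayscale image

print(cifar10_values)                           # 3072
print(mnist_values)                             # 784
print(round(cifar10_values / mnist_values, 2))  # 3.92
```

So a CIFAR-10 image carries almost four times as many input values as an MNIST digit, in addition to the harder content.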
2.2 PyTorch Implementation
import datetime

import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from torchvision.transforms import ToTensor
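class hswish(nn.Module):
    """Hard-swish activation: x * ReLU6(x + 3) / 6."""

    def __init__(self, inplace=True):
        super(hswish, self).__init__()
        self.inplace = inplace

    def forward(self, x):
        # x + 3. creates a new tensor, so the in-place ReLU6 never modifies x itself
        f = nn.functional.relu6(x + 3., inplace=self.inplace) / 6.
        return x * f


class hsigmoid(nn.Module):
    """Hard-sigmoid activation: ReLU6(x + 3) / 6 (defined here but not used below)."""

    def __init__(self, inplace=True):
        super(hsigmoid, self).__init__()
        self.inplace = inplace

    def forward(self, x):
        f = nn.functional.relu6(x + 3., inplace=self.inplace) / 6.
        return f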
class SeModule(nn.Module):
    """SE variant that returns the rescaled input (not used by the models below)."""

    def __init__(self, in_channels, se_ratio=0.25):
        super(SeModule, self).__init__()
        self.se_reduce = nn.Conv2d(in_channels, int(in_channels * se_ratio), kernel_size=1, stride=1, padding=0)
        self.se_expand = nn.Conv2d(int(in_channels * se_ratio), in_channels, kernel_size=1, stride=1, padding=0)

    def forward(self, x):
        # Squeeze: global average pooling to one value per channel
        s = nn.functional.adaptive_avg_pool2d(x, 1)
        # Excite: reduce, ReLU, expand, then gate the input with a sigmoid
        s = self.se_expand(nn.functional.relu(self.se_reduce(s), inplace=True))
        return x * s.sigmoid()
class ConvBlock(nn.Module):
    """Convolution -> BatchNorm -> hard-swish."""

    def __init__(self, in_channels, out_channels, kernel_size, stride, padding, groups=1):
        super(ConvBlock, self).__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding, groups=groups, bias=False)
        self.bn = nn.BatchNorm2d(out_channels)
        self.act = hswish()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))
class SqueezeExcitation(nn.Module):
    """Computes the per-channel SE gate; the caller multiplies it with the features."""

    def __init__(self, in_channel, out_channel, reduction=4):
        super(SqueezeExcitation, self).__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc1 = nn.Conv2d(in_channel, out_channel // reduction, kernel_size=1, stride=1)
        self.relu = nn.ReLU(inplace=True)
        self.fc2 = nn.Conv2d(out_channel // reduction, out_channel, kernel_size=1, stride=1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        out = self.pool(x)       # squeeze to shape (N, C, 1, 1)
        out = self.fc1(out)
        out = self.relu(out)
        out = self.fc2(out)
        out = self.sigmoid(out)  # gate values in (0, 1)
        return out
class ResidualBlock(nn.Module):
    """Simplified block used here in place of the paper's bneck: two standard
    convolutions (each with BatchNorm and hard-swish), an optional SE gate,
    and a shortcut connection."""

    def __init__(self, in_channels, out_channels, kernel_size, stride, use_se=True):
        super(ResidualBlock, self).__init__()
        self.conv1 = ConvBlock(in_channels, out_channels, kernel_size, stride, kernel_size // 2)
        self.conv2 = ConvBlock(out_channels, out_channels, kernel_size, 1, kernel_size // 2)
        self.use_se = use_se
        if use_se:
            self.se = SqueezeExcitation(out_channels, out_channels)
        # Identity shortcut unless the spatial size or channel count changes
        self.shortcut = nn.Sequential()
        if stride != 1 or in_channels != out_channels:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(out_channels)
            )

    def forward(self, x):
        out = self.conv1(x)
        out = self.conv2(out)
        if self.use_se:
            out = out * self.se(out)
        out += self.shortcut(x)
        out = nn.functional.relu(out, inplace=True)
        return out
class MobileNetV3Large(nn.Module):
    def __init__(self, num_classes=1000):
        super(MobileNetV3Large, self).__init__()
        self.conv1 = ConvBlock(3, 16, 3, 2, 1)  # 1/2
        # ResidualBlock(in_channels, out_channels, kernel_size, stride, use_se)
        self.bottlenecks = nn.Sequential(
            ResidualBlock(16, 16, 3, 1, False),
            ResidualBlock(16, 24, 3, 2, False),  # 1/4
            ResidualBlock(24, 24, 3, 1, False),
            ResidualBlock(24, 40, 5, 2, True),  # 1/8
            ResidualBlock(40, 40, 5, 1, True),
            ResidualBlock(40, 40, 5, 1, True),
            ResidualBlock(40, 80, 3, 2, False),  # 1/16
            ResidualBlock(80, 80, 3, 1, False),
            ResidualBlock(80, 80, 3, 1, False),
            ResidualBlock(80, 112, 5, 1, True),
            ResidualBlock(112, 112, 5, 1, True),
            ResidualBlock(112, 160, 5, 2, True),  # 1/32
            ResidualBlock(160, 160, 5, 1, True),
            ResidualBlock(160, 160, 5, 1, True)
        )
        self.conv2 = ConvBlock(160, 960, 1, 1, 0)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(960, 1280),
            nn.BatchNorm1d(1280),
            nn.Hardswish(inplace=True),
            nn.Linear(1280, num_classes),
        )

    def forward(self, x):
        out = self.conv1(x)
        out = self.bottlenecks(out)
        out = self.conv2(out)
        out = self.pool(out)
        out = out.reshape(out.size(0), -1)  # flatten (N, 960, 1, 1) to (N, 960)
        out = self.fc(out)
        return out
class MobileNetV3Small(nn.Module):
    def __init__(self, num_classes=1000):
        super(MobileNetV3Small, self).__init__()
        self.conv1 = ConvBlock(3, 16, 3, 2, 1)  # 1/2
        self.bottlenecks = nn.Sequential(
            ResidualBlock(16, 16, 3, 2, False),  # 1/4
            ResidualBlock(16, 72, 3, 2, False),  # 1/8
            ResidualBlock(72, 72, 3, 1, False),
            ResidualBlock(72, 72, 3, 1, True),
            ResidualBlock(72, 96, 3, 2, True),  # 1/16
            ResidualBlock(96, 96, 3, 1, True),
            ResidualBlock(96, 96, 3, 1, True),
            ResidualBlock(96, 240, 5, 2, True),  # 1/32
            ResidualBlock(240, 240, 5, 1, True),
            ResidualBlock(240, 240, 5, 1, True),
            ResidualBlock(240, 480, 5, 1, True),
            ResidualBlock(480, 480, 5, 1, True),
            ResidualBlock(480, 480, 5, 1, True),
        )
        self.conv2 = ConvBlock(480, 576, 1, 1, 0, groups=2)
        self.conv3 = nn.Conv2d(576, 1024, kernel_size=1, stride=1, padding=0, bias=False)
        self.bn = nn.BatchNorm2d(1024)
        self.act = hswish()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(1024, num_classes)

    def forward(self, x):
        out = self.conv1(x)
        out = self.bottlenecks(out)
        out = self.conv2(out)
        out = self.conv3(out)
        out = self.bn(out)
        out = self.act(out)
        out = self.pool(out)
        out = out.reshape(out.size(0), -1)
        out = self.fc(out)
        return out
transform = transforms.Compose([
    ToTensor(),
    transforms.Normalize(
        mean=[0.5, 0.5, 0.5],
        std=[0.5, 0.5, 0.5]
    ),
    # Upscale the 32x32 CIFAR-10 images to a 224x224 network input
    transforms.Resize((224, 224))
])

train_data = datasets.CIFAR10(
    root="data",
    train=True,
    download=True,
    transform=transform,
)

test_data = datasets.CIFAR10(
    root="data",
    train=False,
    download=True,
    transform=transform,
)


def get_format_time():
    """Return the current time as a string for log lines."""
    return datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
if __name__ == '__main__':
    batch_size = 64
    # drop_last=True discards the final partial batch (49,984 train and 9,984 test images are used)
    train_loader = DataLoader(dataset=train_data, batch_size=batch_size, shuffle=True, drop_last=True)
    test_loader = DataLoader(dataset=test_data, batch_size=batch_size, shuffle=True, drop_last=True)
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = MobileNetV3Large(num_classes=10).to(device)
    print(model)
    cross = nn.CrossEntropyLoss().to(device)
    optimizer = torch.optim.Adam(model.parameters(), 0.001)
    epochs = 20  # the log in section 2.3 comes from a 20-epoch run
    accuracy_rate = []
    for epoch in range(epochs):
        print(f'{get_format_time()}, train epoch: {epoch}/{epochs}')
        model.train()  # BatchNorm uses batch statistics while training
        train_correct = 0
        train_loss_sum = 0.0
        for step, (images, labels) in enumerate(train_loader, 0):
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)
            train_loss = cross(outputs, labels)
            # Clear old gradients before backward(); calling zero_grad() between
            # backward() and step() would erase the gradients just computed.
            optimizer.zero_grad()
            train_loss.backward()
            optimizer.step()
            predicted = torch.argmax(outputs, 1)
            train_correct += torch.sum(predicted == labels).item()
            train_loss_sum += train_loss.item()
        train_accuracy = train_correct / len(train_data)
        # loss here is the loss of the last training batch in the epoch
        print(f"{get_format_time()}, loss:{train_loss.item()}, accuracy:{train_accuracy}")
        model.eval()  # BatchNorm switches to its running statistics for evaluation
        test_total = 0
        test_correct = 0
        test_loss = 0
        with torch.no_grad():
            for images, labels in test_loader:
                images, labels = images.to(device), labels.to(device)
                outputs = model(images)
                loss = cross(outputs, labels)
                _, predicted = torch.max(outputs, 1)
                test_total += labels.size(0)
                test_correct += torch.sum(predicted == labels).item()
                test_loss += loss.item()
        accuracy = 100 * test_correct / test_total
        accuracy_rate.append(accuracy)
        # Both losses are sums of per-batch mean losses divided by the dataset size
        print("{}, Train Loss is:{:.4f}, Train Accuracy is:{:.4f}%, Test Loss is:{:.4f} Test Accuracy is:{:.4f}%".format(
            get_format_time(),
            train_loss_sum / len(train_data),
            100 * train_correct / len(train_data),
            test_loss / len(test_data),
            100 * test_correct / len(test_data)
        ))
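The `# 1/2` to `# 1/32` comments in MobileNetV3Large mark its five stride-2 layers, which also suggests why the 32×32 CIFAR-10 images are resized to 224×224. The sketch below applies the standard convolution output-size formula, floor((n + 2p - k) / s) + 1, to those five layers. The helper names are hypothetical; the kernel sizes (3, 3, 5, 3, 5) are read from the stride-2 entries of MobileNetV3Large above, and the stride-1 layers are skipped because their padding of kernel // 2 keeps the size unchanged.

```python
def conv_out_size(n: int, kernel: int, stride: int, padding: int) -> int:
    """Spatial size after a convolution: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - kernel) // stride + 1


def final_feature_size(n: int, stride2_kernels=(3, 3, 5, 3, 5)) -> int:
    """Apply the five stride-2 layers in order, each with padding = kernel // 2."""
    for k in stride2_kernels:
        n = conv_out_size(n, kernel=k, stride=2, padding=k // 2)
    return n


print(final_feature_size(224))  # 7
print(final_feature_size(32))   # 1
```

With the native 32×32 input the last stage would already be down to a single pixel, so its 5×5 kernels would see almost only padding. A 224×224 input leaves a 7×7 map before global average pooling, at the cost of much more computation per image.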
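2.3 Results
The log below comes from a 20-epoch run. Its Train Loss column reads 0.0000 because that run divided the loss of the last training batch by the 50,000 training samples. The accuracy_rate values are slightly higher than the printed Test Accuracy because drop_last=True leaves 9,984 test images in the accuracy_rate denominator, while the printed value divides by 10,000.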
2023-12-22 16:21:26, train epoch: 0/20
2023-12-22 16:21:53, loss:1.077600359916687, accuracy:0.4221999943256378
2023-12-22 16:21:58, Train Loss is:0.0000, Train Accuracy is:42.2200%, Test Loss is::0.0183 Test Accuracy is:57.5800%
2023-12-22 16:21:58, train epoch: 1/20
2023-12-22 16:22:22, loss:0.8761021494865417, accuracy:0.6502999663352966
2023-12-22 16:22:27, Train Loss is:0.0000, Train Accuracy is:65.0300%, Test Loss is::0.0133 Test Accuracy is:69.8400%
2023-12-22 16:22:27, train epoch: 2/20
2023-12-22 16:23:03, loss:0.8734554648399353, accuracy:0.7473799586296082
2023-12-22 16:23:10, Train Loss is:0.0000, Train Accuracy is:74.7380%, Test Loss is::0.0106 Test Accuracy is:76.7300%
2023-12-22 16:23:10, train epoch: 3/20
2023-12-22 16:23:56, loss:0.6179424524307251, accuracy:0.8049399852752686
2023-12-22 16:24:04, Train Loss is:0.0000, Train Accuracy is:80.4940%, Test Loss is::0.0094 Test Accuracy is:79.2400%
2023-12-22 16:24:04, train epoch: 4/20
2023-12-22 16:24:50, loss:0.4694249629974365, accuracy:0.8365799784660339
2023-12-22 16:24:58, Train Loss is:0.0000, Train Accuracy is:83.6580%, Test Loss is::0.0085 Test Accuracy is:81.5500%
2023-12-22 16:24:58, train epoch: 5/20
2023-12-22 16:25:44, loss:0.31567543745040894, accuracy:0.8552799820899963
2023-12-22 16:25:52, Train Loss is:0.0000, Train Accuracy is:85.5280%, Test Loss is::0.0086 Test Accuracy is:81.5900%
2023-12-22 16:25:52, train epoch: 6/20
2023-12-22 16:26:38, loss:0.16210773587226868, accuracy:0.88646000623703
2023-12-22 16:26:45, Train Loss is:0.0000, Train Accuracy is:88.6460%, Test Loss is::0.0077 Test Accuracy is:83.7600%
2023-12-22 16:26:45, train epoch: 7/20
2023-12-22 16:27:31, loss:0.3127828538417816, accuracy:0.9122599959373474
2023-12-22 16:27:39, Train Loss is:0.0000, Train Accuracy is:91.2260%, Test Loss is::0.0079 Test Accuracy is:83.8300%
2023-12-22 16:27:39, train epoch: 8/20
2023-12-22 16:28:25, loss:0.26726457476615906, accuracy:0.9293599724769592
2023-12-22 16:28:33, Train Loss is:0.0000, Train Accuracy is:92.9360%, Test Loss is::0.0085 Test Accuracy is:83.9400%
2023-12-22 16:28:33, train epoch: 9/20
2023-12-22 16:29:19, loss:0.2306433916091919, accuracy:0.9438599944114685
2023-12-22 16:29:26, Train Loss is:0.0000, Train Accuracy is:94.3860%, Test Loss is::0.0093 Test Accuracy is:83.2100%
2023-12-22 16:29:26, train epoch: 10/20
2023-12-22 16:30:12, loss:0.13097506761550903, accuracy:0.9565799832344055
2023-12-22 16:30:20, Train Loss is:0.0000, Train Accuracy is:95.6580%, Test Loss is::0.0107 Test Accuracy is:82.6900%
2023-12-22 16:30:20, train epoch: 11/20
2023-12-22 16:31:06, loss:0.06876415014266968, accuracy:0.962939977645874
2023-12-22 16:31:14, Train Loss is:0.0000, Train Accuracy is:96.2940%, Test Loss is::0.0103 Test Accuracy is:83.4400%
2023-12-22 16:31:14, train epoch: 12/20
2023-12-22 16:32:00, loss:0.12005764991044998, accuracy:0.9681999683380127
2023-12-22 16:32:07, Train Loss is:0.0000, Train Accuracy is:96.8200%, Test Loss is::0.0099 Test Accuracy is:84.0200%
2023-12-22 16:32:07, train epoch: 13/20
2023-12-22 16:32:53, loss:0.13355214893817902, accuracy:0.9756399989128113
2023-12-22 16:33:00, Train Loss is:0.0000, Train Accuracy is:97.5640%, Test Loss is::0.0106 Test Accuracy is:84.5400%
2023-12-22 16:33:00, train epoch: 14/20
2023-12-22 16:33:46, loss:0.025063637644052505, accuracy:0.9772999882698059
2023-12-22 16:33:54, Train Loss is:0.0000, Train Accuracy is:97.7300%, Test Loss is::0.0104 Test Accuracy is:85.0300%
2023-12-22 16:33:54, train epoch: 15/20
2023-12-22 16:34:40, loss:0.09421717375516891, accuracy:0.9760399460792542
2023-12-22 16:34:48, Train Loss is:0.0000, Train Accuracy is:97.6040%, Test Loss is::0.0113 Test Accuracy is:84.1100%
2023-12-22 16:34:48, train epoch: 16/20
2023-12-22 16:35:35, loss:0.05912297964096069, accuracy:0.982479989528656
2023-12-22 16:35:42, Train Loss is:0.0000, Train Accuracy is:98.2480%, Test Loss is::0.0115 Test Accuracy is:84.1600%
2023-12-22 16:35:42, train epoch: 17/20
2023-12-22 16:36:29, loss:0.023777423426508904, accuracy:0.9840799570083618
2023-12-22 16:36:36, Train Loss is:0.0000, Train Accuracy is:98.4080%, Test Loss is::0.0127 Test Accuracy is:83.9600%
2023-12-22 16:36:36, train epoch: 18/20
2023-12-22 16:37:22, loss:0.05668738856911659, accuracy:0.984779953956604
2023-12-22 16:37:30, Train Loss is:0.0000, Train Accuracy is:98.4780%, Test Loss is::0.0122 Test Accuracy is:84.5400%
2023-12-22 16:37:30, train epoch: 19/20
2023-12-22 16:38:16, loss:0.017958272248506546, accuracy:0.982759952545166
2023-12-22 16:38:24, Train Loss is:0.0000, Train Accuracy is:98.2760%, Test Loss is::0.0113 Test Accuracy is:84.3800%
2023-12-22 16:38:24,accuracy_rate=[57.672276 69.95193 76.85297 79.36699 81.68069 81.72076 83.89423
83.96435 84.074524 83.34335 82.82252 83.57372 84.15465 84.67548
85.16627 84.24479 84.294876 84.09455 84.67548 84.51523 ]
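A short sketch for reading the log: the list below is copied from the accuracy_rate line above, and the two final-epoch figures are copied from the last epoch's summary line. The variable names other than accuracy_rate are chosen here for illustration.

```python
# Per-epoch test accuracy (%), copied from the accuracy_rate line of the log.
accuracy_rate = [
    57.672276, 69.95193, 76.85297, 79.36699, 81.68069, 81.72076, 83.89423,
    83.96435, 84.074524, 83.34335, 82.82252, 83.57372, 84.15465, 84.67548,
    85.16627, 84.24479, 84.294876, 84.09455, 84.67548, 84.51523,
]

# Zero-based index of the epoch with the highest test accuracy.
best_epoch = max(range(len(accuracy_rate)), key=accuracy_rate.__getitem__)
print(best_epoch, accuracy_rate[best_epoch])  # 14 85.16627

# Final-epoch train and test accuracy (%), copied from the last summary line.
final_train, final_test = 98.2760, 84.3800
print(round(final_train - final_test, 3))  # 13.896
```

Test accuracy peaks at epoch 14 and then hovers around 84% while training accuracy keeps climbing to about 98%, a gap of roughly 14 points that points to overfitting after the first ten or so epochs.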