1 Introduction to ResNet
1.1 ResNet overview
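ResNet was proposed by a Microsoft team in 2015 and took first place that year in image classification, object detection, and image segmentation. The paper's four authors, Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun, are now well-known names in artificial intelligence; at the time they were all at Microsoft Research Asia. The experiments show that residual networks are easier to optimize and that accuracy improves as the network gets deeper. On ImageNet the authors evaluated residual networks up to 152 layers deep (8 times deeper than VGG nets, yet with lower complexity). An ensemble of these networks reached a 3.57% top-5 error rate and won first place in the ILSVRC 2015 classification task.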
Paper: "Deep Residual Learning for Image Recognition" (original link)
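This is a classic paper in computer vision. Mu Li once said that if you are using a convolutional neural network, there is roughly a 50% chance that it is a ResNet or one of its variants. The ResNet paper has been cited more than 100,000 times.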
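1.2 ResNet network architecture
The classic ResNet architectures are ResNet-18, ResNet-34, ResNet-50, ResNet-101, and ResNet-152. ResNet-18 and ResNet-34 share the same basic structure and are relatively shallow; the other three are deeper networks, of which ResNet-50 is the most widely used.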
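The layer counts in these names can be checked with a little arithmetic. The sketch below assumes the per-stage block counts given in the ResNet paper (they are not listed in this article) and counts the stem convolution, every convolution inside the residual blocks, and the final fully connected layer. The names `RESNET_CONFIGS` and `count_weighted_layers` are illustrative only.

```python
# Blocks per stage and convolutions per block, as given in the ResNet paper.
# These numbers are an assumption of this sketch, not taken from the article.
RESNET_CONFIGS = {
    "ResNet-18": ((2, 2, 2, 2), 2),     # basic block: two 3x3 convolutions
    "ResNet-34": ((3, 4, 6, 3), 2),     # basic block
    "ResNet-50": ((3, 4, 6, 3), 3),     # bottleneck block: 1x1, 3x3, 1x1
    "ResNet-101": ((3, 4, 23, 3), 3),   # bottleneck block
    "ResNet-152": ((3, 8, 36, 3), 3),   # bottleneck block
}


def count_weighted_layers(blocks_per_stage, convs_per_block):
    """Stem convolution + all block convolutions + final fully connected layer."""
    return 1 + sum(blocks_per_stage) * convs_per_block + 1


for name, (blocks, convs) in RESNET_CONFIGS.items():
    print(name, count_weighted_layers(blocks, convs))
```

Under these assumptions each computed count matches the number in the model name. Projection shortcuts are not counted.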
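Residual networks were proposed to solve the degradation problem that appears when a deep neural network (DNN) has too many hidden layers. Degradation means that as hidden layers are added, accuracy first saturates and then drops rapidly, and this drop is not caused by overfitting.
Consider a network A with training error x. Build a network B by stacking a few extra layers, which we call C, on top of A, with parameters chosen so that they leave A's output unchanged. B then has the same training error x, so B's training error should never be higher than A's. If B nevertheless trains to a higher error than A, this shows that learning an identity mapping (one that leaves its input unchanged) with the added layers C is not a trivial problem.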
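To solve this problem, ResNet's building module adds a direct (shortcut) path between its input and its output that carries the identity mapping. C then only has to learn what must be added on top of the input features it already receives. Because C learns only this residual, the module is called a residual module.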
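The effect of the shortcut can be seen in a toy example. The following is a minimal sketch in plain Python, not the article's PyTorch code: a single layer with all weights set to zero, once without and once with the direct path. The helper names (`plain_layer`, `residual_layer`) and the input vector are made up for illustration.

```python
def relu(v):
    return [max(0.0, a) for a in v]


def linear(weights, v):
    """Multiply a weight matrix (a list of rows) by a vector."""
    return [sum(w * a for w, a in zip(row, v)) for row in weights]


def plain_layer(weights, x):
    """A plain layer has to learn the whole mapping H(x) by itself."""
    return relu(linear(weights, x))


def residual_layer(weights, x):
    """A residual layer learns F(x) and adds the input back: H(x) = F(x) + x."""
    fx = linear(weights, x)
    return relu([f + a for f, a in zip(fx, x)])


x = [0.5, 1.0, 2.0]                        # non-negative, as after an earlier ReLU
zero = [[0.0] * 3 for _ in range(3)]       # weights that have learned nothing

print(plain_layer(zero, x))     # [0.0, 0.0, 0.0]
print(residual_layer(zero, x))  # [0.5, 1.0, 2.0]
```

With zero weights the plain layer discards its input, while the residual layer reproduces it. Driving the weights toward zero is therefore enough for the added layers to behave as an identity mapping.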
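In addition, like GoogLeNet, which preceded it by about a year, ResNet places a global average pooling layer before the classification layer. With these changes ResNet can train networks as deep as 152 layers. It achieves higher accuracy than VGGNet and GoogLeNet while being computationally more efficient than VGGNet. ResNet-152 reaches a top-5 accuracy of 95.51%.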
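Global average pooling reduces every feature map to a single number, so the classifier receives one value per channel regardless of the spatial size. A minimal sketch in plain Python, with a made-up two-channel input:

```python
def global_average_pool(feature_maps):
    """Reduce each channel's H x W map to its mean, giving one value per channel."""
    pooled = []
    for channel in feature_maps:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


# Two channels, each a 2 x 2 feature map (illustrative values).
maps = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 8.0]],
]
print(global_average_pool(maps))  # [2.5, 2.0]
```

In the PyTorch listing of section 2.2, `nn.AdaptiveAvgPool2d((1, 1))` plays this role: it turns the 2048 feature maps of the last stage into a 2048-dimensional vector for the final `nn.Linear` layer.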
[Figure: network structures of ResNet-18 and ResNet-50]
2 Implementing ResNet-50 in PyTorch on the CIFAR-10 dataset
2.1 The CIFAR-10 dataset
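CIFAR-10 is a computer vision dataset for general object recognition, collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton. It contains 60000 RGB color images of size 32 × 32 in 10 classes, of which 50000 form the training set and 10000 form the test set.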
The 10 classes in CIFAR-10 are airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck.
CIFAR-10 is a color image dataset that is closer to general, real-world objects. Compared with MNIST, it differs in the following ways:
- CIFAR-10 images are 3-channel RGB color images, whereas MNIST images are grayscale.
- CIFAR-10 images are 32 × 32, slightly larger than the 28 × 28 MNIST images.
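Unlike handwritten characters, CIFAR-10 contains real-world objects. The images are noisy, and the scale and features of the objects vary widely, which makes recognition much harder. A plain linear model such as softmax regression performs poorly on CIFAR-10.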
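To put the two datasets side by side, the sketch below computes the flattened input size of one image and the number of parameters a softmax regression model would need for 10 classes. The function names are illustrative only.

```python
def flattened_size(height, width, channels):
    """Number of input values when one image is flattened into a vector."""
    return height * width * channels


def softmax_regression_params(num_inputs, num_classes):
    """One weight per (input, class) pair plus one bias per class."""
    return num_inputs * num_classes + num_classes


cifar_inputs = flattened_size(32, 32, 3)
mnist_inputs = flattened_size(28, 28, 1)
print(cifar_inputs, softmax_regression_params(cifar_inputs, 10))  # 3072 30730
print(mnist_inputs, softmax_regression_params(mnist_inputs, 10))  # 784 7850
```

A CIFAR-10 image has roughly four times as many input values as an MNIST image. A single linear map from raw pixels to 10 scores cannot model the variation in scale and appearance described above.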
2.2 Code implementation
import datetime

import matplotlib.pyplot as plt
import numpy as np
import torch
import torch.nn.functional as F
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from torchvision.transforms import ToTensor


class Bottleneck(nn.Module):
    """Bottleneck residual block: 1x1 -> 3x3 -> 1x1 convolutions plus a shortcut."""

    def __init__(self, in_channels, out_channels, stride=(1, 1, 1), padding=(0, 1, 0), first=False) -> None:
        super().__init__()
        self.bottleneck = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=stride[0], padding=padding[0], bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_channels, out_channels, kernel_size=3, stride=stride[1], padding=padding[1], bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_channels, out_channels * 4, kernel_size=1, stride=stride[2], padding=padding[2], bias=False),
            nn.BatchNorm2d(out_channels * 4),
        )
        # The shortcut is the identity unless the input and output shapes differ.
        self.shortcut = nn.Sequential()
        if first:
            self.shortcut = nn.Sequential(
                # A 1x1 convolution projects the input to out_channels * 4 channels.
                # It uses the stride of the 3x3 convolution (2 in the first block of
                # conv3 to conv5), so the spatial sizes of both paths match as well.
                nn.Conv2d(in_channels, out_channels * 4, kernel_size=1, stride=stride[1], bias=False),
                nn.BatchNorm2d(out_channels * 4),
            )

    def forward(self, x):
        out = self.bottleneck(x)
        out += self.shortcut(x)  # add the shortcut path to the residual path
        out = F.relu(out)
        return out


class ResNet50(nn.Module):
    """ResNet-50: a stem followed by four stages of 3, 4, 6 and 3 bottleneck blocks."""

    def __init__(self, block, num_classes=10) -> None:
        super().__init__()
        self.in_channels = 64
        # Stem: 7x7 convolution followed by max pooling.
        # The reference ResNet stem also applies a ReLU after this BatchNorm;
        # this listing leaves it out.
        self.conv1 = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False),
            nn.BatchNorm2d(64),
            nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
        )
        # Each stage gets one [1x1, 3x3, 1x1] stride list and padding list per block.
        self.conv2 = self._make_layer(block, 64, [[1, 1, 1]] * 3, [[0, 1, 0]] * 3)
        self.conv3 = self._make_layer(block, 128, [[1, 2, 1]] + [[1, 1, 1]] * 3, [[0, 1, 0]] * 4)
        self.conv4 = self._make_layer(block, 256, [[1, 2, 1]] + [[1, 1, 1]] * 5, [[0, 1, 0]] * 6)
        self.conv5 = self._make_layer(block, 512, [[1, 2, 1]] + [[1, 1, 1]] * 2, [[0, 1, 0]] * 3)
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(2048, num_classes)

    def _make_layer(self, block, out_channels, strides, paddings):
        layers = []
        for i, (stride, padding) in enumerate(zip(strides, paddings)):
            # Only the first block of a stage needs a projection shortcut.
            layers.append(block(self.in_channels, out_channels, stride, padding, first=(i == 0)))
            self.in_channels = out_channels * 4
        return nn.Sequential(*layers)

    def forward(self, x):
        out = self.conv1(x)
        out = self.conv2(out)
        out = self.conv3(out)
        out = self.conv4(out)
        out = self.conv5(out)
        out = self.avgpool(out)
        out = out.reshape(x.shape[0], -1)  # flatten to (batch, 2048)
        out = self.fc(out)
        return out


def get_format_time():
    return datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')


# Convert to a tensor, normalize each channel to [-1, 1], then upscale to 224 x 224.
transform = transforms.Compose([
    ToTensor(),
    transforms.Normalize(
        mean=[0.5, 0.5, 0.5],
        std=[0.5, 0.5, 0.5]
    ),
    transforms.Resize((224, 224))
])

training_data = datasets.CIFAR10(
    root="data",
    train=True,
    download=True,
    transform=transform,
)

testing_data = datasets.CIFAR10(
    root="data",
    train=False,
    download=True,
    transform=transform,
)


if __name__ == "__main__":
    res50 = ResNet50(Bottleneck)
    batch_size = 128
    # drop_last=True discards the final incomplete batch of each loader.
    train_loader = DataLoader(dataset=training_data, batch_size=batch_size, shuffle=True, drop_last=True)
    test_loader = DataLoader(dataset=testing_data, batch_size=batch_size, shuffle=True, drop_last=True)
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = res50.to(device)
    cost = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters())
    epochs = 20
    accuracy_rate = []
    for epoch in range(epochs):
        train_loss = 0.0
        train_correct = 0
        model.train()
        print(f"{get_format_time()}, train epoch: {epoch}/{epochs}")
        for step, (images, labels) in enumerate(train_loader, 0):
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)
            _, predicted = torch.max(outputs, 1)
            optimizer.zero_grad()
            loss = cost(outputs, labels)
            loss.backward()
            optimizer.step()
            train_loss += loss.item()
            train_correct += (predicted == labels).sum().item()
        # Evaluate on the test set.
        model.eval()
        test_correct = 0
        test_total = 0
        test_loss = 0
        with torch.no_grad():
            for images, labels in test_loader:
                images, labels = images.to(device), labels.to(device)
                outputs = model(images)
                loss = cost(outputs, labels)
                _, predicted = torch.max(outputs, 1)
                test_total += labels.size(0)
                test_correct += (predicted == labels).sum().item()
                test_loss += loss.item()
        # This accuracy is taken over the samples actually evaluated (test_total).
        accuracy = 100 * test_correct / test_total
        accuracy_rate.append(accuracy)
        # The printed values are divided by the full dataset sizes instead; the
        # losses are sums of per-batch mean losses, not per-sample averages.
        print("{}, Train Loss is:{:.4f}, Train Accuracy is:{:.4f}%, Test Loss is::{:.4f} Test Accuracy is:{:.4f}%".format(
            get_format_time(),
            train_loss / len(training_data),
            100 * train_correct / len(training_data),
            test_loss / len(testing_data),
            100 * test_correct / len(testing_data)
        ))
    # Plot the test accuracy of each epoch.
    accuracy_rate = np.array(accuracy_rate, dtype=np.float32)
    times = np.linspace(1, epochs, epochs)
    plt.xlabel('times')
    plt.ylabel('accuracy rate')
    plt.plot(times, accuracy_rate)
    plt.show()
    print(f"{get_format_time()},accuracy_rate={accuracy_rate}")
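The listing resizes every 32 × 32 CIFAR-10 image to 224 × 224 before feeding it to the network. The sketch below traces one spatial dimension through the stages of the `ResNet50` class above, using the standard output-size formula for convolution and pooling layers. The helpers `conv_out` and `trace_sizes` are hypothetical and exist only for this illustration.

```python
def conv_out(size, kernel, stride, padding):
    """Output size of a convolution or pooling layer along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_sizes(size):
    """Follow one spatial dimension through the stages defined in ResNet50."""
    sizes = {"input": size}
    size = conv_out(size, kernel=7, stride=2, padding=3)
    sizes["conv1"] = size
    size = conv_out(size, kernel=3, stride=2, padding=1)
    sizes["maxpool"] = size
    for name, stride in (("conv2", 1), ("conv3", 2), ("conv4", 2), ("conv5", 2)):
        # Only the 3x3 convolution in the first block of a stage changes the size.
        size = conv_out(size, kernel=3, stride=stride, padding=1)
        sizes[name] = size
    return sizes


print(trace_sizes(224))
print(trace_sizes(32))
```

With a 224 × 224 input the last stage produces 7 × 7 feature maps, whereas a raw 32 × 32 input would already shrink to 1 × 1 there. Because `nn.AdaptiveAvgPool2d((1, 1))` accepts any input size, the network would still run on 32 × 32 images, but the later stages would have almost no spatial information left to work with.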
2.3 Preparing the runtime environment
(1) To run on a CPU, set up the environment as follows:
conda create -n cv python=3.9
conda activate cv
pip install torchvision==0.9.0
pip install numpy
pip install matplotlib
pip install requests
(2) To run on a GPU, set up the environment as follows:
Use the nvidia-smi command to find the CUDA version supported by the driver:
Tue May 23 15:24:10 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 528.89 Driver Version: 528.89 CUDA Version: 12.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla T4 TCC | 00000000:01:00.0 Off | 0 |
| N/A 55C P8 11W / 70W | 0MiB / 15360MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
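Create the environment and install a GPU build of torch that matches. The CUDA version reported by nvidia-smi is the highest one the driver supports, so the cu113 build below works with CUDA 12.0: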
conda create -n cv python=3.9
conda activate cv
pip install torch==1.12.0+cu113 torchvision==0.13.0+cu113 torchaudio==0.12.0 --extra-index-url https://download.pytorch.org/whl/cu113
pip install numpy
pip install matplotlib
pip install requests
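After installation, a quick check from Python can confirm which build was picked up. The sketch below is one possible way to do it; `describe_environment` is a hypothetical helper, and it only reports on torch if the package can be found in the active environment.

```python
import importlib.util
import platform


def describe_environment():
    """Report the Python version and, if torch is installed, its CUDA status."""
    info = {"python": platform.python_version(), "torch": None, "cuda": False}
    if importlib.util.find_spec("torch") is not None:
        import torch  # imported lazily so the check also runs without torch

        info["torch"] = torch.__version__
        info["cuda"] = torch.cuda.is_available()
    return info


print(describe_environment())
```

If `cuda` is reported as `False` on a GPU machine, the CPU-only wheel was probably installed, or the wheel's CUDA version is newer than the driver supports.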
Running nvidia-smi again during training shows that the job is running on the GPU:
Tue May 23 15:25:25 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 528.89 Driver Version: 528.89 CUDA Version: 12.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla T4 TCC | 00000000:01:00.0 Off | 0 |
| N/A 56C P0 28W / 70W | 1101MiB / 15360MiB | 3% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 6728 C ...nda\envs\voice\python.exe 1100MiB |
+-----------------------------------------------------------------------------+
2.4 Results
2023-12-22 14:44:39, train epoch: 0/20
2023-12-22 14:46:21, Train Loss is:0.0126, Train Accuracy is:40.9520%, Test Loss is::0.0116 Test Accuracy is:46.3200%
2023-12-22 14:46:21, train epoch: 1/20
2023-12-22 14:48:01, Train Loss is:0.0087, Train Accuracy is:59.5060%, Test Loss is::0.0109 Test Accuracy is:51.6700%
2023-12-22 14:48:01, train epoch: 2/20
2023-12-22 14:49:40, Train Loss is:0.0070, Train Accuracy is:68.1060%, Test Loss is::0.0072 Test Accuracy is:67.8100%
2023-12-22 14:49:40, train epoch: 3/20
2023-12-22 14:51:20, Train Loss is:0.0057, Train Accuracy is:74.2540%, Test Loss is::0.0073 Test Accuracy is:67.7400%
2023-12-22 14:51:20, train epoch: 4/20
2023-12-22 14:53:00, Train Loss is:0.0049, Train Accuracy is:77.9280%, Test Loss is::0.0061 Test Accuracy is:73.7400%
2023-12-22 14:53:00, train epoch: 5/20
2023-12-22 14:54:41, Train Loss is:0.0042, Train Accuracy is:81.3260%, Test Loss is::0.0049 Test Accuracy is:77.9900%
2023-12-22 14:54:41, train epoch: 6/20
2023-12-22 14:56:20, Train Loss is:0.0036, Train Accuracy is:83.9240%, Test Loss is::0.0047 Test Accuracy is:79.0400%
2023-12-22 14:56:20, train epoch: 7/20
2023-12-22 14:58:00, Train Loss is:0.0031, Train Accuracy is:86.0780%, Test Loss is::0.0059 Test Accuracy is:75.6300%
2023-12-22 14:58:00, train epoch: 8/20
2023-12-22 14:59:39, Train Loss is:0.0027, Train Accuracy is:87.7120%, Test Loss is::0.0048 Test Accuracy is:79.7600%
2023-12-22 14:59:39, train epoch: 9/20
2023-12-22 15:01:19, Train Loss is:0.0023, Train Accuracy is:89.3680%, Test Loss is::0.0048 Test Accuracy is:80.5800%
2023-12-22 15:01:19, train epoch: 10/20
2023-12-22 15:02:58, Train Loss is:0.0019, Train Accuracy is:91.2760%, Test Loss is::0.0044 Test Accuracy is:82.3400%
2023-12-22 15:02:58, train epoch: 11/20
2023-12-22 15:04:38, Train Loss is:0.0016, Train Accuracy is:92.4040%, Test Loss is::0.0045 Test Accuracy is:82.6400%
2023-12-22 15:04:38, train epoch: 12/20
2023-12-22 15:06:18, Train Loss is:0.0014, Train Accuracy is:93.7200%, Test Loss is::0.0053 Test Accuracy is:81.7900%
2023-12-22 15:06:18, train epoch: 13/20
2023-12-22 15:07:57, Train Loss is:0.0011, Train Accuracy is:94.7360%, Test Loss is::0.0051 Test Accuracy is:81.7700%
2023-12-22 15:07:57, train epoch: 14/20
2023-12-22 15:09:37, Train Loss is:0.0010, Train Accuracy is:95.1120%, Test Loss is::0.0062 Test Accuracy is:80.6500%
2023-12-22 15:09:37, train epoch: 15/20
2023-12-22 15:11:15, Train Loss is:0.0008, Train Accuracy is:96.1600%, Test Loss is::0.0056 Test Accuracy is:82.0300%
2023-12-22 15:11:15, train epoch: 16/20
2023-12-22 15:12:54, Train Loss is:0.0007, Train Accuracy is:96.6140%, Test Loss is::0.0055 Test Accuracy is:82.4200%
2023-12-22 15:12:54, train epoch: 17/20
2023-12-22 15:14:34, Train Loss is:0.0007, Train Accuracy is:96.8880%, Test Loss is::0.0068 Test Accuracy is:81.1300%
2023-12-22 15:14:34, train epoch: 18/20
2023-12-22 15:16:13, Train Loss is:0.0006, Train Accuracy is:97.0620%, Test Loss is::0.0062 Test Accuracy is:82.1900%
2023-12-22 15:16:13, train epoch: 19/20
2023-12-22 15:17:52, Train Loss is:0.0006, Train Accuracy is:97.4180%, Test Loss is::0.0063 Test Accuracy is:82.7800%
2023-12-22 15:17:53,accuracy_rate=[46.39423 51.752804 67.91867 67.84856 73.85818 78.11498 79.166664
75.751205 79.887825 80.70914 82.471954 82.77244 81.921074 81.90104
80.77925 82.16146 82.552086 81.26002 82.32172 82.91266 ]
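The per-epoch lines and the final accuracy_rate array report slightly different test accuracies (for example 46.3200% and 46.39423 for the first epoch). The arithmetic below suggests an explanation based on the listing in section 2.2: with drop_last=True and a batch size of 128, only 9984 of the 10000 test images are evaluated. The printed percentage divides by 10000, while the array divides by the number of images actually seen. The helper names are illustrative.

```python
BATCH_SIZE = 128
TRAIN_SIZE, TEST_SIZE = 50000, 10000


def evaluated_samples(dataset_size, batch_size=BATCH_SIZE):
    """Samples actually seen per epoch when the loader uses drop_last=True."""
    return (dataset_size // batch_size) * batch_size


def accuracy_over_seen(printed_percent, dataset_size=TEST_SIZE):
    """Convert an accuracy printed over the full set to one over the seen samples."""
    correct = round(printed_percent / 100 * dataset_size)
    return 100 * correct / evaluated_samples(dataset_size)


print(evaluated_samples(TEST_SIZE))             # 9984
print(round(accuracy_over_seen(46.32), 4))      # 46.3942 (first epoch)
print(round(accuracy_over_seen(82.78), 4))      # 82.9127 (last epoch)

# The printed losses are sums of per-batch mean losses divided by the dataset
# size. Rescaling by the number of batches gives the average loss per batch.
train_batches = TRAIN_SIZE // BATCH_SIZE        # 390
print(round(0.0126 * TRAIN_SIZE / train_batches, 2))  # about 1.62 for the first epoch
```

Read this way, the first-epoch training loss is about 1.62 per sample rather than 0.0126. The log also shows training accuracy climbing to about 97% while test accuracy levels off near 82% after epoch 10, which points to overfitting in the later epochs.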