Image Recognition with ResNet and GoogLeNet on Stanford Cars


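Stanford Cars Classification

This is a deep learning project that classifies cars using the Stanford Cars dataset. I apply transfer learning: deep networks pretrained on ImageNet are fine-tuned on this dataset. To reduce training time, I use only part of the dataset: the preprocessing code below copies the first half of the annotated images, and the models are built with 98 output classes.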

Data source: https://ai.stanford.edu/~jkrause/cars/car_dataset.html

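1. Data description

The Stanford Cars dataset contains 16,185 images of 196 classes of cars. The data is split into 8,144 training images and 8,041 test images, with each class split roughly 50-50. Classes are typically at the level of make, model, and year, for example 2012 Tesla Model S or 2012 BMW M3 coupe. On average, each class has about 41.6 images in the training set and about 41.0 in the test set.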
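The split sizes quoted above can be checked with a few lines of arithmetic. This is only a sanity check on the published counts (196 classes, matching the class list printed in section 2.1, with 8,144 training and 8,041 test images); the variable names are illustrative.

```python
# Published counts for the Stanford Cars dataset
num_classes = 196
num_train = 8144
num_test = 8041

total = num_train + num_test
train_per_class = num_train / num_classes
test_per_class = num_test / num_classes

print(total)                      # 16185
print(round(train_per_class, 1))  # 41.6
print(round(test_per_class, 1))   # 41.0
```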

2. Set up the environment and load the data

2.1 Download and read the data

# Extract the class names:
import scipy.io
 
data = scipy.io.loadmat('data/cars_annos.mat')
class_names = data['class_names']
f_class = open('./label_map.txt','w')
 
num = 1
for j in range(class_names.shape[1]):
    class_name = str(class_names[0,j][0]).replace(' ','_')
    print(num,class_name)
    f_class.write( str(num) + ' ' + class_name + '\n')
    num = num + 1
f_class.close()

1 AM_General_Hummer_SUV_2000
2 Acura_RL_Sedan_2012
3 Acura_TL_Sedan_2012
4 Acura_TL_Type-S_2008
5 Acura_TSX_Sedan_2012
6 Acura_Integra_Type_R_2001
7 Acura_ZDX_Hatchback_2012
8 Aston_Martin_V8_Vantage_Convertible_2012
9 Aston_Martin_V8_Vantage_Coupe_2012
10 Aston_Martin_Virage_Convertible_2012
11 Aston_Martin_Virage_Coupe_2012
12 Audi_RS_4_Convertible_2008
13 Audi_A5_Coupe_2012
14 Audi_TTS_Coupe_2012
15 Audi_R8_Coupe_2012
16 Audi_V8_Sedan_1994
17 Audi_100_Sedan_1994
18 Audi_100_Wagon_1994
19 Audi_TT_Hatchback_2011
20 Audi_S6_Sedan_2011
21 Audi_S5_Convertible_2012
22 Audi_S5_Coupe_2012
23 Audi_S4_Sedan_2012
24 Audi_S4_Sedan_2007
25 Audi_TT_RS_Coupe_2012
26 BMW_ActiveHybrid_5_Sedan_2012
27 BMW_1_Series_Convertible_2012
28 BMW_1_Series_Coupe_2012
29 BMW_3_Series_Sedan_2012
30 BMW_3_Series_Wagon_2012
31 BMW_6_Series_Convertible_2007
32 BMW_X5_SUV_2007
33 BMW_X6_SUV_2012
34 BMW_M3_Coupe_2012
35 BMW_M5_Sedan_2010
36 BMW_M6_Convertible_2010
37 BMW_X3_SUV_2012
38 BMW_Z4_Convertible_2012
39 Bentley_Continental_Supersports_Conv._Convertible_2012
40 Bentley_Arnage_Sedan_2009
41 Bentley_Mulsanne_Sedan_2011
42 Bentley_Continental_GT_Coupe_2012
43 Bentley_Continental_GT_Coupe_2007
44 Bentley_Continental_Flying_Spur_Sedan_2007
45 Bugatti_Veyron_16.4_Convertible_2009
46 Bugatti_Veyron_16.4_Coupe_2009
47 Buick_Regal_GS_2012
48 Buick_Rainier_SUV_2007
49 Buick_Verano_Sedan_2012
50 Buick_Enclave_SUV_2012
51 Cadillac_CTS-V_Sedan_2012
52 Cadillac_SRX_SUV_2012
53 Cadillac_Escalade_EXT_Crew_Cab_2007
54 Chevrolet_Silverado_1500_Hybrid_Crew_Cab_2012
55 Chevrolet_Corvette_Convertible_2012
56 Chevrolet_Corvette_ZR1_2012
57 Chevrolet_Corvette_Ron_Fellows_Edition_Z06_2007
58 Chevrolet_Traverse_SUV_2012
59 Chevrolet_Camaro_Convertible_2012
60 Chevrolet_HHR_SS_2010
61 Chevrolet_Impala_Sedan_2007
62 Chevrolet_Tahoe_Hybrid_SUV_2012
63 Chevrolet_Sonic_Sedan_2012
64 Chevrolet_Express_Cargo_Van_2007
65 Chevrolet_Avalanche_Crew_Cab_2012
66 Chevrolet_Cobalt_SS_2010
67 Chevrolet_Malibu_Hybrid_Sedan_2010
68 Chevrolet_TrailBlazer_SS_2009
69 Chevrolet_Silverado_2500HD_Regular_Cab_2012
70 Chevrolet_Silverado_1500_Classic_Extended_Cab_2007
71 Chevrolet_Express_Van_2007
72 Chevrolet_Monte_Carlo_Coupe_2007
73 Chevrolet_Malibu_Sedan_2007
74 Chevrolet_Silverado_1500_Extended_Cab_2012
75 Chevrolet_Silverado_1500_Regular_Cab_2012
76 Chrysler_Aspen_SUV_2009
77 Chrysler_Sebring_Convertible_2010
78 Chrysler_Town_and_Country_Minivan_2012
79 Chrysler_300_SRT-8_2010
80 Chrysler_Crossfire_Convertible_2008
81 Chrysler_PT_Cruiser_Convertible_2008
82 Daewoo_Nubira_Wagon_2002
83 Dodge_Caliber_Wagon_2012
84 Dodge_Caliber_Wagon_2007
85 Dodge_Caravan_Minivan_1997
86 Dodge_Ram_Pickup_3500_Crew_Cab_2010
87 Dodge_Ram_Pickup_3500_Quad_Cab_2009
88 Dodge_Sprinter_Cargo_Van_2009
89 Dodge_Journey_SUV_2012
90 Dodge_Dakota_Crew_Cab_2010
91 Dodge_Dakota_Club_Cab_2007
92 Dodge_Magnum_Wagon_2008
93 Dodge_Challenger_SRT8_2011
94 Dodge_Durango_SUV_2012
95 Dodge_Durango_SUV_2007
96 Dodge_Charger_Sedan_2012
97 Dodge_Charger_SRT-8_2009
98 Eagle_Talon_Hatchback_1998
99 FIAT_500_Abarth_2012
100 FIAT_500_Convertible_2012
101 Ferrari_FF_Coupe_2012
102 Ferrari_California_Convertible_2012
103 Ferrari_458_Italia_Convertible_2012
104 Ferrari_458_Italia_Coupe_2012
105 Fisker_Karma_Sedan_2012
106 Ford_F-450_Super_Duty_Crew_Cab_2012
107 Ford_Mustang_Convertible_2007
108 Ford_Freestar_Minivan_2007
109 Ford_Expedition_EL_SUV_2009
110 Ford_Edge_SUV_2012
111 Ford_Ranger_SuperCab_2011
112 Ford_GT_Coupe_2006
113 Ford_F-150_Regular_Cab_2012
114 Ford_F-150_Regular_Cab_2007
115 Ford_Focus_Sedan_2007
116 Ford_E-Series_Wagon_Van_2012
117 Ford_Fiesta_Sedan_2012
118 GMC_Terrain_SUV_2012
119 GMC_Savana_Van_2012
120 GMC_Yukon_Hybrid_SUV_2012
121 GMC_Acadia_SUV_2012
122 GMC_Canyon_Extended_Cab_2012
123 Geo_Metro_Convertible_1993
124 HUMMER_H3T_Crew_Cab_2010
125 HUMMER_H2_SUT_Crew_Cab_2009
126 Honda_Odyssey_Minivan_2012
127 Honda_Odyssey_Minivan_2007
128 Honda_Accord_Coupe_2012
129 Honda_Accord_Sedan_2012
130 Hyundai_Veloster_Hatchback_2012
131 Hyundai_Santa_Fe_SUV_2012
132 Hyundai_Tucson_SUV_2012
133 Hyundai_Veracruz_SUV_2012
134 Hyundai_Sonata_Hybrid_Sedan_2012
135 Hyundai_Elantra_Sedan_2007
136 Hyundai_Accent_Sedan_2012
137 Hyundai_Genesis_Sedan_2012
138 Hyundai_Sonata_Sedan_2012
139 Hyundai_Elantra_Touring_Hatchback_2012
140 Hyundai_Azera_Sedan_2012
141 Infiniti_G_Coupe_IPL_2012
142 Infiniti_QX56_SUV_2011
143 Isuzu_Ascender_SUV_2008
144 Jaguar_XK_XKR_2012
145 Jeep_Patriot_SUV_2012
146 Jeep_Wrangler_SUV_2012
147 Jeep_Liberty_SUV_2012
148 Jeep_Grand_Cherokee_SUV_2012
149 Jeep_Compass_SUV_2012
150 Lamborghini_Reventon_Coupe_2008
151 Lamborghini_Aventador_Coupe_2012
152 Lamborghini_Gallardo_LP_570-4_Superleggera_2012
153 Lamborghini_Diablo_Coupe_2001
154 Land_Rover_Range_Rover_SUV_2012
155 Land_Rover_LR2_SUV_2012
156 Lincoln_Town_Car_Sedan_2011
157 MINI_Cooper_Roadster_Convertible_2012
158 Maybach_Landaulet_Convertible_2012
159 Mazda_Tribute_SUV_2011
160 McLaren_MP4-12C_Coupe_2012
161 Mercedes-Benz_300-Class_Convertible_1993
162 Mercedes-Benz_C-Class_Sedan_2012
163 Mercedes-Benz_SL-Class_Coupe_2009
164 Mercedes-Benz_E-Class_Sedan_2012
165 Mercedes-Benz_S-Class_Sedan_2012
166 Mercedes-Benz_Sprinter_Van_2012
167 Mitsubishi_Lancer_Sedan_2012
168 Nissan_Leaf_Hatchback_2012
169 Nissan_NV_Passenger_Van_2012
170 Nissan_Juke_Hatchback_2012
171 Nissan_240SX_Coupe_1998
172 Plymouth_Neon_Coupe_1999
173 Porsche_Panamera_Sedan_2012
174 Ram_C/V_Cargo_Van_Minivan_2012
175 Rolls-Royce_Phantom_Drophead_Coupe_Convertible_2012
176 Rolls-Royce_Ghost_Sedan_2012
177 Rolls-Royce_Phantom_Sedan_2012
178 Scion_xD_Hatchback_2012
179 Spyker_C8_Convertible_2009
180 Spyker_C8_Coupe_2009
181 Suzuki_Aerio_Sedan_2007
182 Suzuki_Kizashi_Sedan_2012
183 Suzuki_SX4_Hatchback_2012
184 Suzuki_SX4_Sedan_2012
185 Tesla_Model_S_Sedan_2012
186 Toyota_Sequoia_SUV_2012
187 Toyota_Camry_Sedan_2012
188 Toyota_Corolla_Sedan_2012
189 Toyota_4Runner_SUV_2012
190 Volkswagen_Golf_Hatchback_2012
191 Volkswagen_Golf_Hatchback_1991
192 Volkswagen_Beetle_Hatchback_2012
193 Volvo_C30_Hatchback_2012
194 Volvo_240_Sedan_1993
195 Volvo_XC90_SUV_2007
196 smart_fortwo_Convertible_2012

# Extract index, image name, class, and train/test flag (0 = train, 1 = test)
import scipy.io
 
data = scipy.io.loadmat('data/cars_annos.mat')
annotations = data['annotations']
f_train = open('data/mat2txt.txt','w')

num = 1
for i in range(annotations.shape[1]):
    name = str(annotations[0,i][0][0])      # relative image path
    clas = int(annotations[0,i][5][0,0])    # class id (1-based)
    test = int(annotations[0,i][6][0,0])    # train/test flag

    # One line per image: index, image path, class id, test flag
    f_train.write('{} {} {} {}\n'.format(num, name, clas, test))
    num = num + 1
 
f_train.close()
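
Each line written to data/mat2txt.txt has four space-separated fields: index, image path, class id, and test flag. A minimal sketch of reading that format back and counting train and test images per class follows. The three sample lines are invented for illustration and are not taken from the real file, and `count_split` is a hypothetical helper name.

```python
from collections import Counter

def count_split(lines):
    """Count (class_id, is_test) pairs from 'index path class test' lines."""
    counts = Counter()
    for line in lines:
        index, path, class_id, is_test = line.split()
        counts[(int(class_id), int(is_test))] += 1
    return counts

# Invented sample lines in the same four-field format
sample = [
    "1 car_ims/000001.jpg 1 0",
    "2 car_ims/000002.jpg 1 1",
    "3 car_ims/000003.jpg 2 0",
]
counts = count_split(sample)
print(counts[(1, 0)], counts[(1, 1)], counts[(2, 0)])  # 1 1 1
```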

# Reorganize the images into one folder per class
import shutil
import scipy.io as scio
import os

def cub2dang():
    data = scio.loadmat('data/cars_annos.mat')

    newPath_train = "data/train"
    newPath_test = "data/test"
    images = data['annotations'][0]
    classes = data['class_names'][0]
    num_images = images.size
    print(num_images)
    # Only the first half of the annotations is processed, to keep training time down
    for i in range(0, num_images // 2):
        rel_path = str(images[i][0][0])            # relative image path
        image_path = os.path.join('data/', rel_path)
        file_name = rel_path.split('/')[1]         # file name without the folder
        classid = int(images[i][5][0][0])          # class id (1-based)
        class_name = str(classes[classid - 1][0])  # class name, used as the folder name
        file_name_new = os.path.join(class_name, file_name)

        istest = int(images[i][6][0][0])           # 1 = test image, 0 = train image
        if istest:
            target_root, list_file = newPath_test, 'data/car_test.txt'
        else:
            target_root, list_file = newPath_train, 'data/car_train.txt'

        # Create the class folder if needed, copy the image, and log it
        os.makedirs(os.path.join(target_root, class_name), exist_ok=True)
        shutil.copy(image_path, os.path.join(target_root, file_name_new))
        with open(list_file, 'a') as f:
            f.write('{} {}\n'.format(file_name_new, classid))
    print(i)
cub2dang()

16185
8091


#!python -m pip install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu
#!pip install torchvision==0.14.0

2.2 Import packages

import matplotlib.pyplot as plt
import numpy as np

import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.models as models
import torchvision.transforms as transforms

import time
import os
import tqdm
import PIL.Image as Image
from IPython.display import display

#print(torch.backends.mps.is_built())
device = torch.device("mps")
print(device)

#print(torch.cuda.get_device_name(device))

mps

3. Modeling

3.1 Load the data

dataset_dir = "data/"


width, height = 224, 224
train_tfms = transforms.Compose([transforms.Resize((width, height)),
                                 torchvision.transforms.AutoAugment(),
                                 transforms.RandomHorizontalFlip(),
                                 transforms.RandomRotation(15),
                                 transforms.ToTensor(),
                                 transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])
test_tfms = transforms.Compose([transforms.Resize((width, height)),
                                transforms.ToTensor(),
                                transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])

# create datasets
dataset = torchvision.datasets.ImageFolder(root = dataset_dir + "train", transform = train_tfms)
trainloader = torch.utils.data.DataLoader(dataset, batch_size = 32, shuffle = True, num_workers = 2)

dataset2 = torchvision.datasets.ImageFolder(root = dataset_dir + "test", transform = test_tfms)
testloader = torch.utils.data.DataLoader(dataset2, batch_size = 32, shuffle = False, num_workers = 2)
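
To make the preprocessing concrete, here is a small sketch of the arithmetic that ToTensor followed by Normalize applies to one RGB pixel: scale 0-255 values to the range 0-1, then subtract the per-channel mean and divide by the per-channel standard deviation. It uses plain Python rather than torchvision, `normalize_pixel` is a hypothetical helper, and the pixel value is made up for illustration.

```python
# Per-channel ImageNet statistics used in the transforms above
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]

def normalize_pixel(rgb):
    """Mimic ToTensor + Normalize for a single (R, G, B) pixel with 0-255 values."""
    scaled = [c / 255.0 for c in rgb]  # ToTensor step: 0-255 -> 0-1
    # Normalize step: (value - mean) / std, channel by channel
    return [(s - m) / sd for s, m, sd in zip(scaled, MEAN, STD)]

# Hypothetical mid-gray pixel
print(normalize_pixel((128, 128, 128)))
```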

3.2 Define the training function

def train_model(model, criterion, optimizer, scheduler, n_epochs=5):
    
  losses = []
  accuracies = []
  test_accuracies = []

  # Start with the model in training mode
  model.train()
  for epoch in tqdm.tqdm(range(n_epochs)):
    since = time.time()
    running_loss = 0.0
    running_correct = 0.0
    running_total = 0
    for i, data in enumerate(trainloader, 0):

      # Get the inputs and move them to the device (mps here)
      inputs, labels = data
      inputs = inputs.to(device)
      labels = labels.to(device)
      optimizer.zero_grad()
            
      # forward + backward + optimize
      outputs = model(inputs)
      _, predicted = torch.max(outputs.data, 1)
      loss = criterion(outputs, labels)
      loss.backward()
      optimizer.step()
            
      # Accumulate loss and accuracy statistics
      running_loss += loss.item()
      running_correct += (labels == predicted).sum().item()
      running_total += labels.size(0)

    epoch_duration = time.time() - since
    epoch_loss = running_loss / len(trainloader)
    # Divide by the number of samples seen, so a smaller last batch is handled correctly
    epoch_acc = 100.0 * running_correct / running_total
    print("Epoch %s, duration: %d s, loss: %.4f, acc: %.4f" % (epoch + 1, epoch_duration, epoch_loss, epoch_acc))
        
    losses.append(epoch_loss)
    accuracies.append(epoch_acc)
        
    # Switch to evaluation mode and measure accuracy on the test set
    model.eval()
    test_acc = eval_model(model)
    test_accuracies.append(test_acc)
        
    # Switch back to training mode after evaluation
    model.train()
    scheduler.step(test_acc)
  print('Finished Training')
  return model, losses, accuracies, test_accuracies
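
When accuracy is averaged over batches, it helps to remember that the last batch from a DataLoader can be smaller than batch_size (unless drop_last=True is set). The following sketch, with made-up numbers and hypothetical helper names, contrasts dividing by batch size times number of batches with dividing by the number of samples actually seen.

```python
def accuracy_by_batches(correct, num_batches, batch_size):
    """Accuracy (%) computed as if every batch were full."""
    return 100.0 * correct / (num_batches * batch_size)

def accuracy_by_samples(correct, num_samples):
    """Accuracy (%) over the samples actually seen."""
    return 100.0 * correct / num_samples

# Hypothetical epoch: 100 samples, batch size 32 -> batches of 32, 32, 32, 4
num_samples, batch_size = 100, 32
num_batches = -(-num_samples // batch_size)  # ceiling division -> 4
correct = 80

print(accuracy_by_batches(correct, num_batches, batch_size))  # 62.5
print(accuracy_by_samples(correct, num_samples))              # 80.0
```

The gap shrinks as the dataset grows relative to the batch size, but counting samples directly avoids the assumption altogether.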

3.3 Define the evaluation function

def eval_model(model):
  correct = 0.0
  total = 0.0
  with torch.no_grad():
    for i, data in enumerate(testloader, 0):
      images, labels = data
      images = images.to(device)
      labels = labels.to(device)
      # Use the model passed in as an argument, not the global model_ft
      outputs = model(images)
      _, predicted = torch.max(outputs.data, 1)

      total += labels.size(0)
      correct += (predicted == labels).sum().item()

  test_acc = 100.0 * correct / total
  print('Accuracy of the network on the test images: %d %%' % (test_acc))
  return test_acc

4. ResNet34 implementation

4.1 Define model parameters

# ResNet34:
NUM_CAR_CLASSES = 98

# Load ImageNet-pretrained weights (weights=True is deprecated since torchvision 0.13)
model_ft = models.resnet34(weights = models.ResNet34_Weights.IMAGENET1K_V1)

# Replace the final fully connected layer with one sized for the car classes
num_ftrs = model_ft.fc.in_features
model_ft.fc = nn.Linear(num_ftrs, NUM_CAR_CLASSES)
model_ft = model_ft.to(device)

criterion = nn.CrossEntropyLoss()

optimizer = optim.SGD(model_ft.parameters(), lr = 0.01, momentum = 0.9)

lrscheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode = 'max', patience = 3, threshold = 0.9)

4.2 Training

# ResNet:
model_ft_res, training_losses_res, training_accs_res, test_accs_res = train_model(model_ft, criterion, optimizer, lrscheduler, n_epochs=12)

Epoch 1, duration: 109 s, loss: 4.1009, acc: 9.3994
Accuracy of the network on the test images: 24 %
Epoch 2, duration: 113 s, loss: 2.3972, acc: 34.3994
Accuracy of the network on the test images: 34 %
Epoch 3, duration: 117 s, loss: 1.7940, acc: 48.6328
Accuracy of the network on the test images: 35 %
Epoch 4, duration: 114 s, loss: 1.4076, acc: 58.3984
Accuracy of the network on the test images: 48 %
Epoch 5, duration: 119 s, loss: 1.1524, acc: 64.6973
Accuracy of the network on the test images: 48 %
Epoch 6, duration: 119 s, loss: 0.9919, acc: 69.3848
Accuracy of the network on the test images: 61 %
Epoch 7, duration: 122 s, loss: 0.8873, acc: 72.2656
Accuracy of the network on the test images: 61 %
Epoch 8, duration: 123 s, loss: 0.7274, acc: 77.3193
Accuracy of the network on the test images: 64 %
Epoch 9, duration: 127 s, loss: 0.4055, acc: 87.5000
Accuracy of the network on the test images: 80 %
Epoch 10, duration: 127 s, loss: 0.2865, acc: 91.0400
Accuracy of the network on the test images: 81 %
Epoch 11, duration: 131 s, loss: 0.2480, acc: 92.3828
Accuracy of the network on the test images: 82 %
Epoch 12, duration: 139 s, loss: 0.2195, acc: 93.5547
Accuracy of the network on the test images: 82 %
100%|██████████| 12/12 [33:26<00:00, 167.24s/it]
Finished Training

After 12 epochs of training, the model reaches about 94% training accuracy (93.55% in the log) and 82% test accuracy.
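
The sharp improvement between epochs 8 and 9 in the log above is consistent with a learning-rate drop from the scheduler. As I understand the PyTorch behavior, ReduceLROnPlateau uses a relative threshold by default, so with mode='max' a new value only counts as an improvement when it exceeds best × (1 + threshold). With threshold = 0.9 that means nearly doubling the best accuracy so far, and the learning rate is multiplied by the default factor of 0.1 once more than `patience` epochs pass without such an improvement. The sketch below re-implements that rule in plain Python (it does not call PyTorch, and `simulate_plateau` is a hypothetical helper) and feeds it the integer test accuracies printed in the log, so the exact epochs should be read as approximate.

```python
def simulate_plateau(metrics, lr=0.01, patience=3, threshold=0.9, factor=0.1):
    """Simplified re-implementation of the 'max' mode, relative-threshold rule."""
    best = float("-inf")
    bad_epochs = 0
    reductions = []  # 1-based epochs after which the learning rate is reduced
    for epoch, value in enumerate(metrics, start=1):
        if value > best * (1.0 + threshold):
            best = value      # counts as an improvement
            bad_epochs = 0
        else:
            bad_epochs += 1
        if bad_epochs > patience:
            lr *= factor      # cut the learning rate and start counting again
            bad_epochs = 0
            reductions.append(epoch)
    return reductions, lr

# Test accuracies (%) as printed in the ResNet34 log (integer values)
resnet_test_acc = [24, 34, 35, 48, 48, 61, 61, 64, 80, 81, 82, 82]
reductions, final_lr = simulate_plateau(resnet_test_acc)
print(reductions)  # [8, 12]
```

Under these assumptions the first reduction lands right after epoch 8, which matches the point where the training loss falls from 0.7274 to 0.4055. A smaller threshold (the PyTorch default is 1e-4) would make the scheduler react to actual plateaus rather than on a nearly fixed schedule.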

# Plot the statistics
f, axarr = plt.subplots(2,2, figsize = (12, 8))
axarr[0, 0].plot(training_losses_res)
axarr[0, 0].set_title("Training loss")
axarr[0, 1].plot(training_accs_res)
axarr[0, 1].set_title("Training acc")
axarr[1, 0].plot(test_accs_res)
axarr[1, 0].set_title("Test acc")

Text(0.5, 1.0, 'Test acc')

[Figure: training loss, training accuracy, and test accuracy per epoch for ResNet34]

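4.3 Evaluate the model on a single image

This simulates prediction in a real-world setting.

# Map each class folder to an index

def find_classes(dir):
    # Keep only sub-folders (skips stray files such as .DS_Store), sorted as ImageFolder does
    classes = sorted(d for d in os.listdir(dir) if os.path.isdir(os.path.join(dir, d)))
    class_to_idx = {classes[i]: i for i in range(len(classes))}
    return classes, class_to_idx
classes, c_to_idx = find_classes(dataset_dir + "train")

# Define a function that reads an image and predicts its class
def single_img_eval(model_ft, imgdir):

  # Switch the model to evaluation mode
  model_ft.eval()

  # Transforms applied to the input image
  # (the size and normalization here differ from test_tfms above)
  loader = transforms.Compose([transforms.Resize((400, 400)),
                                  transforms.ToTensor(),
                                  transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
  image = Image.open(dataset_dir + imgdir).convert("RGB")
  # Tensors are created on the CPU by default; with cuda or mps, .to(device) is required
  image = loader(image).float().unsqueeze(0).to(device)

  # No gradients are needed for inference
  with torch.no_grad():
    output = model_ft(image)
  # "confidence" is the largest raw output score (logit), not a probability
  conf, predicted = torch.max(output, 1)

  display(Image.open(dataset_dir + imgdir))
  print(classes[predicted.item()], "confidence: ", conf.item())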

single_img_eval(model_ft_res, "test/Chevrolet Traverse SUV 2012/004695.jpg")


Buick Verano Sedan 2012 confidence:  5.017297267913818

single_img_eval(model_ft_res, "test/FIAT 500 Abarth 2012/008089.jpg")


FIAT 500 Abarth 2012 confidence:  11.281584739685059

single_img_eval(model_ft_res, "test/Aston Martin V8 Vantage Convertible 2012/000623.jpg")


Aston Martin V8 Vantage Convertible 2012 confidence:  9.173359870910645

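The network adapted from ResNet34 classified two of the three cars correctly, with comparatively high confidence scores. For the first car, it confused a Chevrolet with a Buick.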
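
The "confidence" printed by single_img_eval is the largest raw network output (a logit), not a probability, which is why values such as 11.28 appear. A softmax turns the outputs into probabilities that sum to 1. The sketch below shows the computation in plain Python on made-up logits for three hypothetical classes; in PyTorch the equivalent would be torch.softmax(output, dim=1).

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up logits for three hypothetical classes
probs = softmax([5.0, 2.0, 1.0])
best = max(range(len(probs)), key=probs.__getitem__)
print(best, round(probs[best], 3))
```

Probabilities are easier to compare across models than raw logits, since the logit scale depends on the network.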

4.4 Save the model for later use

torch.save(model_ft_res.state_dict(), 'car_model_resnet.pth')

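5. GoogLeNet implementation

5.1 Define model parameters

# GoogLeNet:
NUM_CAR_CLASSES = 98

# Load ImageNet-pretrained weights (weights=True is deprecated since torchvision 0.13)
model_ft = models.googlenet(weights = models.GoogLeNet_Weights.IMAGENET1K_V1)

# Replace the final fully connected layer with one sized for the car classes
num_ftrs = model_ft.fc.in_features
model_ft.fc = nn.Linear(num_ftrs, NUM_CAR_CLASSES)
model_ft = model_ft.to(device)

# The loss function (criterion) is reused from section 4.1
optimizer = optim.SGD(model_ft.parameters(), lr = 0.01, momentum = 0.9)

lrscheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode = 'max', patience = 3, threshold = 0.9)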

5.2 Training

# GoogLeNet:
model_ft_google, training_losses_google, training_accs_google, test_accs_google = train_model(model_ft, criterion, optimizer, lrscheduler, n_epochs=12)

Epoch 1, duration: 168 s, loss: 3.0864, acc: 23.5596
Accuracy of the network on the test images: 39 %
Epoch 2, duration: 164 s, loss: 2.0752, acc: 43.7256
Accuracy of the network on the test images: 51 %
Epoch 3, duration: 164 s, loss: 1.5181, acc: 56.3721
Accuracy of the network on the test images: 62 %
Epoch 4, duration: 182 s, loss: 1.1773, acc: 65.8691
Accuracy of the network on the test images: 66 %
Epoch 5, duration: 130 s, loss: 0.9473, acc: 71.5088
Accuracy of the network on the test images: 69 %
Epoch 6, duration: 106 s, loss: 0.6231, acc: 82.9834
Accuracy of the network on the test images: 79 %
Epoch 7, duration: 135 s, loss: 0.5321, acc: 86.5234
Accuracy of the network on the test images: 80 %
Epoch 8, duration: 137 s, loss: 0.4982, acc: 86.7920
Accuracy of the network on the test images: 81 %
Epoch 9, duration: 147 s, loss: 0.4649, acc: 88.9404
Accuracy of the network on the test images: 81 %
Epoch 10, duration: 154 s, loss: 0.4419, acc: 89.3799
Accuracy of the network on the test images: 81 %
Epoch 11, duration: 172 s, loss: 0.4061, acc: 89.9170
Accuracy of the network on the test images: 81 %
Epoch 12, duration: 156 s, loss: 0.4121, acc: 89.9170
Accuracy of the network on the test images: 81 %
100%|██████████| 12/12 [38:28<00:00, 192.34s/it]
Finished Training

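Similarly, the network reaches its best performance by epoch 12 (about 90% training accuracy and 81% test accuracy); the test accuracy has stayed at 81% since epoch 8.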

# plot the stats

f, axarr = plt.subplots(2,2, figsize = (12, 8))
axarr[0, 0].plot(training_losses_google)
axarr[0, 0].set_title("Training loss")
axarr[0, 1].plot(training_accs_google)
axarr[0, 1].set_title("Training acc")
axarr[1, 0].plot(test_accs_google)
axarr[1, 0].set_title("Test acc")

Text(0.5, 1.0, 'Test acc')

[Figure: training loss, training accuracy, and test accuracy per epoch for GoogLeNet]

5.3 Evaluate the model on a single image

single_img_eval(model_ft_google, "test/Chevrolet Traverse SUV 2012/004695.jpg")


Chevrolet Impala Sedan 2007 confidence:  4.615828037261963

single_img_eval(model_ft_google, "test/FIAT 500 Abarth 2012/008089.jpg")


FIAT 500 Abarth 2012 confidence:  6.266000747680664

single_img_eval(model_ft_google,  "test/Aston Martin V8 Vantage Convertible 2012/000623.jpg")


Aston Martin V8 Vantage Convertible 2012 confidence:  6.059011459350586

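Like ResNet34, this network identifies the last two cars correctly and gets the first one wrong. Its guess for the first car is closer, though (the right make, Chevrolet, but the wrong model), and its confidence scores are lower overall.

5.4 Save the model for later use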

torch.save(model_ft_google.state_dict(), 'car_model_googlenet.pth')

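6. Conclusion

In this project, we applied off-the-shelf transfer learning to two pretrained networks, ResNet34 and GoogLeNet. We adapted each network by replacing its last fully connected layer with an untrained layer sized for our classes, then fine-tuned it on the Stanford Cars data with stochastic gradient descent (the optimizer receives every model parameter, so no layers are frozen). We used a learning-rate scheduler, ReduceLROnPlateau, to adapt the learning rate during training, and we augmented the training data to reduce overfitting.

The results are good, and both adapted networks perform well: in the training logs above, the ResNet34-based network reaches about 94% training accuracy and 82% test accuracy, and the GoogLeNet-based network about 90% training accuracy and 81% test accuracy.

Finally, we also evaluated the models on single images to simulate a real production scenario; each model classified two of the three sample images correctly.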