
[Keras deep learning framework] Techniques for training an image classification model on a small dataset

51zixue (51自学网), 2020-02-28 14:34:36

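    1. Mainly based on Francois Chollet's "Deep Learning with Python";

    2. The code runs in Kaggle Kernels;

    3. The dataset and the pretrained VGG-16 model have to be added manually;

    4. For background on convolutional neural networks, see: [Deep Learning]: Convolutional Neural Networks (CNN)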

# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load in

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the "../input/" directory.
# For example, running this (by clicking run or pressing Shift+Enter) will list the files in the input directory

import os,shutil
print(os.listdir("../input"))
# Any results you write to the current directory are saved as output.

I. Create the directories for the training, validation and test samples

base_dir = '..'
# Create the training directory
train_dir = os.path.join(base_dir,'train')
os.mkdir(train_dir)
# Create the validation directory
validation_dir = os.path.join(base_dir,'validation')
os.mkdir(validation_dir)
# Create the test directory
test_dir = os.path.join(base_dir,'test')
os.mkdir(test_dir)


# Create the cats directory inside the training directory
train_cats_dir = os.path.join(train_dir,'cats')
os.mkdir(train_cats_dir)
# Create the dogs directory inside the training directory
train_dogs_dir = os.path.join(train_dir,'dogs')
os.mkdir(train_dogs_dir)

# Create the cats directory inside the validation directory
validation_cats_dir = os.path.join(validation_dir,'cats')
os.mkdir(validation_cats_dir)
# Create the dogs directory inside the validation directory
validation_dogs_dir = os.path.join(validation_dir,'dogs')
os.mkdir(validation_dogs_dir)

# Create the cats directory inside the test directory
test_cats_dir = os.path.join(test_dir,'cats')
os.mkdir(test_cats_dir)
# Create the dogs directory inside the test directory
test_dogs_dir = os.path.join(test_dir,'dogs')
os.mkdir(test_dogs_dir)
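
The nine `os.mkdir` calls above fail with `FileExistsError` if the cell is run twice. As a minimal sketch of a repeatable alternative, the same layout can be built in a loop with `os.makedirs(..., exist_ok=True)`. The helper name `make_split_dirs` and the temporary demo directory are assumptions made for illustration; they are not part of the original notebook.

```python
import os
import tempfile

def make_split_dirs(base_dir, splits=("train", "validation", "test"),
                    classes=("cats", "dogs")):
    """Create base_dir/<split>/<class> for every combination and return the paths."""
    paths = {}
    for split in splits:
        for cls in classes:
            path = os.path.join(base_dir, split, cls)
            # exist_ok=True makes the call safe to repeat, unlike os.mkdir
            os.makedirs(path, exist_ok=True)
            paths[(split, cls)] = path
    return paths

# Demonstrated in a throwaway temporary directory rather than '..'
demo_base = tempfile.mkdtemp()
demo_paths = make_split_dirs(demo_base)
print(sorted(os.listdir(demo_base)))  # ['test', 'train', 'validation']
```

`demo_paths[("train", "cats")]` then plays the role of `train_cats_dir`, and so on for the other five directories.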

II. Copy the data into the corresponding directories

    2000 training images, 1000 validation images and 1000 test images, with cats and dogs each making up half

original_dataset_dir = '../input/dogs-vs-cats/train/train'
# Copy the first 1000 cat images to train_cats_dir
fnames = ['cat.{}.jpg'.format(i) for i in range(1000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir,fname)
    dst = os.path.join(train_cats_dir,fname)
    shutil.copyfile(src,dst)
# Copy the next 500 cat images to validation_cats_dir
fnames = ['cat.{}.jpg'.format(i) for i in range(1000,1500)]
for fname in fnames:
    src = os.path.join(original_dataset_dir,fname)
    dst = os.path.join(validation_cats_dir,fname)
    shutil.copyfile(src,dst)
# Copy the following 500 cat images to test_cats_dir
fnames = ['cat.{}.jpg'.format(i) for i in range(1500,2000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir,fname)
    dst = os.path.join(test_cats_dir,fname)
    shutil.copyfile(src,dst)
# Copy the first 1000 dog images to train_dogs_dir
fnames = ['dog.{}.jpg'.format(i) for i in range(1000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir,fname)
    dst = os.path.join(train_dogs_dir,fname)
    shutil.copyfile(src,dst)
# Copy the next 500 dog images to validation_dogs_dir
fnames = ['dog.{}.jpg'.format(i) for i in range(1000,1500)]
for fname in fnames:
    src = os.path.join(original_dataset_dir,fname)
    dst = os.path.join(validation_dogs_dir,fname)
    shutil.copyfile(src,dst)
# Copy the following 500 dog images to test_dogs_dir
fnames = ['dog.{}.jpg'.format(i) for i in range(1500,2000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir,fname)
    dst = os.path.join(test_dogs_dir,fname)
    shutil.copyfile(src,dst)
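
Before copying anything, it can help to check that the index ranges used above really give a 2000/1000/1000 split with no image shared between splits. The sketch below only builds file names and touches no files; `SPLIT_RANGES`, `planned_filenames` and `check_split_plan` are hypothetical names introduced for this illustration.

```python
# Index ranges taken from the copy loops above
SPLIT_RANGES = {
    "train": range(0, 1000),
    "validation": range(1000, 1500),
    "test": range(1500, 2000),
}

def planned_filenames(label, split):
    """File names such as 'cat.0.jpg' that a given split is meant to receive."""
    return ["{}.{}.jpg".format(label, i) for i in SPLIT_RANGES[split]]

def check_split_plan(labels=("cat", "dog")):
    """Return the image count per split, failing if any file appears in two splits."""
    seen = set()
    counts = {}
    for split in SPLIT_RANGES:
        names = [n for label in labels for n in planned_filenames(label, split)]
        assert seen.isdisjoint(names), "a file is assigned to two splits"
        seen.update(names)
        counts[split] = len(names)
    return counts

split_counts = check_split_plan()
print(split_counts)  # {'train': 2000, 'validation': 1000, 'test': 1000}
```

Keeping the three splits disjoint matters because any overlap between training and validation or test images would inflate the reported accuracies.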

III. Build the neural network

from keras import layers
from keras import models
from keras import optimizers

model = models.Sequential()
# Input images are 150*150 with 3 channels
# Filter size is 3*3, depth 32
# Output of this layer: 148*148*32
# Number of parameters: 3*3*3*32+32=896
model.add(layers.Conv2D(32,(3,3),activation='relu',input_shape=(150,150,3)))
# Output of this layer: 74*74*32
model.add(layers.MaxPool2D(2,2))
# Filter size is 3*3, depth 64
# Output of this layer: 72*72*64
# Number of parameters: 3*3*32*64+64=18496
model.add(layers.Conv2D(64,(3,3),activation='relu'))
# Output of this layer: 36*36*64
model.add(layers.MaxPool2D(2,2))
# Filter size is 3*3, depth 128
# Output of this layer: 34*34*128
# Number of parameters: 3*3*64*128+128=73856
model.add(layers.Conv2D(128,(3,3),activation='relu'))
# Output of this layer: 17*17*128
model.add(layers.MaxPool2D(2,2))
# Filter size is 3*3, depth 128
# Output of this layer: 15*15*128
# Number of parameters: 3*3*128*128+128=147584
model.add(layers.Conv2D(128,(3,3),activation='relu'))
# Output of this layer: 7*7*128
model.add(layers.MaxPool2D(2,2))
model.add(layers.Flatten())
# Fully connected layer, number of parameters: 7*7*128*512+512 = 3211776
model.add(layers.Dense(512,activation='relu'))
# This is a binary classification problem, so the last layer is a single sigmoid unit
model.add(layers.Dense(1,activation='sigmoid'))

# `lr` is the argument name in the Keras version used here; newer releases call it `learning_rate`
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=1e-4),
             metrics=['acc'])
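
The output sizes and parameter counts quoted in the comments above can be reproduced with plain arithmetic, without Keras. The sketch below assumes the layer defaults used in the model ('valid' padding and stride 1 for the convolutions, non-overlapping 2*2 pooling); the helper names are made up for this illustration.

```python
def conv_out(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Spatial size after non-overlapping max pooling (odd sizes are floored)."""
    return size // pool

def conv_params(in_channels, out_channels, kernel=3):
    """Weights plus one bias per output channel."""
    return kernel * kernel * in_channels * out_channels + out_channels

size, channels = 150, 3
conv_param_counts = []
for filters in (32, 64, 128, 128):
    conv_param_counts.append(conv_params(channels, filters))
    size = pool_out(conv_out(size))  # 150 -> 74 -> 36 -> 17 -> 7
    channels = filters

flat_units = size * size * channels          # 7*7*128 = 6272
dense_512_params = flat_units * 512 + 512    # 3211776
dense_1_params = 512 * 1 + 1                 # 513
total_params = sum(conv_param_counts) + dense_512_params + dense_1_params

print(conv_param_counts)  # [896, 18496, 73856, 147584]
print(total_params)       # 3453121
```

Almost all of the parameters sit in the first fully connected layer, which is one reason a model like this overfits easily on 2000 images.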

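IV. Data preprocessing

Before data is fed to a neural network, it should normally be converted into floating-point tensors. Our data are JPEG files on disk, so some conversion is needed:

    First, read the image files into memory;

    Then, decode the JPEG content into RGB grids of pixels;

    Next, convert these into floating-point tensors;

    Finally, rescale the pixel values to the [0,1] interval.

These operations are provided by keras.preprocessing.image, whose ImageDataGenerator performs all of the steps above in one go.
A generator is an iterator that yields data batch by batch.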
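
The notion of a generator that keeps producing batches can be illustrated with plain Python. The sketch below is not how Keras implements its directory iterators; it only mimics the behaviour described in "Deep Learning with Python", where the iterator loops over the data indefinitely, which is why the training call later has to be told how many batches make up one epoch. `batch_generator` and `toy_samples` are illustrative names.

```python
import itertools

def batch_generator(samples, batch_size):
    """Yield batches forever, restarting from the first sample after the last one."""
    while True:
        for start in range(0, len(samples), batch_size):
            yield samples[start:start + batch_size]

toy_samples = list(range(10))
gen = batch_generator(toy_samples, batch_size=4)

# Take only the first four batches; a plain `for` loop would never end
first_four = list(itertools.islice(gen, 4))
print(first_four)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9], [0, 1, 2, 3]]
```

Note that the last batch of a pass can be smaller than `batch_size` when the number of samples is not a multiple of it.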

from keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(rescale=1./255) # scale the raw pixel values by 1/255
test_datagen = ImageDataGenerator(rescale=1./255)
# Produce batches of preprocessed data from the given directory
# target_size: the size every image is resized to
# Each call to this generator returns a 20*150*150*3 tensor and binary labels of shape (20,)
train_generator = train_datagen.flow_from_directory(train_dir,
                                                   target_size=(150,150),
                                                   batch_size=20,
                                                   class_mode='binary')
validation_generator = test_datagen.flow_from_directory(validation_dir,
                                                       target_size=(150,150),
                                                       batch_size=20,
                                                       class_mode='binary')

Found 2000 images belonging to 2 classes.
Found 1000 images belonging to 2 classes.

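V. Train and save the model

    To fit the model on a generator, use the fit_generator() method.

    During fitting, the model needs to know how many batches make up one full epoch; this is given by steps_per_epoch. Our example has 2000 training samples and a batch_size of 20, so steps_per_epoch is 100.

    If validation_data is also a generator, the number of batches in one validation pass must be given as well, through validation_steps. Our validation set has 1000 images and batch_size is 20, so validation_steps is 50.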
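
The step counts follow directly from the dataset size and the batch size. A small helper (the name `steps_for` is made up here) makes the arithmetic explicit and rounds up when the batch size does not divide the sample count evenly:

```python
import math

def steps_for(num_samples, batch_size):
    """Number of batches needed to see every sample once."""
    return math.ceil(num_samples / batch_size)

print(steps_for(2000, 20))  # 100 -> steps_per_epoch
print(steps_for(1000, 20))  # 50  -> validation_steps
print(steps_for(2000, 32))  # 63
```

With batch_size=32, as used in the augmentation run later in this tutorial, one full pass over the 2000 training images takes 63 steps, so the 100 steps per epoch used there cover more than one pass.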

history = model.fit_generator(train_generator,
                             steps_per_epoch=100,
                             epochs=30,
                             validation_data=validation_generator,
                             validation_steps=50)
model.save('cats_and_dogs_small_1.h5')

Epoch 1/30
100/100 [==============================] - 14s 140ms/step - loss: 0.6885 - acc: 0.5440 - val_loss: 0.6912 - val_acc: 0.5020
...
Epoch 30/30
100/100 [==============================] - 10s 104ms/step - loss: 0.0409 - acc: 0.9885 - val_loss: 1.0625 - val_acc: 0.7210
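
The first-epoch loss of about 0.69 in the log above is what binary cross-entropy gives for a model that outputs 0.5 for every image, namely ln 2 ≈ 0.693. The sketch below is a plain-Python version of the loss formula, written for illustration and not taken from the Keras source; the clipping constant `eps` is an assumption.

```python
import math

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -[t*log(p) + (1-t)*log(1-p)] over all samples."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

# A model that always answers 0.5 is at chance level
chance_loss = binary_crossentropy([0, 1, 0, 1], [0.5, 0.5, 0.5, 0.5])
print(round(chance_loss, 4))  # 0.6931

# Confident and correct predictions give a loss close to 0
good_loss = binary_crossentropy([0, 1], [0.01, 0.99])
```

A validation loss above 0.693, such as the 1.0625 in the last epoch, means the model is confidently wrong on some validation images even though its validation accuracy is above 0.7.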

VI. Plot the accuracy and loss curves

import matplotlib.pyplot as plt
%matplotlib inline
def plot_curve(history):
    acc = history.history['acc']
    val_acc = history.history['val_acc']
    loss = history.history['loss']
    val_loss = history.history['val_loss']
    
    epochs = range(1,len(acc)+1)
    
    plt.plot(epochs,acc,'bo',label='Training acc')
    plt.plot(epochs,val_acc,'b',label='Validation acc')
    plt.title('Training and validation accuracy')
    plt.legend()
    
    plt.figure()
    plt.plot(epochs,loss,'bo',label='Training loss')
    plt.plot(epochs,val_loss,'b',label='Validation loss')
    plt.title('Training and validation loss')
    plt.legend()
    
plot_curve(history)

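[Figure: training and validation accuracy]

[Figure: training and validation loss]
Looking at the curves above, validation accuracy is around 0.73 and the model is clearly overfitting: in the final epoch training accuracy is 0.9885 while validation accuracy is 0.7210. With only 2000 training images, overfitting is to be expected. The usual remedies are dropout and weight decay, but below we counter it with data augmentation.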
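
Overfitting can also be read off the recorded history numerically: the epoch with the lowest validation loss is roughly where further training stops helping. The sketch below uses made-up loss values purely for illustration (they are not the values from the run above), and `best_epoch` is a hypothetical helper.

```python
def best_epoch(val_loss):
    """Return the 1-based epoch with the lowest validation loss."""
    return min(range(len(val_loss)), key=val_loss.__getitem__) + 1

# Made-up values: training loss keeps falling while validation loss turns upward
toy_train_loss = [0.69, 0.60, 0.52, 0.44, 0.35, 0.27, 0.19]
toy_val_loss = [0.69, 0.62, 0.58, 0.56, 0.57, 0.61, 0.70]

epoch = best_epoch(toy_val_loss)
final_gap = toy_val_loss[-1] - toy_train_loss[-1]
print(epoch)  # 4
```

With a real run, `history.history['val_loss']` would take the place of `toy_val_loss`.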
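VII. Data augmentation

The idea of data augmentation is to generate more training samples from the existing ones.
1. Configure the augmentation parameters of ImageDataGenerator to produce more sample data

datagen = ImageDataGenerator(rotation_range=40, # range (in degrees) for random rotations
                            width_shift_range=0.2, # range for random horizontal shifts, as a fraction of the width
                            height_shift_range=0.2, # range for random vertical shifts, as a fraction of the height
                            shear_range=0.2, # range for random shearing
                            zoom_range=0.2, # range for random zooming
                            horizontal_flip=True, # randomly flip images horizontally
                            fill_mode='nearest' # strategy for filling newly created pixels
                            )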
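
To make the idea concrete without Keras, the sketch below applies two of the transformations to a tiny "image" stored as a list of lists: a horizontal flip and a one-pixel shift whose gap is filled by repeating the edge value (similar in spirit to fill_mode='nearest'). It is a toy illustration under those assumptions, not the ImageDataGenerator implementation, and all function names are invented here.

```python
import random

def horizontal_flip(img):
    """Mirror every row of a 2-D image stored as a list of lists."""
    return [row[::-1] for row in img]

def shift_right(img, pixels):
    """Shift rows right by `pixels`, repeating the left edge value to fill the gap."""
    return [[row[0]] * pixels + row[:len(row) - pixels] for row in img]

def random_augment(img, rng):
    """Apply a random flip followed by a random shift of 0 or 1 pixel."""
    out = horizontal_flip(img) if rng.random() < 0.5 else img
    return shift_right(out, rng.randint(0, 1))

toy_img = [[1, 2, 3],
           [4, 5, 6]]
print(horizontal_flip(toy_img))  # [[3, 2, 1], [6, 5, 4]]
print(shift_right(toy_img, 1))   # [[1, 1, 2], [4, 4, 5]]

# Each call may give a different variant of the same source image
rng = random.Random(0)
variants = [random_augment(toy_img, rng) for _ in range(4)]
```

Every variant still shows the same object, so the label stays valid, but the network never sees exactly the same pixels twice.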

2. Display some randomly augmented images

from keras.preprocessing import image
fnames = [os.path.join(train_cats_dir,fname) for fname in os.listdir(train_cats_dir)]
img_path = fnames[5] # pick one image to augment
img = image.load_img(img_path,target_size=(150,150)) # read the image and resize it
x = image.img_to_array(img) # convert to a Numpy array of shape (150,150,3)
x = x.reshape((1,)+x.shape) # reshape to (1,150,150,3)
i = 0
# datagen.flow loops forever, so stop after four augmented images
for batch in datagen.flow(x,batch_size=1):
    plt.figure(i)
    imgplot = plt.imshow(image.array_to_img(batch[0]))
    i+=1
    if(i%4==0):
        break

[Figures: four randomly augmented versions of the selected cat image]
3. Define a convolutional neural network with dropout

model = models.Sequential()
model.add(layers.Conv2D(32,(3,3),activation='relu',input_shape=(150,150,3)))
model.add(layers.MaxPool2D((2,2)))
model.add(layers.Conv2D(64,(3,3),activation='relu'))
model.add(layers.MaxPool2D((2,2)))
model.add(layers.Conv2D(128,(3,3),activation='relu'))
model.add(layers.MaxPool2D((2,2)))
model.add(layers.Conv2D(128,(3,3),activation='relu'))
model.add(layers.MaxPool2D((2,2)))
model.add(layers.Flatten())
model.add(layers.Dropout(0.5))
model.add(layers.Dense(512,activation='relu'))
model.add(layers.Dense(1,activation='sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=1e-4),
             metrics=['acc'])
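
What `Dropout(0.5)` does during training can be sketched in a few lines of plain Python: each activation is zeroed with probability `rate`, and the survivors are scaled by 1/(1-rate) so that the expected sum stays the same; at inference time the layer passes values through unchanged. This is a simplified illustration of that scheme, not the Keras code, and the function signature is an assumption.

```python
import random

def dropout(values, rate, rng, training=True):
    """Zero each value with probability `rate` and rescale the ones that are kept."""
    if not training:
        return list(values)  # no-op at inference time
    keep = 1.0 - rate
    return [v / keep if rng.random() >= rate else 0.0 for v in values]

rng = random.Random(42)
activations = [1.0] * 1000
dropped = dropout(activations, rate=0.5, rng=rng)

# Roughly half of the units are silenced; the rest are doubled
zero_fraction = dropped.count(0.0) / len(dropped)
```

Because a different random subset of units is silenced on every batch, the dense layer cannot rely on any single feature, which is what limits overfitting.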

4. Train and save the model

train_datagen = ImageDataGenerator(rescale=1./255,
                                  rotation_range=40,
                                  width_shift_range=0.2,
                                  height_shift_range=0.2,
                                  shear_range=0.2,
                                  zoom_range=0.2,
                                  horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(train_dir,
                                                   target_size=(150,150),
                                                   batch_size=32,
                                                   class_mode='binary')
validation_generator = test_datagen.flow_from_directory(validation_dir,
                                                       target_size=(150,150),
                                                       batch_size=32,
                                                       class_mode='binary')
history = model.fit_generator(train_generator,
                             steps_per_epoch=100,
                             epochs=100,
                             validation_data = validation_generator,
                             validation_steps=50)
model.save('cats_and_dogs_small_2.h5')

Found 2000 images belonging to 2 classes.
Found 1000 images belonging to 2 classes.
Epoch 1/100
100/100 [==============================] - 34s 343ms/step - loss: 0.6957 - acc: 0.5144 - val_loss: 0.6873 - val_acc: 0.5025
Epoch 2/100
100/100 [==============================] - 31s 310ms/step - loss: 0.6796 - acc: 0.5509 - val_loss: 0.6696 - val_acc: 0.5715
...
Epoch 100/100
100/100 [==============================] - 31s 307ms/step - loss: 0.3395 - acc: 0.8556 - val_loss: 0.4766 - val_acc: 0.8093

5. Plot the accuracy and loss curves

plot_curve(history)

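[Figure: training and validation accuracy]
[Figure: training and validation loss]
With dropout and data augmentation the model no longer overfits noticeably, and validation accuracy rises to about 0.82 (0.8093 in the last epoch logged above).
VIII. Using a pretrained model

For image problems on small datasets, a more common approach is to use a pretrained model, that is, a model previously trained on a large dataset. For this problem we use VGG16 trained on ImageNet. There are two ways to use a pretrained model: feature extraction and fine-tuning.
IX. Feature extraction

A convolutional neural network consists mainly of convolution and pooling layers, followed by fully connected layers at the end. Feature extraction normally uses the output of the convolution and pooling layers rather than that of the final fully connected layers, because the fully connected output carries less information.

If the new task differs a lot from the task the pretrained model was trained on, prefer the output of its shallower layers. Shallow layers mostly produce concrete features such as lines, shapes and colors, whereas the features from deeper layers are more abstract and tied more closely to the original problem, such as "dog eye" or "cat ear".
1. Load the pretrained VGG16 model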

from keras.applications import VGG16
conv_base = VGG16(weights='../input/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5',include_top=False,input_shape=(150,150,3))
conv_base.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_2 (InputLayer)         (None, 150, 150, 3)       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 150, 150, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 150, 150, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 75, 75, 64)        0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 75, 75, 128)       73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 75, 75, 128)       147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 37, 37, 128)       0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 37, 37, 256)       295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 37, 37, 256)       590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 37, 37, 256)       590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 18, 18, 256)       0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 18, 18, 512)       1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 18, 18, 512)       2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 18, 18, 512)       2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 9, 9, 512)         0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 9, 9, 512)         2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 9, 9, 512)         2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 9, 9, 512)         2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 4, 4, 512)         0         
=================================================================
Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0
_________________________________________________________________
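
The shapes and the parameter total in the summary above can be checked by hand. The sketch below assumes what the summary itself shows: five blocks of 3*3 convolutions that keep the spatial size unchanged, each followed by a 2*2 pooling that halves it (rounding down). `VGG16_BLOCKS` and `vgg16_conv_base_stats` are names made up for this illustration.

```python
# (number of conv layers, filters) for each of the five blocks in the summary
VGG16_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]

def vgg16_conv_base_stats(input_size=150, in_channels=3):
    """Return (final spatial size, final channels, total conv parameters)."""
    size, channels, params = input_size, in_channels, 0
    for n_convs, filters in VGG16_BLOCKS:
        for _ in range(n_convs):
            # 3*3 kernel weights plus one bias per filter
            params += 3 * 3 * channels * filters + filters
            channels = filters
        size //= 2  # 150 -> 75 -> 37 -> 18 -> 9 -> 4
    return size, channels, params

final_size, final_channels, conv_total = vgg16_conv_base_stats()
print(final_size, final_channels, conv_total)  # 4 512 14714688
```

The final (4, 4, 512) shape is what the feature-extraction code relies on when it allocates its feature array.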

2. Fast feature extraction without data augmentation

Extract features with conv_base, then train a fully connected network on the extracted features for classification

datagen = ImageDataGenerator(rescale=1./255)
batch_size = 20
# Extract features for sample_count samples from the given directory
def extract_features(directory,sample_count):
    # The last layer of conv_base outputs (4,4,512), so each sample's features also have shape (4,4,512)
    features = np.zeros(shape=(sample_count,4,4,512))
    labels = np.zeros(shape=(sample_count))
    generator = datagen.flow_from_directory(directory,
                                            target_size=(150,150),
                                           batch_size=batch_size,
                                           class_mode='binary')
    i = 0
    for inputs_batch,labels_batch in generator:
        # Run conv_base on one batch of data to obtain its pretrained features
        features_batch = conv_base.predict(inputs_batch)
        # Store the features of the current batch in features
        features[i*batch_size:(i+1)*batch_size] = features_batch
        labels[i*batch_size:(i+1)*batch_size] = labels_batch
        i += 1
        # The generator loops forever, so stop once sample_count samples have been processed
        if i*batch_size >= sample_count:
            break
    return features,labels

# Extract features with conv_base
train_features,train_labels = extract_features(train_dir,2000)
validation_features,validation_labels = extract_features(validation_dir,1000)
test_features,test_labels = extract_features(test_dir,1000)
# Flatten the features of each sample
train_features = np.reshape(train_features,(2000,4*4*512))
validation_features = np.reshape(validation_features,(1000,4*4*512))
test_features = np.reshape(test_features,(1000,4*4*512))

Found 2000 images belonging to 2 classes.
Found 1000 images belonging to 2 classes.
Found 1000 images belonging to 2 classes.
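
The slice bookkeeping in `extract_features` works cleanly here because every sample count (2000, 1000, 1000) is a multiple of the batch size of 20. As far as I can tell, a count that is not a multiple would leave a final slice shorter than the batch being assigned to it. The sketch below lists the index ranges the loop writes to and rejects such a count up front; `batch_slices` is a hypothetical helper, not part of the original code.

```python
def batch_slices(sample_count, batch_size):
    """Index ranges written by a fill loop like the one in extract_features."""
    if sample_count % batch_size != 0:
        raise ValueError("sample_count should be a multiple of batch_size")
    return [(i * batch_size, (i + 1) * batch_size)
            for i in range(sample_count // batch_size)]

train_slices = batch_slices(2000, 20)
print(len(train_slices))   # 100
print(train_slices[0])     # (0, 20)
print(train_slices[-1])    # (1980, 2000)
```

The flattening step afterwards turns each (4, 4, 512) feature block into a vector of 4*4*512 = 8192 values.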

Train a fully connected network that classifies the features extracted by conv_base

model = models.Sequential()
model.add(layers.Dense(256,activation='relu',input_dim=4*4*512))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(1,activation='sigmoid'))
model.compile(optimizer=optimizers.RMSprop(lr=2e-5),
             loss='binary_crossentropy',
             metrics=['acc'])
history = model.fit(train_features,
                    train_labels,
                    epochs=30,
                   batch_size=20,
                   validation_data=(validation_features,validation_labels))

Train on 2000 samples, validate on 1000 samples
Epoch 1/30
2000/2000 [==============================] - 1s 572us/step - loss: 0.6065 - acc: 0.6715 - val_loss: 0.4410 - val_acc: 0.8410
...
Epoch 30/30
2000/2000 [==============================] - 1s 380us/step - loss: 0.0880 - acc: 0.9725 - val_loss: 0.2372 - val_acc: 0.9000

plot_curve(history)

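[Figure: training and validation accuracy]
[Figure: training and validation loss]
Validation accuracy rises to 0.9.
3. Feature extraction with data augmentation

Stack fully connected classification layers on top of the VGG16 conv_base, then freeze the parameters of conv_base and train only the parameters of the fully connected layers

model = models.Sequential()
# Freeze conv_base, i.e. do not update its parameters during training;
# otherwise the whole model would be retrained
conv_base.trainable=False
# Add fully connected layers on top of conv_base
model.add(conv_base)
model.add(layers.Flatten())
model.add(layers.Dense(256,activation='relu'))
model.add(layers.Dense(1,activation='sigmoid'))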

train_datagen = ImageDataGenerator(rescale=1./255,
                                  rotation_range=40,
                                  width_shift_range=0.2,
                                  height_shift_range=0.2,
                                  shear_range=0.2,
                                  zoom_range=0.2,
                                  horizontal_flip=True,
                                  fill_mode='nearest')
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(train_dir,
                                                   target_size=(150,150),
                                                   batch_size=20,
                                                   class_mode='binary')
validation_generator = test_datagen.flow_from_directory(validation_dir,
                                                       target_size=(150,150),
                                                       batch_size=20,
                                                       class_mode='binary')
model.compile(loss='binary_crossentropy',
             optimizer=optimizers.RMSprop(lr=2e-5),
             metrics=['acc'])
history = model.fit_generator(train_generator,
                             steps_per_epoch=100,
                             epochs=30,
                             validation_data=validation_generator,
                             validation_steps=50)

Found 2000 images belonging to 2 classes.
Found 1000 images belonging to 2 classes.
Epoch 1/30
100/100 [==============================] - 28s 276ms/step - loss: 0.5772 - acc: 0.7040 - val_loss: 0.4572 - val_acc: 0.7970
...
Epoch 30/30
100/100 [==============================] - 25s 255ms/step - loss: 0.2828 - acc: 0.8820 - val_loss: 0.2387 - val_acc: 0.9040
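
Freezing conv_base means that only the two Dense layers are updated in the run above. How large that trainable part is follows from the shapes already shown: the flattened conv_base output has 4*4*512 values, and the frozen part has the 14,714,688 parameters listed in the VGG16 summary. The variable names below are illustrative.

```python
flat_units = 4 * 4 * 512                    # flattened conv_base output: 8192
dense_256_params = flat_units * 256 + 256   # weights + biases of Dense(256)
dense_1_params = 256 * 1 + 1                # weights + bias of Dense(1)

head_params = dense_256_params + dense_1_params
frozen_params = 14714688                    # from the conv_base summary
total = head_params + frozen_params

print(head_params)  # 2097665
print(total)        # 16812353
```

So roughly one eighth of the stacked model's parameters are trained in this step; the rest keep their ImageNet values.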

plot_curve(history)

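[Figure: training and validation accuracy]

[Figure: training and validation loss]
X. Fine-tuning

Another technique for reusing a model is called "fine-tuning"; it complements feature extraction. It consists of the following steps:

    1. Add your own network on top of conv_base;

    2. Freeze the parameters of conv_base;

    3. Train the part of the network you added;

    4. Unfreeze the last few layers of conv_base;

    5. Train the two parts of the network together.

We have already completed the first 3 steps above, so only the last 2 remain.
1. Freeze all layers before block5_conv1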

conv_base.trainable = True
set_trainable = False
for layer in conv_base.layers:
    if(layer.name=='block5_conv1'):
        set_trainable = True
    if(set_trainable):
        layer.trainable = True
    else:
        layer.trainable = False
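
The loop above switches `set_trainable` on when it reaches block5_conv1 and never switches it off again, so block5_conv1 and every layer after it become trainable while all earlier layers stay frozen. The sketch below replays that logic on stand-in objects that carry only a name and a `trainable` flag; the layer names are copied from the summary printed earlier, while `FakeLayer` and `unfreeze_from` are hypothetical.

```python
class FakeLayer:
    """Minimal stand-in for a Keras layer: just a name and a trainable flag."""
    def __init__(self, name):
        self.name = name
        self.trainable = True

VGG16_LAYER_NAMES = [
    "input_2",
    "block1_conv1", "block1_conv2", "block1_pool",
    "block2_conv1", "block2_conv2", "block2_pool",
    "block3_conv1", "block3_conv2", "block3_conv3", "block3_pool",
    "block4_conv1", "block4_conv2", "block4_conv3", "block4_pool",
    "block5_conv1", "block5_conv2", "block5_conv3", "block5_pool",
]

def unfreeze_from(layers, first_trainable_name):
    """Freeze every layer before `first_trainable_name`, unfreeze it and all later ones."""
    set_trainable = False
    for layer in layers:
        if layer.name == first_trainable_name:
            set_trainable = True
        layer.trainable = set_trainable
    return [layer.name for layer in layers if layer.trainable]

fake_layers = [FakeLayer(name) for name in VGG16_LAYER_NAMES]
trainable_names = unfreeze_from(fake_layers, "block5_conv1")
print(trainable_names)
# ['block5_conv1', 'block5_conv2', 'block5_conv3', 'block5_pool']
```

Only the three block5 convolutions actually hold weights, so these are the conv_base parameters that fine-tuning updates.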

2. Fine-tune the layers from block5_conv1 onward

model.compile(loss='binary_crossentropy',
             optimizer=optimizers.RMSprop(lr=1e-5),
             metrics=['acc'])
history = model.fit_generator(train_generator,
                             steps_per_epoch=100,
                             epochs=100,
                             validation_data=validation_generator,
                             validation_steps=50)

Epoch 1/100
100/100 [==============================] - 32s 318ms/step - loss: 0.3015 - acc: 0.8680 - val_loss: 0.2487 - val_acc: 0.9010
Epoch 2/100
100/100 [==============================] - 31s 306ms/step - loss: 0.2454 - acc: 0.8975 - val_loss: 0.2247 - val_acc: 0.9180
...
Epoch 100/100
100/100 [==============================] - 27s 267ms/step - loss: 0.0199 - acc: 0.9920 - val_loss: 0.3396 - val_acc: 0.9340

3. Plot the curves

plot_curve(history)

[Figure: training and validation accuracy]

[Figure: training and validation loss]
XI. Classify the test set

test_generator = test_datagen.flow_from_directory(test_dir,
                                                 target_size=(150,150),
                                                 batch_size=20,
                                                 class_mode='binary')
test_loss,test_acc = model.evaluate_generator(test_generator,steps=50)
print('test_acc:',test_acc)

Found 1000 images belonging to 2 classes.
test_acc: 0.9429999935626984
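
A test accuracy of about 0.943 on 1000 images corresponds to roughly 943 correct predictions. For a single sigmoid output, accuracy is obtained by thresholding the predicted probability at 0.5 and comparing with the 0/1 label. The sketch below shows that computation on made-up toy values; it is a simplified stand-in for the 'acc' metric, not the Keras implementation.

```python
def binary_accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded probability matches the 0/1 label."""
    correct = sum(1 for t, p in zip(y_true, y_prob)
                  if (p > threshold) == bool(t))
    return correct / len(y_true)

# Toy labels and predicted probabilities (illustrative values only)
toy_labels = [0, 0, 1, 1]
toy_probs = [0.1, 0.7, 0.8, 0.4]
toy_acc = binary_accuracy(toy_labels, toy_probs)
print(toy_acc)  # 0.5
```

With `class_mode='binary'`, `flow_from_directory` assigns the labels 0 and 1 to the class subdirectories, here cats and dogs.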

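XII. Summary

1. Neural networks overfit very easily on small datasets.

2. Dropout and data augmentation (implemented through generators) effectively reduce overfitting.

3. Using a pretrained model improves the model's performance.

4. Pretrained model, approach 1: extract features with the pretrained model, then classify them with a custom network.

5. Pretrained model, approach 2: stack the pretrained model and a custom network together, freeze the pretrained model's parameters, and train the parameters of the custom network.

6. Pretrained model, approach 3: fine-tune, i.e. take the model trained in 5, unfreeze part of the pretrained model's frozen parameters, and train again.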
————————————————
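Copyright notice: this is an original article by CSDN blogger "BQW_", licensed under CC 4.0 BY-SA. When reposting, please include the link to the original source and this notice.
Original link: https://blog.csdn.net/bqw18744018044/article/details/83656320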