VGG: Convolutional Neural Networks Using Blocks

Introduction to VGG

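VGG (named after the Visual Geometry Group) is a convolutional neural network proposed by researchers from the Visual Geometry Group at the University of Oxford and Google DeepMind. It performed strongly in the 2014 ImageNet challenge (ILSVRC 2014), placing first in the localization track and second in the classification track. VGG is known for its simple and effective structure. Its core idea is to build a deep network by stacking many small convolution kernels (3x3). Compared with a single large kernel that covers the same receptive field, the stacked small kernels need fewer parameters, while the network stays deep and accurate.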

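Every convolution uses a 3x3 kernel with stride 1 and padding 1, so it keeps the height and width of its input unchanged.
Every max-pooling layer uses a 2x2 window with stride 2, so it halves the height and width.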
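
To make the effect of these settings concrete, here is a small sketch that traces the spatial size of a 224x224 input through five conv-plus-pool stages. The helper names `conv_out` and `pool_out` are made up for this illustration; they only apply the standard output-size formula floor((size + 2p - k) / s) + 1 and do not depend on PyTorch.

```python
def conv_out(size, kernel=3, stride=1, padding=1):
    """Output length along one spatial axis: floor((size + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Max pooling without padding follows the same formula with p = 0."""
    return (size - kernel) // stride + 1


size = 224
sizes = [size]
for _ in range(5):          # the VGG configurations in this post contain five pooling layers
    size = conv_out(size)   # a 3x3, stride-1, padding-1 conv keeps the size
    size = pool_out(size)   # a 2x2, stride-2 pool halves it
    sizes.append(size)

print(sizes)  # [224, 112, 56, 28, 14, 7]
```

The final 7x7 size is why the first fully connected layer in the model code expects 512*7*7 input features.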

VGG16

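VGG16 contains 13 convolutional layers and 3 fully connected layers, 16 weight layers in total, hence the name "VGG16". Convolutional and fully connected layers have learnable weights, whereas pooling layers have none, so pooling layers are not counted.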
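
The same counting rule can be checked against the layer lists used in the model code of this post (the `cfgs` dictionary). The sketch below copies those lists and counts the convolutional entries; the variable names are chosen for this illustration only.

```python
# Layer lists copied from the `cfgs` dictionary in the model code:
# a number is a conv layer's output channels, 'M' is a max-pooling layer.
cfgs = {
    'vgg11': [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    'vgg13': [64, 64, 'M', 128, 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    'vgg16': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M',
              512, 512, 512, 'M', 512, 512, 512, 'M'],
    'vgg19': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M',
              512, 512, 512, 512, 'M', 512, 512, 512, 512, 'M'],
}

NUM_FC = 3  # every variant ends with three fully connected layers

weight_layers = {}
for name, cfg in cfgs.items():
    num_conv = sum(1 for v in cfg if v != 'M')   # pooling entries are skipped
    weight_layers[name] = num_conv + NUM_FC
    print(name, num_conv, "conv +", NUM_FC, "fc =", weight_layers[name])
```

For each variant the total matches the number in its name, for example 13 + 3 = 16 for VGG16.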

Advantages of VGG

VGG stacks several 3x3 convolution kernels in place of one large kernel, which reduces the number of parameters required.

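The paper notes that two stacked 3x3 convolutions can replace one 5x5 convolution, and three stacked 3x3 convolutions can replace one 7x7 convolution, because in each case the receptive field is the same.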

Receptive field

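In a convolutional neural network, the receptive field is the size of the region in the input layer that determines one element of a given layer's output. Put simply, it is the size of the input region that one unit of the output feature map corresponds to.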
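
As a rough illustration, the receptive field of a stack of layers can be computed layer by layer: each layer widens the field by (kernel - 1) times the product of the strides of all earlier layers. The function `receptive_field` below is a hypothetical helper written for this post, not part of any library.

```python
def receptive_field(layers):
    """Receptive field of a stack of layers.

    layers: list of (kernel_size, stride) pairs, ordered from input to output.
    """
    rf = 1     # one output unit initially "sees" one input unit
    jump = 1   # distance in input pixels between adjacent units of the current layer
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


print(receptive_field([(3, 1)] * 2))  # 5, same as one 5x5 convolution
print(receptive_field([(3, 1)] * 3))  # 7, same as one 7x7 convolution
print(receptive_field([(5, 1)]))      # 5
print(receptive_field([(7, 1)]))      # 7
```

With stride 1 throughout, n stacked 3x3 convolutions give a receptive field of 2n + 1.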

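Compare the parameters needed by one 7x7 convolution with those needed by three stacked 3x3 convolutions, assuming both the input and the output have C channels and ignoring biases:
* One 7x7 convolution: 7×7×C×C = 49C^2
* Three stacked 3x3 convolutions: 3×3×C×C + 3×3×C×C + 3×3×C×C = 27C^2
The stacked version needs 27/49 of the parameters, a reduction of roughly 45%.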
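
The arithmetic can be reproduced with a few lines of Python. This is only a sketch: `conv_params` is a helper name invented here, biases are ignored, and C = 512 is an arbitrary example value.

```python
def conv_params(kernel, in_ch, out_ch):
    """Weight count of one kernel x kernel convolution, biases ignored."""
    return kernel * kernel * in_ch * out_ch


C = 512  # example channel count; any positive value gives the same ratios

one_7x7 = conv_params(7, C, C)
three_3x3 = 3 * conv_params(3, C, C)
one_5x5 = conv_params(5, C, C)
two_3x3 = 2 * conv_params(3, C, C)

# Express each count as a multiple of C^2
print(one_7x7 // C**2, three_3x3 // C**2)  # 49 27
print(one_5x5 // C**2, two_3x3 // C**2)    # 25 18
```

The same comparison for the 5x5 case gives 25C^2 against 18C^2, so the saving grows with the size of the kernel being replaced.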

Building the VGG model

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# ********************************************************************************************************************
#       Created:     2024/07/30
#       Filename:    GoogLeNet_model.py
#       Email:       72110902110jq@gmail.com
#       Create By:   coderfjq
#       LastModify:  2024/07/30
# ********************************************************************************************************************
# This code sucks, you know it and I know it.  
# Move on and call me an idiot later.
import torch.nn as nn
import torch

# official pretrain weights
model_urls = {
    'vgg11': 'https://download.pytorch.org/models/vgg11-bbd30ac9.pth',
    'vgg13': 'https://download.pytorch.org/models/vgg13-c768596a.pth',
    'vgg16': 'https://download.pytorch.org/models/vgg16-397923af.pth',
    'vgg19': 'https://download.pytorch.org/models/vgg19-dcbb9e9d.pth'
}


class VGG(nn.Module):
    def __init__(self, features, num_classes=1000, init_weights=False):
        super(VGG, self).__init__()
        self.features = features
        self.classifier = nn.Sequential(
            nn.Linear(512*7*7, 4096),
            nn.ReLU(True),
            nn.Dropout(p=0.5),
            nn.Linear(4096, 4096),
            nn.ReLU(True),
            nn.Dropout(p=0.5),
            nn.Linear(4096, num_classes)
        )
        if init_weights:
            self._initialize_weights()

    def forward(self, x):
        # N x 3 x 224 x 224
        x = self.features(x)
        # N x 512 x 7 x 7
        x = torch.flatten(x, start_dim=1)
        # N x 512*7*7
        x = self.classifier(x)
        return x

    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                # nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
                nn.init.xavier_uniform_(m.weight)
                if m.bias is not None:
                    nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.Linear):
                nn.init.xavier_uniform_(m.weight)
                # nn.init.normal_(m.weight, 0, 0.01)
                nn.init.constant_(m.bias, 0)


def make_features(cfg: list):
    layers = []
    in_channels = 3
    for v in cfg:
        if v == "M":
            layers += [nn.MaxPool2d(kernel_size=2, stride=2)]
        else:
            conv2d = nn.Conv2d(in_channels, v, kernel_size=3, padding=1)
            layers += [conv2d, nn.ReLU(True)]
            in_channels = v
    return nn.Sequential(*layers)


# Each number is the output channel count of a 3x3 conv layer; 'M' is a 2x2 max-pooling layer.
cfgs = {
    'vgg11': [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    'vgg13': [64, 64, 'M', 128, 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    'vgg16': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M', 512, 512, 512, 'M', 512, 512, 512, 'M'],
    'vgg19': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M', 512, 512, 512, 512, 'M', 512, 512, 512, 512, 'M'],
}


def vgg(model_name="vgg16", **kwargs):
    assert model_name in cfgs, "Warning: model number {} not in cfgs dict!".format(model_name)
    cfg = cfgs[model_name]

    model = VGG(make_features(cfg), **kwargs)
    return model
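
As a sanity check that does not require PyTorch, the sketch below counts the learnable parameters implied by a configuration list, following the same structure as `make_features` and the classifier above (3x3 convolutions with biases, then 512*7*7 -> 4096 -> 4096 -> num_classes). The function name `count_vgg_params` is made up for this example, and the 7x7 feature map assumes a 224x224 input.

```python
def count_vgg_params(cfg, num_classes=1000, in_channels=3):
    """Count weights and biases of a VGG built from cfg (224x224 input assumed)."""
    total = 0
    for v in cfg:
        if v == 'M':
            continue                              # pooling layers have no parameters
        total += 3 * 3 * in_channels * v + v      # 3x3 conv weights plus biases
        in_channels = v
    # Classifier: flattened 7x7 feature map -> 4096 -> 4096 -> num_classes
    fc_sizes = [in_channels * 7 * 7, 4096, 4096, num_classes]
    for n_in, n_out in zip(fc_sizes, fc_sizes[1:]):
        total += n_in * n_out + n_out             # linear weights plus biases
    return total


vgg16_cfg = [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M',
             512, 512, 512, 'M', 512, 512, 512, 'M']

n_params = count_vgg_params(vgg16_cfg)
print(f"{n_params:,}")  # 138,357,544
```

If PyTorch is installed, `sum(p.numel() for p in vgg("vgg16").parameters())` on the model defined above should give the same total. Most of the parameters sit in the first fully connected layer, not in the convolutions.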

https://fu-jingqi.github.io/2024/07/25/VGG：使用块的卷积神经网络/

Author: coderfjq

Published: July 25, 2024
