Cipher's Blog

Never really desperate, only the lost of the soul.

MobileNetV2

MobileNet DepthWise Conv

Essay

Publish Date: 2022-07-12

Update Date: 2022-07-15

MobileNetV2: Inverted Residuals and Linear Bottlenecks

Problems Left Over from MobileNetV1

1. Structural problem of V1:

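The structure of MobileNet v1 is actually very simple: the paper uses a rather old-fashioned plain stack of layers, similar to VGG. This kind of structure is not very cost-effective. Later architectures such as ResNet and DenseNet have shown that reusing image features and fusing them with operations such as Concat or element-wise addition (Eltwise+) can greatly improve a network's cost-effectiveness.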

2. Potential problem of depthwise convolution:

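MobileNet v1 is built around depthwise separable convolution. Depthwise convolution does greatly reduce the amount of computation, and an N×N depthwise + 1×1 pointwise structure can come close to an N×N standard convolution in performance. In practice, however, we found that quite a few of the depthwise kernels were empty after training. Our explanation at the time was that each depthwise kernel has a much smaller kernel_dim than a standard convolution kernel; with such a small kernel_dim, combined with the effect of the ReLU activation, a neuron's output easily becomes 0, and the kernel effectively dies. ReLU has zero gradient wherever its output is 0, so once a neuron is stuck at zero output it cannot recover. We also found that this problem is amplified further in fixed-point, low-precision training.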
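To make the claimed saving concrete, here is a minimal sketch that counts parameters and multiply-accumulates (MACs) for a standard convolution versus a depthwise + pointwise pair. The layer sizes (3×3 kernel, 64 to 128 channels, 56×56 feature map) are illustrative assumptions, not values from this post, and the sketch assumes stride 1, "same" padding, and no bias.

```python
def standard_conv_cost(k, c_in, c_out, h, w):
    """Params and MACs of a k x k standard convolution (no bias, stride 1)."""
    params = k * k * c_in * c_out
    macs = params * h * w  # every weight is used once per output position
    return params, macs


def depthwise_separable_cost(k, c_in, c_out, h, w):
    """Params and MACs of a k x k depthwise conv followed by a 1x1 pointwise conv."""
    dw_params = k * k * c_in   # one k x k filter per input channel
    pw_params = c_in * c_out   # 1x1 conv that mixes channels
    params = dw_params + pw_params
    macs = params * h * w
    return params, macs


# Illustrative (assumed) layer: 3x3 kernel, 64 -> 128 channels, 56x56 feature map.
std = standard_conv_cost(3, 64, 128, 56, 56)
sep = depthwise_separable_cost(3, 64, 128, 56, 56)
ratio = sep[0] / std[0]  # equals 1/c_out + 1/k^2

print("standard  params, MACs:", std)
print("separable params, MACs:", sep)
print("separable / standard  :", round(ratio, 4))
```

Under these assumptions the ratio is 1/c_out + 1/k², so for a 3×3 kernel the separable version costs a little more than one ninth of the standard convolution.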

Linear Bottleneck

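To address this information loss, v2 simply replaces the ReLU6 in the last layer of each bottleneck with a linear function. Concretely, in the v2 network the ReLU6 after the final pointwise convolution of every block is replaced with a linear activation. This is the Linear Bottleneck.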

ReLU6 -> Linear
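The information-loss argument can be illustrated with a toy example. The sketch below is an assumption-laden illustration, not the experiment from the paper: it takes a small 2-D integer grid, uses a hand-picked (hypothetical) 2-D to 4-D expansion matrix, and counts how many distinct outputs survive each mapping. Fewer distinct outputs means that different inputs have collapsed onto each other.

```python
from itertools import product


def relu(v):
    """Element-wise ReLU on a tuple of numbers."""
    return tuple(max(0, x) for x in v)


def matvec(rows, v):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return tuple(sum(r * x for r, x in zip(row, v)) for row in rows)


# 25 grid points in 2-D, with coordinates in {-2, ..., 2}.
points = list(product(range(-2, 3), repeat=2))

identity = [(1, 0), (0, 1)]
# Hand-picked expansion for illustration only: each axis gets a + and a - copy.
expand = [(1, 0), (-1, 0), (0, 1), (0, -1)]

linear_low = {matvec(identity, p) for p in points}       # linear map in 2-D
relu_low = {relu(matvec(identity, p)) for p in points}   # ReLU applied in 2-D
relu_high = {relu(matvec(expand, p)) for p in points}    # expand to 4-D, then ReLU

print(len(points), len(linear_low), len(relu_low), len(relu_high))
# prints: 25 25 9 25
```

In this toy setting, ReLU applied directly in the 2-D space merges 25 points into 9, while the linear map keeps all of them, and so does ReLU applied after the expansion. That is the intuition behind keeping the narrow bottleneck output linear and applying the non-linearity only in the expanded space.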

Expansion layer

A 1x1 pointwise (PW) convolution layer increases the number of channels.

Expansion Layer

The expansion factor is denoted t.

Inverted Residuals

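Although MobileNet V1 adopts depthwise separable convolution, its backbone is still a VGG-style plain stack. MobileNet V2 borrows the residual structure from ResNet and adds skip connections on top of the v1 design. In contrast to ResNet's residual block, v2 calls its structure the inverted residual block. The inversion lies in the channel widths: a ResNet bottleneck first reduces the channels to 1/4 of the input, whereas a MobileNet v2 block first expands them to 6 times the input.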
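The block described above can be sketched in PyTorch roughly as follows. This is a minimal illustration under assumptions, not the authors' reference code: the class name `InvertedResidual`, its argument names, and the use of BatchNorm after every convolution are choices made here, and the skip connection is applied only when the stride is 1 and the input and output channel counts match.

```python
import torch
from torch import nn


class InvertedResidual(nn.Module):
    """Sketch of an inverted residual block: expand -> depthwise -> linear project."""

    def __init__(self, in_ch, out_ch, stride, expand_ratio):
        super().__init__()
        hidden = in_ch * expand_ratio
        # The skip connection needs matching input and output shapes.
        self.use_skip = stride == 1 and in_ch == out_ch

        layers = []
        if expand_ratio != 1:
            # 1x1 pointwise expansion followed by ReLU6.
            layers += [
                nn.Conv2d(in_ch, hidden, kernel_size=1, bias=False),
                nn.BatchNorm2d(hidden),
                nn.ReLU6(inplace=True),
            ]
        layers += [
            # 3x3 depthwise convolution (groups == channels) followed by ReLU6.
            nn.Conv2d(hidden, hidden, kernel_size=3, stride=stride,
                      padding=1, groups=hidden, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU6(inplace=True),
            # 1x1 pointwise projection: linear bottleneck, no activation.
            nn.Conv2d(hidden, out_ch, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_ch),
        ]
        self.block = nn.Sequential(*layers)

    def forward(self, x):
        out = self.block(x)
        return x + out if self.use_skip else out


# Usage with assumed shapes: a 32-channel, 56x56 feature map.
x = torch.randn(1, 32, 56, 56)
same = InvertedResidual(32, 32, stride=1, expand_ratio=6)  # uses the skip connection
down = InvertedResidual(32, 64, stride=2, expand_ratio=6)  # downsamples, no skip
y_same = same(x)
y_down = down(x)
print(y_same.shape, y_down.shape)
```

Note that the skip connection joins the two narrow ends of the block, while the wide part sits in the middle. This is the reverse of a ResNet bottleneck, where the skip connects the wide ends.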

ResNet vs MobileNetV2

Architecture of MobileNetV2

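t is the expansion factor applied to the input channels (that is, how many times wider the middle of the block is than its input).

n is the number of times the block is repeated.

c is the number of output channels.

s is the stride of the first repeat of the block (all later repeats use stride 1).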
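As a sketch of how these four numbers are read, the snippet below expands a list of (t, c, n, s) rows into one entry per block. The rows in `SETTINGS` are an assumption: they are the bottleneck stages as commonly listed for MobileNetV2, since the table itself is not reproduced in the text here. The initial 32 input channels (from the first convolution layer) and the helper name `expand_settings` are likewise assumptions made for illustration.

```python
# (t, c, n, s) rows. Assumed values: the bottleneck stages commonly listed
# for MobileNetV2; check them against the paper's table before relying on them.
SETTINGS = [
    (1, 16, 1, 1),
    (6, 24, 2, 2),
    (6, 32, 3, 2),
    (6, 64, 4, 2),
    (6, 96, 3, 1),
    (6, 160, 3, 2),
    (6, 320, 1, 1),
]


def expand_settings(settings, in_ch=32):
    """Turn (t, c, n, s) rows into (in_ch, hidden_ch, out_ch, stride) per block."""
    blocks = []
    for t, c, n, s in settings:
        for i in range(n):
            stride = s if i == 0 else 1  # only the first repeat uses stride s
            blocks.append((in_ch, in_ch * t, c, stride))
            in_ch = c                    # the next block sees c input channels
    return blocks


blocks = expand_settings(SETTINGS)
for block in blocks:
    print(block)
```

For example, the row (6, 24, 2, 2) becomes two blocks: the first takes 16 channels, expands to 96, outputs 24, and uses stride 2; the second takes 24 channels, expands to 144, outputs 24, and uses stride 1.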

Summary

1. A good balance between real-time performance and accuracy.

2. In a low-dimensional space, a linear mapping preserves features, whereas a non-linear mapping destroys them.

cipher

https://dipperbill.github.io/2022/07/12/mobilenetv2/

Unless otherwise stated, all articles in this blog are licensed under CC BY 4.0. If you reproduce this article, please credit the source: cipher.
