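I've forgotten most of the CNN material I taught myself as an undergrad. A recent machine learning assignment needs CNNs (the traditional machine learning course doesn't cover deep learning, argh), so I went back and reviewed the topic on my own. A few points I never really understood before finally clicked, so I'm writing them down (otherwise I'll have forgotten everything again by tomorrow).
Main reference: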
Why CNN?
- The patterns they learn are translation invariant. (Quoted as-is; I couldn't come up with a good translation.) After learning a certain pattern in the lower-right corner of a picture, a convnet can recognize it anywhere: for example, in the upper-left corner. A densely connected network would have to learn the pattern anew if it appeared at a new location. This makes convnets data efficient when processing images (because the visual world is fundamentally translation invariant): they need fewer training samples to learn representations.
- CNNs can learn hierarchical features (hierarchies of patterns). This allows convnets to efficiently learn increasingly complex and abstract visual concepts (because the visual world is fundamentally spatially hierarchical).
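To make the translation point concrete, here is a minimal sketch in plain Python. It is a 1D toy, not a real convnet: the `kernel` values are made up and stand in for learned weights. The same filter gives the same peak response wherever the pattern sits, and only the position of the peak moves (strictly speaking the response shifts along with the input; pooling later discards most of that position information).

```python
def correlate_valid(signal, kernel):
    """Slide `kernel` over `signal` without padding and collect the responses."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

kernel = [1, -1, 1]        # hypothetical filter weights (assumed, not learned)
pattern = [2, 0, 2]        # the pattern this filter responds to most strongly

left = pattern + [0] * 5   # pattern at the start of the signal
right = [0] * 5 + pattern  # same pattern at the end of the signal

resp_left = correlate_valid(left, kernel)    # [4, -2, 2, 0, 0, 0]
resp_right = correlate_valid(right, kernel)  # [0, 0, 0, 2, -2, 4]

# Same peak strength, different peak position.
peak_left = resp_left.index(max(resp_left))
peak_right = resp_right.index(max(resp_right))
print(max(resp_left), peak_left)    # 4 0
print(max(resp_right), peak_right)  # 4 5
```

A dense layer, by contrast, has a separate weight for every input position, so nothing it learns about the pattern at position 0 carries over to position 5.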
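Convolution layer
Counting the parameters (how exactly are things connected?)
If a convolution layer has n input channels and m output channels and uses 3*3 convolution kernels, the layer has n*m*9+m parameters in total: m*n kernels with 9 weights each, plus one bias per output channel.
Purpose:
Extracts features; in essence each kernel is a filter.
padding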
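The notes don't say anything more about padding, so the following is only a rough sketch of what padding changes, namely the output size of a convolution. The function name `conv_output_size` is made up for this example; the two modes mirror the `'valid'` and `'same'` options that Keras `Conv2D` accepts, and the formulas are the standard ones for a single spatial dimension.

```python
import math

def conv_output_size(n, kernel, stride=1, padding="valid"):
    """Output length along one spatial dimension of a convolution.

    'valid': no padding, the kernel must fit entirely inside the input.
    'same':  pad so that the output length is ceil(n / stride).
    """
    if padding == "valid":
        return (n - kernel) // stride + 1
    if padding == "same":
        return math.ceil(n / stride)
    raise ValueError("padding must be 'valid' or 'same'")

print(conv_output_size(28, 3))                  # 26: a 3*3 kernel trims one pixel per side
print(conv_output_size(28, 3, padding="same"))  # 28: padding keeps the size
print(conv_output_size(28, 3, stride=2))        # 13: striding also downsamples
```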
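Pooling layer
Why have pooling layers at all?
- To learn hierarchical features (a hierarchy of features) better: without downsampling, the later convolution layers only get to see a small part of the input, whereas we want the last convolution layer to take in almost the entire input.
- To reduce the number of parameters and speed up training: smaller feature maps mean less computation in later layers and fewer weights in the fully connected layer that follows.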
How to pool?
- max pooling: works best. Why? Features tend to encode the spatial presence of some pattern or concept over the different tiles of the feature map, and it's more informative to look at the maximal presence of different features than at their average presence.
- average pooling
- convolution with stride
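A small sketch of the difference between max and average pooling, in plain Python on a made-up 4*4 feature map. `pool2d` is a hypothetical helper for this example, using non-overlapping 2*2 windows. The top-left window contains one strong activation and the bottom-right window is uniformly weak: average pooling makes them look identical, max pooling keeps them apart.

```python
def pool2d(feature_map, size, reduce):
    """Apply `reduce` to each non-overlapping size x size window."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            window = [
                feature_map[r * size + i][c * size + j]
                for i in range(size)
                for j in range(size)
            ]
            row.append(reduce(window))
        out.append(row)
    return out

def mean(values):
    return sum(values) / len(values)

fmap = [
    [0, 0, 0, 0],
    [0, 8, 0, 0],   # one strong response in the top-left window
    [1, 1, 2, 2],
    [1, 1, 2, 2],   # uniformly weak response in the bottom-right window
]

max_pooled = pool2d(fmap, 2, max)   # [[8, 0], [1, 2]]
avg_pooled = pool2d(fmap, 2, mean)  # [[2.0, 0.0], [1.0, 2.0]]
print(max_pooled)
print(avg_pooled)
```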
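Fully connected layer
Purpose:
I looked up some material online, but very little of it explains fully connected layers in detail. My understanding: the preceding convolution and pooling layers extract high-dimensional features from the input, and the output should be a nonlinear combination of those features; a fully connected network is good at learning that kind of nonlinear mapping. However, fully connected layers can carry a lot of redundant parameters, so NIN proposed GAP (global average pooling) to replace FC.
Keras
Keras provides a high-level API for building neural networks; by default it uses TensorFlow as its backend.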
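Coming back to the GAP idea from the fully connected layer section: in Keras it would correspond to a `layers.GlobalAveragePooling2D()` layer in place of `Flatten` plus a large `Dense` layer. The savings can be estimated in plain Python without running Keras. This is only a sketch under stated assumptions: a final feature map of shape (3, 3, 64), 10 output classes, and a GAP head that is followed by a single dense classifier (that head is an illustrative choice, not necessarily the NIN design).

```python
def dense_params(n_in, n_out):
    """Weights plus one bias per output unit of a fully connected layer."""
    return n_in * n_out + n_out

def global_average_pool(feature_map):
    """feature_map[h][w][c] -> list with the mean of each channel."""
    h, w, c = len(feature_map), len(feature_map[0]), len(feature_map[0][0])
    return [
        sum(feature_map[i][j][ch] for i in range(h) for j in range(w)) / (h * w)
        for ch in range(c)
    ]

# GAP on a tiny 2x2 map with 2 channels: one number per channel survives.
tiny = [[[1, 10], [3, 20]],
        [[5, 30], [7, 40]]]
pooled = global_average_pool(tiny)  # [4.0, 25.0]

# Assumed shapes: (3, 3, 64) feature map, 10 classes.
flatten_head = dense_params(3 * 3 * 64, 64) + dense_params(64, 10)  # Flatten -> Dense(64) -> Dense(10)
gap_head = dense_params(64, 10)                                     # GAP -> Dense(10)
print(pooled, flatten_head, gap_head)  # [4.0, 25.0] 37578 650
```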
A complete, simple CNN

    from keras import layers
    from keras import models

    # Convolutional base: three 3x3 conv layers with 2x2 max pooling in between
    model = models.Sequential()
    model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Conv2D(64, (3, 3), activation='relu'))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Conv2D(64, (3, 3), activation='relu'))

    >>> model.summary()
    ________________________________________________________________
    Layer (type)                     Output Shape          Param #
    ================================================================
    conv2d_1 (Conv2D)                (None, 26, 26, 32)    320
    ________________________________________________________________
    maxpooling2d_1 (MaxPooling2D)    (None, 13, 13, 32)    0
    ________________________________________________________________
    conv2d_2 (Conv2D)                (None, 11, 11, 64)    18496
    ________________________________________________________________
    maxpooling2d_2 (MaxPooling2D)    (None, 5, 5, 64)      0
    ________________________________________________________________
    conv2d_3 (Conv2D)                (None, 3, 3, 64)      36928
    ================================================================
    Total params: 55,744
    Trainable params: 55,744
    Non-trainable params: 0

    # Classifier head: flatten the (3, 3, 64) feature map and feed it to dense layers
    model.add(layers.Flatten())
    model.add(layers.Dense(64, activation='relu'))
    model.add(layers.Dense(10, activation='softmax'))
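The numbers in the summary can be checked by hand against the n*m*9+m formula from the convolution layer section. The sketch below redoes the arithmetic in plain Python; the helper names are made up for this example, and it assumes unpadded 3*3 convolutions with stride 1 and 2*2 max pooling, as in the model above.

```python
def conv2d_params(n_in, n_out, kernel=3):
    """n_in * n_out kernels of kernel x kernel weights, plus one bias per output channel."""
    return n_in * n_out * kernel * kernel + n_out

def valid_conv_size(size, kernel=3):
    """Spatial size after an unpadded, stride-1 convolution."""
    return size - kernel + 1

def max_pool_size(size, pool=2):
    """Spatial size after non-overlapping pooling (leftover rows/columns are dropped)."""
    return size // pool

# Channel counts: 1 input channel, then 32, 64, 64 filters.
channels = [1, 32, 64, 64]
conv_params = [conv2d_params(channels[i], channels[i + 1]) for i in range(3)]
print(conv_params, sum(conv_params))  # [320, 18496, 36928] 55744

# Spatial size through conv, pool, conv, pool, conv, starting from 28.
size = 28
sizes = []
for layer in ["conv", "pool", "conv", "pool", "conv"]:
    size = valid_conv_size(size) if layer == "conv" else max_pool_size(size)
    sizes.append(size)
print(sizes)  # [26, 13, 11, 5, 3]
```

Both results match the `Param #` and `Output Shape` columns printed by `model.summary()`.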