zuloothebest.blogg.se

We need to go deeper 🔧

In the last decade, deep Convolutional Neural Networks have consistently achieved state-of-the-art performance in object recognition tasks, including classification, detection, and segmentation. We need only look at the results of the ILSVRC challenge: since 2012, every winner has been a CNN! Without going into the details of their architectures (a whole blog post would be needed for that), the main improvement we can notice is their depth. Let's introduce them and their main characteristics briefly.


[Figure: ILSVRC challenge results, 2012]

AlexNet, released by Alex Krizhevsky, popularized CNNs in computer vision. Its architecture is very similar to LeNet (introduced by LeCun in the 1990s for hand-written pattern recognition), but it is a larger and more complex network, able to learn complex objects using 5 convolutional layers. It also introduced the use of ReLU as the activation function, dropout to avoid overfitting, and max-pooling layers instead of simple pooling. ZFNet, the ILSVRC 2013 winner, is basically a new version of AlexNet, adjusting the hyper-parameters by increasing the size of the convolutional layers and reducing the stride and filter size of the first layer. Overfeat, also released that year, presents another extension of AlexNet, including an algorithm that learns to create bounding boxes around objects, for detection as well as classification. GoogLeNet was the winner of the 2014 edition, introducing a new module: Inception. It significantly reduced the number of parameters the network has to handle (from 60M for AlexNet to 4M). It also removes the fully connected layers, which usually account for many additional parameters.
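
To make that recipe concrete, here is a minimal sketch of an AlexNet-style stage (convolution, ReLU, max-pooling, dropout). PyTorch is my choice for illustration, and the layer sizes are toy values rather than AlexNet's actual configuration:

    # A minimal sketch of an AlexNet-style stage, assuming PyTorch.
    # The layer sizes are illustrative toys, not AlexNet's real ones.
    import torch
    import torch.nn as nn

    features = nn.Sequential(
        nn.Conv2d(3, 64, kernel_size=11, stride=4),  # large first-layer filters, big stride
        nn.ReLU(inplace=True),                       # ReLU instead of tanh/sigmoid
        nn.MaxPool2d(kernel_size=3, stride=2),       # max-pooling instead of simple pooling
    )
    classifier = nn.Sequential(
        nn.Dropout(p=0.5),            # dropout to fight overfitting
        nn.Linear(64 * 26 * 26, 10),  # toy classifier head
    )

    x = torch.randn(1, 3, 224, 224)   # one fake RGB image
    h = features(x)                   # -> (1, 64, 26, 26)
    y = classifier(h.flatten(1))      # -> (1, 10)
    print(h.shape, y.shape)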


The VGGNets, VGG16 and VGG19 (16 and 19 convolutional layers), from Oxford, are two other great competitors in the 2014 edition. They are innovative in proposing sequences of convolutions and some new CNN configurations. ResNet was the winning network in 2015, introducing a "skip connection" that adds the input of a sequence of convolutional layers to its output. Combined with 1x1 convolutional layers that reduce and then restore the number of features, this design can be extended to an incredible number of layers: the current best version has 152 layers.
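
Both of ResNet's tricks are compact enough to sketch. Below is a minimal, illustrative bottleneck block (assuming PyTorch; the channel counts 256 and 64 are toy values): two 1x1 convolutions squeeze and restore the features around the 3x3 convolution, and the skip connection adds the block's input to its output.

    # A minimal sketch of a ResNet-style bottleneck block.
    import torch
    import torch.nn as nn

    class Bottleneck(nn.Module):
        def __init__(self, channels=256, reduced=64):
            super().__init__()
            self.body = nn.Sequential(
                nn.Conv2d(channels, reduced, kernel_size=1),            # 1x1: reduce features
                nn.ReLU(inplace=True),
                nn.Conv2d(reduced, reduced, kernel_size=3, padding=1),  # 3x3: convolve
                nn.ReLU(inplace=True),
                nn.Conv2d(reduced, channels, kernel_size=1),            # 1x1: restore features
            )

        def forward(self, x):
            return torch.relu(self.body(x) + x)  # skip connection: add input to output

    x = torch.randn(1, 256, 14, 14)
    print(Bottleneck()(x).shape)  # same shape as the input, so blocks stack deeply

The bottleneck is what makes such depth affordable: the 3x3 convolution above has 64 x 64 x 3 x 3 = 36,864 weights, while the same convolution applied directly to all 256 features would need 256 x 256 x 3 x 3 = 589,824.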

CUImage, the winner of the 2016 edition, brought the "ensembles" approach as its main improvement, merging the learnings of multiple networks into a single one. Using this, they were able to merge a Gated Bi-Directional network with a Fast R-CNN to perform both detection and classification.

[Figure: evolution of depth, error rate, and number of parameters over the years]

From AlexNet, the first efficient CNN, with its 5 convolutional layers, to CUImage, whose winning network has 269 layers, we can definitely speak of a real revolution of depth.

Why do we keep going deeper? Even Google's paper about its Inception module is called "Going deeper with convolutions"! So the first question is: why do we have to go deeper? The first reason is that a deeper model applies more convolutions to the input data. When a network performs a convolution on an input, it extracts a relevant feature (mostly edges, shapes, colors, etc.). Allowing the network to perform more convolutions lets it extract, with more precision, the features it "judges" relevant according to the dataset. The second main reason is: because we can.
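
One way to see the first reason concretely is through the receptive field: each extra convolution lets a single output value depend on a larger patch of the input, so deeper stacks can build features from richer context. Here is a small sketch (assuming PyTorch; the 9x9 input and 3x3 kernels are toy choices) that measures this by checking which input pixels receive gradient from one output pixel:

    # Measure the receptive field of the centre output pixel as 3x3
    # convolutions are stacked: the patch it "sees" grows 3 -> 5 -> 7.
    import torch
    import torch.nn as nn

    layers = []
    for depth in range(1, 4):
        layers.append(nn.Conv2d(1, 1, kernel_size=3, padding=1, bias=False))
        x = torch.zeros(1, 1, 9, 9, requires_grad=True)
        y = x
        for layer in layers:
            y = layer(y)
        y[0, 0, 4, 4].backward()              # centre output pixel
        mask = x.grad[0, 0] != 0              # input pixels that influence it
        width = mask.any(dim=0).sum().item()  # width of the influenced patch
        print(f"{depth} conv layer(s): the centre pixel sees a {width}x{width} patch")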
