Let's code and train VGGNet from scratch! In this post, I will walk through implementing this iconic CNN: designing a general architecture, using dense evaluation, optimizing training speed, and finally training the network to reach validation top-1 and top-5 error rates of 28.33% and 9.66%, respectively. I will also compare the error rates and training performance against the original paper and AlexNet.
Continuing our exploration of foundational deep learning models in computer vision, we will dive into the 2014 paper Very Deep Convolutional Networks for Large-Scale Image Recognition by Karen Simonyan and Andrew Zisserman, which introduced VGGNet, a set of simple yet highly performant networks. We will examine its architecture, data processing, training, and testing, and then analyze its results, as a preliminary step toward implementing it.