Beyond binary hyperparameters in deep transfer learning for image classification




Plested, Josephine




Convolutional neural networks (CNNs) have achieved many successes in image classification in recent years. It has been consistently demonstrated that CNNs work best when abundant labelled data is available for the task and large deep models can be trained. However, there are many real-world scenarios in which the large amounts of training data that modern very deep neural networks need for best performance are not available. In these scenarios, transfer learning can help improve performance.
This thesis defines deep transfer learning and the problem it attempts to solve in relation to image classification. It provides a new taxonomy of the applications of transfer learning for image classification. This taxonomy makes it easier to see overarching patterns in where transfer learning has been effective and where it has failed to fulfill its potential. The thesis also provides the first comprehensive review of deep transfer learning as it relates to image classification overall, together with suggestions for future research directions in the field. While there have been recent general surveys of deep transfer learning, and surveys of particular specialised target image classification tasks, none has reviewed the area as a whole. It is important for future progress in the field that all current knowledge on deep transfer learning in image classification is collated and that the overarching patterns are analysed and discussed.
The aim of this thesis is to advance the state of the art in deep transfer learning as it relates to image classification, particularly for datasets where the number of training examples is small. Its main contributions are (1) improvements to best practice that go beyond the standard binary transfer learning practices, (2) heuristics that can be used to predict optimal non-binary transfer learning hyperparameters, and (3) a model that can learn optimal non-binary transfer learning hyperparameters.
Experimental results are provided on a variety of image classification datasets with few training examples. These datasets range from very closely related to the source dataset to far less related. The benefits of the non-binary approach are supported by final results that come close to or exceed state-of-the-art performance on a variety of target datasets on which transfer learning has traditionally not performed well.






Thesis (PhD)
