Transfer learning is an important technique in machine learning and is widely used in fields such as text and images. Here we focus on transfer learning for image classification, often called domain transfer: for example, transferring an ImageNet classification model to a specific image classification task, such as flower classification.
ImageNet is the most widely used dataset for image classification, and a set of empirical hyperparameters has been established for it that yields high accuracy. However, when a model is transferred to a specific dataset, these hyperparameters may no longer be optimal. Two commonly used hyperparameter search methods can help us find better ones.
Grid search, also called exhaustive search, determines the optimal value by evaluating every candidate in the search space. The method is simple and effective, but when the search space is large it consumes enormous computing resources.
Bayesian search, also called Bayesian optimization, starts by randomly sampling a group of hyperparameters from the search space. A Gaussian process is then fitted to the observed results and used to predict the expected mean and variance of untried hyperparameter points. The larger the expected mean, the higher the probability of being close to the optimal solution; the larger the expected variance, the greater the uncertainty. Selecting a point with a large expected mean is called exploitation, while selecting a point with a large variance is called exploration. An acquisition function is defined to balance the two, and the point that maximizes it is chosen as the most promising next candidate.
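A common choice of acquisition function is expected improvement (EI), which combines the posterior mean and standard deviation exactly as described above. The following is a minimal sketch of EI in pure Python (the function name and the `xi` exploration margin are illustrative assumptions, not part of the experiments in this article):

```python
import math

def expected_improvement(mu, sigma, best, xi=0.01):
    """EI acquisition: trades off exploitation (high posterior mean mu)
    against exploration (high posterior std sigma) around the incumbent best."""
    if sigma < 1e-12:
        # No uncertainty left at this point, so no expected improvement.
        return 0.0
    z = (mu - best - xi) / sigma
    # Standard normal CDF and PDF, written with math.erf to avoid dependencies.
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (mu - best - xi) * cdf + sigma * pdf
```

Note how a point whose mean is below the incumbent can still score well if its variance is large, which is what drives exploration.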
Based on these two schemes, we ran experiments with a fixed hyperparameter set and with both search methods on 8 open-source datasets. Following the experimental scheme of [1], we search over 4 hyperparameters; the search space and the experimental results are as follows:
Fixed scheme:
lr=0.003, l2 decay=1e-4, label smoothing=False, mixup=False

Search space:
lr: [0.1, 0.03, 0.01, 0.003, 0.001, 0.0003, 0.0001]
l2 decay: [1e-3, 3e-4, 1e-4, 3e-5, 1e-5, 3e-6, 1e-6]
label smoothing: [False, True]
mixup: [False, True]
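Enumerating this search space directly shows where the grid-search trial count comes from (a small sketch; the variable names are illustrative):

```python
from itertools import product

# The four search dimensions listed above.
lrs = [0.1, 0.03, 0.01, 0.003, 0.001, 0.0003, 0.0001]
l2_decays = [1e-3, 3e-4, 1e-4, 3e-5, 1e-5, 3e-6, 1e-6]
label_smoothing = [False, True]
mixup = [False, True]

# Grid search evaluates every combination: 7 * 7 * 2 * 2 = 196 trials.
grid = list(product(lrs, l2_decays, label_smoothing, mixup))
print(len(grid))
```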
Grid search requires 196 trials (7 × 7 × 2 × 2), while Bayesian search requires roughly ten times fewer. The baseline is trained with the fixed scheme from the ResNet50_vd model pretrained on ImageNet1k. The experimental results are shown below.
Dataset | Fixed scheme | Grid search | Grid search trials | Bayesian search | Bayesian search trials |
---|---|---|---|---|---|
Oxford-IIIT-Pets | 93.64% | 94.55% | 196 | 94.04% | 20 |
Oxford-102-Flowers | 96.08% | 97.69% | 196 | 97.49% | 20 |
Food101 | 87.07% | 87.52% | 196 | 87.33% | 23 |
SUN397 | 63.27% | 64.84% | 196 | 64.55% | 20 |
Caltech101 | 91.71% | 92.54% | 196 | 92.16% | 14 |
DTD | 76.87% | 77.53% | 196 | 77.47% | 13 |
Stanford Cars | 85.14% | 92.72% | 196 | 92.72% | 25 |
FGVC Aircraft | 80.32% | 88.45% | 196 | 88.36% | 20 |
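Two of the searched hyperparameters, label smoothing and mixup, are simple enough to sketch in a few lines. The following is a minimal illustration of the two ideas, not the exact training code used in these experiments:

```python
import random

def smooth_labels(one_hot, epsilon=0.1):
    # Label smoothing: the true class keeps 1 - epsilon of the mass,
    # and epsilon is spread uniformly over all K classes.
    k = len(one_hot)
    return [(1 - epsilon) * y + epsilon / k for y in one_hot]

def mixup(x_a, x_b, alpha=0.2):
    # Mixup: blend two samples with a coefficient drawn from Beta(alpha, alpha).
    lam = random.betavariate(alpha, alpha)
    return [lam * a + (1 - lam) * b for a, b in zip(x_a, x_b)]
```

In training, mixup blends the labels with the same coefficient as the inputs, so the model is fitted to convex combinations of sample pairs.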
In practical applications, training data is often scarce, so a classification model trained on the ImageNet1k dataset is commonly used as the pretrained model for other image classification tasks. To further help solve practical problems, Baidu open-sourced a self-developed large-scale classification pretrained model based on ResNet50_vd, whose training data contains 100,000 categories and 43 million images. The pretrained model can be downloaded here: download link
We conducted transfer learning experiments on 6 self-collected datasets, using a fixed set of hyperparameters and grid search. Training was set to 20 epochs, the ResNet50_vd model was used, and its ImageNet pretraining Top-1 accuracy was 79.12%. The dataset statistics and the comparison of model accuracy are as follows:
Fixed scheme:
lr=0.001, l2 decay=1e-4, label smoothing=False, mixup=False
Dataset | Statistics | Pretrained model on ImageNet Top-1(fixed)/Top-1(search) | Pretrained model on large-scale dataset Top-1(fixed)/Top-1(search) |
---|---|---|---|
Flowers | class:102 train:5789 valid:2396 | 0.7779/0.9883 | 0.9892/0.9954 |
Hand-painted stick figures | class:18 train:1007 valid:432 | 0.8795/0.9196 | 0.9107/0.9219 |
Leaves | class:6 train:5256 valid:2278 | 0.8212/0.8482 | 0.8385/0.8659 |
Container vehicle | class:115 train:4879 valid:2094 | 0.6230/0.9556 | 0.9524/0.9702 |
Chair | class:5 train:169 valid:78 | 0.8557/0.9688 | 0.9077/0.9792 |
Geology | class:4 train:671 valid:296 | 0.5719/0.8094 | 0.6781/0.8219 |
[1] Kornblith, Simon, Jonathon Shlens, and Quoc V. Le. "Do better imagenet models transfer better?." Proceedings of the IEEE conference on computer vision and pattern recognition. 2019.
[2] Kolesnikov, Alexander, et al. "Large Scale Learning of General Visual Representations for Transfer." arXiv preprint arXiv:1912.11370 (2019).