Pre-trained models contain feature-extraction layers whose parameters have already been learned from training data. Models with more parameters tend to be more accurate, but slower and more computationally expensive; models with fewer parameters tend to be less accurate, but faster and cheaper to run.
Here are simplified explanations of the pre-trained models included in the tool. Keep in mind that each model's performance depends heavily on your data, so these summaries won't always hold. (For a concrete way to compare the models' sizes, see the sketch after this list.)
- VGG16 tends to be the most accurate, slowest, and most computationally expensive.
- InceptionResNetV2 tends to balance accuracy, speed, and computational expense, with some bias toward accuracy.
- ResNet50V2 tends to balance accuracy, speed, and computational expense, with some bias toward speed and lower computational expense.
- InceptionV3 tends to be the least accurate (but still quite accurate), fastest, and least computationally expensive.
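If you want to check the size tradeoff yourself, the sketch below prints each model's parameter count. It assumes the Keras versions of these architectures (a common source for them); the function names and the `weights` argument are Keras conventions, not necessarily how this tool loads the models internally.

```python
# Minimal sketch: compare parameter counts of the four architectures.
# Assumes the Keras implementations; the tool may wrap these differently.
from tensorflow.keras.applications import (
    VGG16, InceptionResNetV2, ResNet50V2, InceptionV3,
)

for build in (VGG16, InceptionResNetV2, ResNet50V2, InceptionV3):
    # weights=None builds the architecture without downloading the trained
    # ImageNet weights, which is enough for counting parameters.
    model = build(weights=None)
    print(f"{build.__name__}: {model.count_params():,} parameters")
```

Running this shows VGG16 with far more parameters than the others, which is consistent with it being the slowest and most computationally expensive of the four.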
Each of those models was trained on a dataset that contained over 14 million images with more than 20,000 labels.
Choosing a pre-trained model lets you skip training an entire neural network on your own images. When you use a pre-trained model, you're assuming that your inputs match what the model expects, so you don't need to rebuild a model that would do roughly the same thing (and might even perform worse). Because many image features overlap with the ones these models saw during training, you can often safely assume that a pre-trained model will work with your input.
Use a pre-trained model when you have images with features that match what the pre-trained model expects and you want to avoid training your own model.
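As a concrete illustration of this reuse, here's a minimal sketch of the typical pattern, again assuming the Keras versions of these models: load a model without its classification head and use its already-trained layers as a feature extractor for your own images. The 224x224 input size and the random placeholder batch are illustrative assumptions.

```python
# Sketch of reusing a pre-trained model as a feature extractor (Keras assumed).
import numpy as np
from tensorflow.keras.applications import ResNet50V2
from tensorflow.keras.applications.resnet_v2 import preprocess_input

# include_top=False drops the ImageNet classifier, keeping only the
# convolutional layers with their already-trained weights.
extractor = ResNet50V2(weights="imagenet", include_top=False, pooling="avg")

# Placeholder batch of one 224x224 RGB image; in practice you'd load and
# resize your own images to the input size the model expects.
batch = preprocess_input(np.random.rand(1, 224, 224, 3) * 255.0)

features = extractor.predict(batch)
print(features.shape)  # (1, 2048): one feature vector per image
```

The resulting feature vectors can then feed a small classifier trained on your own labels, rather than training the whole network from scratch.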