A Comparative Study of Inception V3 and InceptionResNetV2 for Bathroom Item Classification with and without ImageNet Pretraining
DOI: https://doi.org/10.61173/3c99b006
Keywords: Inception V3, InceptionResNetV2, ImageNet pretraining, bathroom item dataset
Abstract
Accurate classification of bathroom items presents notable challenges due to the objects’ small sizes, high visual similarity, and frequent background clutter. This study investigates the influence of convolutional neural network architecture and pretraining strategy on the performance of image classification models in such fine-grained scenarios. Two widely used architectures, Inception V3 and InceptionResNetV2, were evaluated under two training regimes: training from scratch and transfer learning via ImageNet pretraining. A curated dataset containing ten categories of common bathroom items was used for training and testing. Model performance was quantitatively assessed using overall accuracy, macro-averaged precision, recall, and F1-score, alongside qualitative analysis through confusion matrices. Experimental results demonstrate that ImageNet pretraining significantly enhances model performance across all metrics. InceptionResNetV2 with ImageNet weights achieved the highest accuracy of 96.19%, while models trained from random initialization showed unstable convergence and poor generalization, often collapsing into predicting a single dominant class. The superior performance of pretrained models is attributed to the reuse of domain-invariant features learned from large-scale datasets, which serve as effective initializations for downstream tasks with limited labeled data. These findings confirm the effectiveness of transfer learning in small-sample visual classification and highlight the additional benefit of residual connections in deeper architectures when fine-tuning on domain-specific tasks.
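The abstract reports overall accuracy together with macro-averaged precision, recall, and F1-score derived from confusion matrices. As a minimal sketch of how these metrics relate to a confusion matrix, the following standalone Python snippet computes them for a hypothetical 3-class matrix (the matrix, class count, and values are illustrative only, not the paper's data):

```python
# Sketch: overall accuracy and macro-averaged precision/recall/F1
# from a confusion matrix. Rows = true class, columns = predicted class.

def macro_metrics(cm):
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))  # diagonal = correct predictions
    accuracy = correct / total
    precisions, recalls, f1s = [], [], []
    for i in range(n):
        tp = cm[i][i]
        pred_i = sum(cm[r][i] for r in range(n))  # samples predicted as class i
        true_i = sum(cm[i])                       # samples actually of class i
        p = tp / pred_i if pred_i else 0.0
        r = tp / true_i if true_i else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    # Macro-averaging weights every class equally, regardless of support --
    # important when a model collapses into predicting one dominant class.
    macro = lambda xs: sum(xs) / n
    return accuracy, macro(precisions), macro(recalls), macro(f1s)

# Toy 3-class confusion matrix (illustrative only)
cm = [[8, 1, 1],
      [0, 9, 1],
      [2, 0, 8]]
acc, p, r, f1 = macro_metrics(cm)
print(f"accuracy={acc:.3f} macro_P={p:.3f} macro_R={r:.3f} macro_F1={f1:.3f}")
```

Macro averaging (rather than micro or weighted averaging) is the appropriate choice here because it exposes the degenerate single-class behavior the abstract describes: a model that predicts only the dominant class can still score a moderate accuracy but will have a very low macro F1.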