Food classification is the basic step of image-based dietary assessment: it predicts the types of food in each input image. However, foods in real-world scenarios typically follow a long-tailed distribution, in which a small number of food types are consumed far more frequently than the rest. This causes severe class imbalance and degrades overall performance. In addition, none of the existing long-tailed classification methods focuses on food data, which can be more challenging because of the inter-class similarity and intra-class diversity of food images. In this work, we introduce two new benchmark datasets for long-tailed food classification, Food101-LT and VFN-LT, where the sample counts in VFN-LT follow a real-world long-tailed food distribution. We then propose a novel two-phase framework that addresses class imbalance by (1) undersampling the head classes to remove redundant samples while maintaining the learned information through knowledge distillation and (2) oversampling the tail classes through visually aware data augmentation. Comparisons with existing state-of-the-art long-tailed classification methods show the effectiveness of the proposed framework, which achieves the best performance on both the Food101-LT and VFN-LT datasets. The results demonstrate the potential of applying the proposed method to related real-life applications.
Keywords: benchmark datasets; food classification; food consumption frequency; image-based dietary assessment; long-tail distribution; neural networks.
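The rebalancing idea behind the two phases can be illustrated with a minimal sketch. It is not the proposed framework: random selection stands in for redundant-sample removal, index repetition stands in for visually aware augmentation, and knowledge distillation is omitted entirely. The function name `rebalance`, the `target` parameter, and the toy class names are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter, defaultdict


def rebalance(labels, target, seed=0):
    """Return sample indices with every class resampled to `target` items.

    Assumption: head classes (more than `target` samples) are undersampled
    at random, and tail classes (fewer than `target`) are oversampled by
    repeating indices. Both are simple stand-ins, not the paper's method.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    selected = []
    for label in sorted(by_class):
        indices = by_class[label]
        if len(indices) >= target:
            # Head class: keep a random subset of size `target`.
            selected.extend(rng.sample(indices, target))
        else:
            # Tail class: keep every sample, then repeat some to reach `target`.
            extra = rng.choices(indices, k=target - len(indices))
            selected.extend(indices + extra)
    return selected


# Toy long-tailed label list (hypothetical classes and counts).
labels = ["rice"] * 50 + ["bread"] * 10 + ["durian"] * 2
balanced = rebalance(labels, target=10)
counts = Counter(labels[i] for i in balanced)
```

After the call, `counts` holds ten samples for each of the three toy classes. In a real pipeline, the repeated tail indices would each pass through an augmentation step so that the duplicates are not pixel-identical.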