CMOSE: Comprehensive Multi-Modality Online Student Engagement Dataset with High-Quality Labels

CVPR, ABAW Workshop 2024


Chi-Hsuan Wu1, Shih-Yang Liu1, Xijie Huang1, Xingbo Wang1, Rong Zhang1, Luca Minciullo2, Wong Kai Yiu2, Kenny Kwan2, Kwang-Ting Cheng1

Abstract: Online learning is a rapidly growing industry. However, a major concern about online learning is whether students are as engaged as they would be in face-to-face classes. An engagement recognition system can notify instructors of a student's condition and improve the learning experience. Current challenges in engagement detection include poor label quality, extreme data imbalance, and intra-class variety, i.e., the variety of behaviors exhibited at a given engagement level. To address these problems, we present the CMOSE dataset, which contains a large amount of data across all engagement levels, with high-quality labels annotated following psychological advice. We also propose MocoRank, a training mechanism that handles the intra-class variety and the ordinal relationship among engagement classes. MocoRank outperforms prior engagement detection frameworks, achieving a 1.32% increase in overall accuracy and a 5.05% improvement in average accuracy. Furthermore, we demonstrate the effectiveness of multi-modality in engagement detection by combining video features with speech and audio features. Data transferability experiments also show that the proposed CMOSE dataset provides superior label quality and behavior diversity.
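To make the idea of an ordinal, momentum-based training objective concrete, below is a minimal, hypothetical PyTorch sketch. It is not the paper's MocoRank implementation; the class and function names (OrdinalRanker, ordinal_margin_loss), the margin value, and the momentum coefficient are illustrative assumptions. It only shows the general pattern the abstract alludes to: a momentum-updated encoder paired with a pairwise ranking loss that respects the ordering of engagement levels, applied here to fused multi-modal features.

```python
# Hypothetical sketch only: an ordinal margin ranking loss with a momentum-updated
# encoder. Names and hyperparameters are assumptions, not the paper's MocoRank.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F


class OrdinalRanker(nn.Module):
    """Scores a clip's engagement; a higher score should mean higher engagement."""

    def __init__(self, encoder: nn.Module, feat_dim: int, momentum: float = 0.999):
        super().__init__()
        self.encoder_q = encoder                 # online (query) encoder
        self.encoder_k = copy.deepcopy(encoder)  # momentum (key) encoder
        for p in self.encoder_k.parameters():
            p.requires_grad = False              # updated only via momentum, not backprop
        self.head = nn.Linear(feat_dim, 1)       # maps features to a scalar engagement score
        self.m = momentum

    @torch.no_grad()
    def momentum_update(self):
        # In a MoCo-style pipeline the key encoder would feed a feature queue of
        # stable references; that part is omitted here for brevity.
        for q, k in zip(self.encoder_q.parameters(), self.encoder_k.parameters()):
            k.data.mul_(self.m).add_(q.data, alpha=1.0 - self.m)

    def forward(self, x):
        return self.head(self.encoder_q(x)).squeeze(-1)


def ordinal_margin_loss(scores, levels, margin: float = 0.5):
    """Pairwise hinge loss: a clip with a higher engagement level should score
    at least `margin` above a clip with a lower level."""
    diff_score = scores.unsqueeze(1) - scores.unsqueeze(0)  # [i, j] = s_i - s_j
    diff_level = levels.unsqueeze(1) - levels.unsqueeze(0)  # [i, j] = y_i - y_j
    mask = (diff_level > 0).float()                         # pairs where clip i outranks clip j
    loss = F.relu(margin - diff_score) * mask               # penalize violations of the ordering
    return loss.sum() / mask.sum().clamp(min=1.0)


if __name__ == "__main__":
    # Toy usage: 8 clips with 128-d features (e.g., concatenated video + audio/speech
    # features) and engagement labels in {0, 1, 2, 3}.
    encoder = nn.Sequential(nn.Linear(128, 64), nn.ReLU())
    model = OrdinalRanker(encoder, feat_dim=64)
    x = torch.randn(8, 128)
    y = torch.randint(0, 4, (8,)).float()
    loss = ordinal_margin_loss(model(x), y)
    loss.backward()
    model.momentum_update()
    print(float(loss))
```

The pairwise formulation is one generic way to encode the ordinal structure of engagement classes without assuming equal spacing between levels; the actual MocoRank objective and multi-modal fusion design are described in the paper.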

Supplementary video

BibTeX

@InProceedings{Wu_2024_CVPR,
    author    = {Wu, Chi-Hsuan and Liu, Shih-Yang and Huang, Xijie and Wang, Xingbo and Zhang, Rong and Minciullo, Luca and Yiu, Wong Kai and Kwan, Kenny and Cheng, Kwang-Ting},
    title     = {CMOSE: Comprehensive Multi-Modality Online Student Engagement Dataset with High-Quality Labels},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2024},
    pages     = {4636-4645}
}