Development of Unobtrusive Affective Computing Framework for Students’ Engagement Analysis in Classroom Environment
Date
2020
Authors
S, Ashwin T.
Publisher
National Institute of Technology Karnataka, Surathkal
Abstract
Pervasive intelligent learning environments can be made more personalized by adapting teaching strategies to the students' emotional and behavioral engagement. The students' engagement analysis helps to foster those emotions and behavioral patterns that are beneficial to learning, thus improving the effectiveness of the teaching-learning process. The students' emotional and behavioral patterns must be recognized unobtrusively using learning-centered emotions (engaged, confused, frustrated, and so on) and engagement levels (looking away from the tutor or board, eyes completely closed, and so on).
Recognizing both the behavioral and emotional engagement of students from image data in the wild (obtained from classrooms) is a challenging task. Using multiple modalities enhances the performance of affective state classification, but recognizing the facial expressions, hand gestures, and body posture of each student in a classroom environment is another challenge. Here, classification of affective states alone is not sufficient; object localization also plays a vital role. Both the classification and the object localization should be robust enough to handle various image variants such as occlusion, background clutter, pose, illumination, cultural and regional background, intra-class variations, cropped images, multiple viewpoints, and deformations.
The most popular state-of-the-art classification and localization techniques are machine learning and deep learning techniques, which depend on a database for ground truth. A standard database containing data from different learning environments with multiple modalities is therefore also required. Hence, in this research work, different deep learning architectures are proposed to classify the students' affective states with object localization. A standard database of students' multimodal affective states is created and benchmarked. The students' affective states obtained from the proposed real-time affective state classification method are used as feedback to the teacher in order to enhance the teaching-learning process in four different learning environments, namely e-learning, classrooms, webinars, and flipped classrooms. The contributions of this thesis are as follows.

A real-time students' emotional engagement analysis method is proposed for both individual students and groups of students, based on their facial expressions, hand gestures, and body postures, for e-learning, flipped classroom, classroom, and webinar environments.
Both basic and learning-centered emotions are used in the study. Various CNN-based architectures are proposed to predict the students' emotional engagement. A students' behavioral engagement analysis method is also proposed and implemented in classrooms and computer-enabled teaching laboratories. The proposed scale-invariant, context-assisted, single-shot CNN architecture performs well for multiple students in a single image frame. A single group engagement level score for each frame is obtained using the proposed feature fusion technique.
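The idea of reducing per-student, per-modality engagement scores to one group-level score per frame can be sketched as follows. The weights, score ranges, and averaging scheme here are illustrative assumptions only; the thesis's actual feature fusion technique operates on learned CNN features rather than scalar scores.

```python
# Illustrative sketch: fuse per-student (face, gesture, posture) engagement
# scores into a single group engagement score for one frame.
# Weights are hypothetical, not the thesis's actual fusion parameters.

def fuse_group_engagement(per_student_scores, weights=(0.5, 0.3, 0.2)):
    """per_student_scores: list of (face, gesture, posture) scores in [0, 1]."""
    w_face, w_gesture, w_posture = weights
    student_scores = [
        w_face * f + w_gesture * g + w_posture * p
        for f, g, p in per_student_scores
    ]
    # One group score per frame: the mean over all detected students.
    return sum(student_scores) / len(student_scores)

frame = [(0.9, 0.8, 0.7), (0.4, 0.5, 0.6)]  # two students in one frame
print(round(fuse_group_engagement(frame), 3))
```

A weighted late fusion like this is only one possible design; early fusion of the modality features before classification is an equally plausible reading of the abstract.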
The proposed model effectively classifies the students' affective states into teacher-centric attentive and inattentive affective states. Inquiry interventions are proposed to address the negative impact of inattentive affective states on students' performance. Experimental results demonstrated a positive correlation between the students' learning rate and their attentive affective state engagement score, for both individual students and groups of students. Further, an affective state transition diagram and visualizations are proposed to help the students and the teachers improve the teaching-learning process.
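One common way to build such a transition diagram is to count transitions between consecutively observed states; the edge weights of the diagram are then the transition counts (or the normalized probabilities). The counting scheme below is an illustrative assumption, not the thesis's method; the state labels are examples taken from the abstract.

```python
from collections import Counter

def transition_counts(states):
    """Count transitions between consecutive affective states."""
    return Counter(zip(states, states[1:]))

# Example per-frame affective states for one student (labels from the abstract).
seq = ["engaged", "engaged", "confused", "engaged", "frustrated"]
counts = transition_counts(seq)
print(counts[("engaged", "confused")])  # 1
```

Each key in the counter is a directed edge of the transition diagram; dividing each count by the total transitions out of its source state would yield a Markov-style transition probability matrix.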
A multimodal database is created for both e-learning (a single student in a single image frame) and classroom environments (multiple students in a single image frame) using the students' facial expressions, hand gestures, and body postures. Both posed and spontaneous expressions are collected to make the training set more robust. Various image variants are also considered during dataset creation. Annotations are performed using a gold-standard study for eleven different affective states and four different engagement levels. Object localization is performed on each modality of every student, and the bounding box coordinates are stored along with the affective state/engagement level. This database is benchmarked with various popular classification algorithms and state-of-the-art deep learning architectures.
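An annotation record of the kind described, one bounding box per modality per student, paired with an affective state and engagement level, might look like the sketch below. The field names and coordinate convention are assumptions for illustration; the thesis's actual database schema may differ.

```python
from dataclasses import dataclass

# Hypothetical layout for one annotation record; field names are illustrative
# and not taken from the thesis's actual database schema.
@dataclass
class Annotation:
    student_id: int
    modality: str            # e.g. "face", "hand_gesture", or "body_posture"
    bbox: tuple              # (x_min, y_min, x_max, y_max) in pixels
    affective_state: str     # one of the eleven annotated affective states
    engagement_level: int    # one of the four annotated engagement levels

ann = Annotation(3, "face", (120, 80, 210, 190), "confused", 2)
print(ann.affective_state, ann.bbox)
```

Storing the box coordinates alongside the labels is what allows the database to benchmark object localization as well as classification.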
Keywords
Department of Information Technology, Affective Computing, Affect Sensing and Analysis, Behavioral Patterns, Classroom Data in the Wild, Computer Vision, Multimodal Analysis, Student Engagement Analysis