Linguistic-Based Mild Cognitive Impairment Detection Using Informative Loss

Introduction

I propose a deep learning approach that uses Natural Language Processing (NLP) techniques to differentiate between Mild Cognitive Impairment (MCI) and Normal Cognition (NC) in older adults. The framework analyzes transcripts of video interviews conducted as part of the I-CONECT study, a randomized controlled trial aimed at improving cognitive function through video chats. It comprises two Transformer-based modules: Sentence Embedding (SE) and Sentence Cross Attention (SCA). The SE module captures contextual relationships among the words within each sentence, and the SCA module extracts temporal features from the resulting sequence of sentences. A Multi-Layer Perceptron (MLP) then uses these features to classify each subject as MCI or NC. To make the model more robust, I introduce a novel loss function called InfoLoss, which takes into account the entropy reduction observed in each sentence sequence to improve classification accuracy. Evaluated on the I-CONECT dataset, the framework achieves an average area under the curve (AUC) of 84.75% in distinguishing between MCI and NC.

Architecture

The framework comprises two Transformer-based modules: Sentence Embedding (SE) and Sentence Cross Attention (SCA). Each interview transcript is a collection of sentences, which may be complete sentences or phrases. The SE module is applied to each sentence to capture the relationships between its words and produce a corresponding sentence embedding vector. The embeddings from one interview are then treated as a time series, on which the SCA module operates to extract the linguistic features used to distinguish between MCI and NC.

[Figure: framework architecture]
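
The data flow described above can be illustrated with a small, dependency-free sketch. This is not the trained model: `embed_sentence` is a hypothetical stand-in for the SE module (it averages deterministic pseudo-random word vectors instead of running a Transformer), `self_attention` is plain scaled dot-product self-attention with no learned weights standing in for the SCA module, and a single logistic unit replaces the MLP. All function names, the embedding size, and the classifier weights are assumptions made for illustration only.

```python
import math
import random


def embed_sentence(sentence, dim=8):
    """Stand-in for the SE module: average of per-word pseudo-random vectors."""
    vectors = []
    for word in sentence.lower().split():
        rng = random.Random(word)  # seeding by word keeps each word's vector fixed
        vectors.append([rng.uniform(-1.0, 1.0) for _ in range(dim)])
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(sequence):
    """Stand-in for the SCA module: scaled dot-product attention, no learned weights."""
    dim = len(sequence[0])
    attended = []
    for query in sequence:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in sequence]
        weights = softmax(scores)
        attended.append([sum(w * value[i] for w, value in zip(weights, sequence))
                         for i in range(dim)])
    return attended


def classify(interview, weights, bias=0.0):
    """Embed each sentence, attend over the sequence, pool, and score with a logistic unit."""
    embeddings = [embed_sentence(sentence) for sentence in interview]
    attended = self_attention(embeddings)
    pooled = [sum(column) / len(attended) for column in zip(*attended)]
    logit = sum(w * x for w, x in zip(weights, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-logit))  # illustrative probability of the MCI class


interview = ["i went to the store yesterday", "it was um quite nice"]
probability = classify(interview, weights=[0.1] * 8)
```

The point of the sketch is the shape of the pipeline: one vector per sentence, attention across the sentence sequence, then a single subject-level prediction.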

InfoLoss

While the Cross-Entropy (CE) loss function is commonly used for classification tasks, I aim to enhance the discriminative capability of the framework. To that end, I introduce a novel loss function termed Informative Loss (InfoLoss), which considers the uncertainty associated with the ground truth labels during classification. InfoLoss uses the number of sentences each participant produces in their interview as an uncertainty factor. I demonstrate that incorporating this uncertainty factor leads to more accurate classification results.
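
To make the idea of a sentence-count uncertainty factor concrete, here is a minimal sketch of a cross-entropy loss whose per-subject terms are weighted by how many sentences the subject produced. This is not the InfoLoss formula, which is not spelled out in this README; the function name `sentence_weighted_bce`, the `log1p` weighting, and the normalization to a mean weight of 1 are all assumptions chosen only to show how such a factor could enter a loss.

```python
import math


def sentence_weighted_bce(probs, labels, sentence_counts, eps=1e-12):
    """Binary cross-entropy with hypothetical per-subject weights from sentence counts."""
    raw = [math.log1p(count) for count in sentence_counts]
    mean_raw = sum(raw) / len(raw)
    # Normalize so the weights average to 1; fall back to uniform if all counts are 0.
    weights = [r / mean_raw for r in raw] if mean_raw > 0 else [1.0] * len(raw)

    total = 0.0
    for p, y, w in zip(probs, labels, weights):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


# With equal sentence counts the weights are all 1, so this reduces to plain BCE.
equal_counts_loss = sentence_weighted_bce([0.9, 0.2], [1, 0], [10, 10])

# Here the subject with more sentences is also the less confident prediction,
# so its error is weighted more heavily than under plain BCE.
skewed_counts_loss = sentence_weighted_bce([0.5, 0.9], [1, 1], [100, 5])
```

Under this assumed weighting, subjects with longer interviews contribute more to the loss, on the premise that more sentences give more evidence about the label.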


Accuracy and AUC:

[Figure: accuracy and AUC]

Confusion Matrix:

[Figure: confusion matrix]

Accuracy with respect to gender:

[Figure: accuracy with respect to gender]