Authors: Tanisha Khurana, Kaushik Pillalamarri, Vikram Pande, Munindar Singh
Abstract: This paper explores humor detection through a linguistic lens,
prioritizing linguistic features over computational methods in Natural
Language Processing. We group these features into syntactic, semantic, and
contextual dimensions, drawing on lexicons, structural statistics, Word2Vec,
WordNet, and phonetic style. Our proposed model, Colbert, uses BERT
embeddings and parallel hidden layers to capture sentence congruity. We
train Colbert for humor detection on the combined syntactic, semantic, and
contextual features. Feature engineering examines essential syntactic and semantic
features alongside BERT embeddings. SHAP interpretations and decision trees
identify influential features, revealing that a holistic approach improves
humor detection accuracy on unseen data. Integrating linguistic cues from
different dimensions enhances the model’s ability to capture the complexity
of humor beyond traditional computational methods.
Source: http://arxiv.org/abs/2408.06335v1