Measuring Praise and Criticism: Inference of Semantic Orientation from Association

Abstract: This research develops an automated system for measuring semantic orientation. The semantic orientation of a word is positive when the word has good associations (praise) and negative when it has bad associations (criticism). The system infers orientation from the statistical association of a word with a set of positive paradigm words and a set of negative paradigm words. Two measures of association, pointwise mutual information (PMI) and latent semantic analysis (LSA), are evaluated for their effectiveness in inferring semantic orientation. The system is tested on 3,596 words, including adjectives, adverbs, nouns, and verbs, that have been manually labeled as positive or negative. Accuracy is 82.8% on the full test set and rises above 95% when the algorithm is allowed to abstain from classifying mild words.

Research Question: Can we develop an automated system for measuring semantic orientation based on the statistical association of words with positive and negative paradigm words?

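Methodology: The study infers semantic orientation from association: a word's orientation is the strength of its association with a set of positive paradigm words minus the strength of its association with a set of negative paradigm words. Two measures of association are evaluated: pointwise mutual information (PMI) and latent semantic analysis (LSA). PMI compares the probability of observing two words together with the probability expected if they occurred independently, so a high value means the words co-occur more often than chance would predict. LSA applies singular value decomposition to a word-by-context matrix, projecting words into a lower-dimensional space in which the similarity of two words is measured by the cosine of the angle between their vectors. Both methods are evaluated on a set of 3,596 words manually labeled as positive or negative.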
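The PMI-based calculation can be illustrated with a minimal Python sketch. It is not the study's implementation: the two-word paradigm sets, the toy corpus, document-level co-occurrence counting, and the smoothing constant are all assumptions made for illustration, and the names count_cooccurrences, pmi, and so_pmi are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

# Illustrative paradigm words (a small set assumed for this sketch).
POSITIVE = ("good", "excellent")
NEGATIVE = ("bad", "poor")


def count_cooccurrences(documents):
    """Return document frequencies for single words and for unordered word pairs."""
    word_df, pair_df = Counter(), Counter()
    for doc in documents:
        tokens = sorted(set(doc.lower().split()))
        word_df.update(tokens)
        pair_df.update(frozenset(pair) for pair in combinations(tokens, 2))
    return word_df, pair_df


def pmi(w1, w2, word_df, pair_df, n_docs, smoothing=0.01):
    """PMI = log2(p(w1, w2) / (p(w1) * p(w2))), estimated from document counts.

    The small additive constant (an assumption of this sketch) keeps the
    logarithm defined when the two words never co-occur.
    """
    joint = pair_df[frozenset((w1, w2))] + smoothing
    return math.log2(joint * n_docs / (word_df[w1] * word_df[w2]))


def so_pmi(word, word_df, pair_df, n_docs):
    """Association with positive paradigm words minus association with negative ones."""
    if word_df[word] == 0:
        raise KeyError(f"{word!r} does not occur in the corpus")
    # Paradigm words that are absent from the corpus are skipped.
    pos = sum(pmi(word, p, word_df, pair_df, n_docs) for p in POSITIVE if word_df[p])
    neg = sum(pmi(word, n, word_df, pair_df, n_docs) for n in NEGATIVE if word_df[n])
    return pos - neg


# Toy corpus, invented for illustration only.
DOCUMENTS = [
    "the food was good and delicious",
    "excellent service and delicious dessert",
    "a good and excellent meal",
    "the room was bad and filthy",
    "poor lighting and a filthy floor",
    "bad service and poor value",
]
WORD_DF, PAIR_DF = count_cooccurrences(DOCUMENTS)

for w in ("delicious", "filthy"):
    score = so_pmi(w, WORD_DF, PAIR_DF, len(DOCUMENTS))
    print(w, "positive" if score > 0 else "negative", round(score, 2))
```

The sign of the score gives the predicted orientation and its magnitude gives the strength. A corpus this small only shows the mechanics; reliable estimates need far more text.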

Results: The study finds that both PMI and LSA can accurately infer semantic orientation. The PMI method achieves an accuracy of 82.8% on the full test set, while the LSA method performs even better, with an accuracy of 84.4%. Accuracy increases further when the methods are allowed to abstain from classifying mild words, that is, words whose computed orientation is close to zero; the abstract reports accuracy above 95% in that setting.
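The effect of abstaining on mild words can be sketched in Python as follows. The scores, labels, and threshold are invented for illustration and are not taken from the study, and accuracy_with_abstention is a hypothetical helper that assumes abstention is decided by a cutoff on the absolute orientation score.

```python
def accuracy_with_abstention(scored, threshold=0.0):
    """Return (accuracy, coverage) when words with |score| < threshold are skipped.

    scored: list of (orientation_score, gold_label) pairs, gold_label in {+1, -1}.
    A word is classified as positive when its score is greater than zero.
    """
    kept = [(score, gold) for score, gold in scored if abs(score) >= threshold]
    if not kept:
        return 0.0, 0.0
    correct = sum(1 for score, gold in kept if (score > 0) == (gold > 0))
    return correct / len(kept), len(kept) / len(scored)


# Hypothetical (score, gold label) pairs; mild words have scores near zero.
SCORED = [
    (5.2, +1), (-4.8, -1), (3.1, +1), (-2.9, -1),
    (-0.4, +1), (0.3, -1), (0.2, +1), (-0.1, -1),
]

print(accuracy_with_abstention(SCORED))                 # (0.75, 1.0)
print(accuracy_with_abstention(SCORED, threshold=1.0))  # (1.0, 0.5)
```

In this toy data the errors fall on words with scores near zero, so raising the threshold trades coverage for accuracy, which is the pattern the results describe.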

Implications: The development of an automated system for measuring semantic orientation has practical applications in various fields, such as text classification, text filtering, opinion tracking in online discussions, and automated chat systems. The study's findings suggest that both PMI and LSA can be used to accurately infer semantic orientation, which can help improve the performance of these applications.

Conclusion: This research demonstrates that an automated system can measure semantic orientation from the statistical association of words with positive and negative paradigm words. Both PMI and LSA were found to be effective in inferring semantic orientation, with LSA performing slightly better. The findings have important implications for applications that rely on the semantic understanding of text.

Link to Article: https://arxiv.org/abs/0309034v1

Authors:

arXiv ID: 0309034v1