Abstract
|
Social media now connects billions of people around the world who interact and share data, information, feelings, and opinions on a wide range of topics. The result is a huge volume of user-generated data that can be mined to explore opinions and analyze emotions. Sentiment analysis aims to classify the polarity of a text according to the writer's opinion, revealing positive, negative, or neutral sentiment toward a particular subject. It is used in marketing, customer service, and other fields. State-of-the-art approaches to sentiment analysis fall into two categories. The first relies on machine learning and data mining techniques, training a model on a set of labeled data. The second, lexicon-based category assigns each word a weight according to its polarity and identifies sentiment by comparing the words of a text against pre-built lexicons.
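As a rough illustration of the lexicon-based idea, the sketch below sums per-word polarity weights and maps the total to a label. The lexicon entries, the `classify` function, and the zero threshold are all hypothetical choices made for this example; none of them is taken from the lexicons evaluated in this study.

```python
# Toy lexicon: hypothetical words and weights, for illustration only.
TOY_LEXICON = {
    "good": 2.0, "great": 3.0, "love": 3.0,
    "bad": -2.0, "awful": -3.0, "hate": -3.0,
}


def classify(text, lexicon=TOY_LEXICON, threshold=0.0):
    """Label a text by summing the weights of the words found in the lexicon."""
    # Naive tokenization (an assumption): lowercase, split on whitespace,
    # and strip common punctuation from the ends of each token.
    tokens = [tok.strip(".,!?") for tok in text.lower().split()]
    # Words missing from the lexicon contribute nothing to the score.
    score = sum(lexicon.get(tok, 0.0) for tok in tokens)
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"
```

Real lexicons typically go further than this, for example by handling negation, intensifiers, or emoticons, which this sketch deliberately omits.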
This study follows the lexicon-based methodology. It focuses on five of the most important and well-known lexicons used for sentiment analysis of Twitter data: VADER, SentiWordNet, SentiStrength, the Liu and Hu opinion lexicon, and AFINN-111.
It assesses the performance of these lexicons in Twitter polarity classification by comparing overall classification accuracy and F1-measure. The results show that the VADER lexicon achieved higher classification accuracy for positive and negative sentiments.
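For readers unfamiliar with the two metrics, the following minimal sketch shows how overall accuracy and a per-class F1-measure could be computed from gold and predicted labels. The function names and the sample labels are hypothetical and are not results from this study.

```python
def accuracy(gold, pred):
    """Fraction of items whose predicted label matches the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def f1_for_class(gold, pred, label):
    """Harmonic mean of precision and recall for a single class."""
    pairs = list(zip(gold, pred))
    tp = sum(g == label and p == label for g, p in pairs)
    fp = sum(g != label and p == label for g, p in pairs)
    fn = sum(g == label and p != label for g, p in pairs)
    if tp == 0:
        # No true positives: treat F1 as 0 to avoid dividing by zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels, for illustration only.
gold = ["positive", "negative", "neutral", "positive"]
pred = ["positive", "negative", "positive", "negative"]
print(accuracy(gold, pred))
print(f1_for_class(gold, pred, "positive"))
```

Reporting F1 per class alongside accuracy helps when the classes are imbalanced, since accuracy alone can hide poor performance on a minority class.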
|