Keywords Based Temporal Sentiment Analysis

July 4, 2017 | Author: Nishantha Medagoda | Category: Machine Learning, Sentiment Analysis, Text Mining

Nishantha Medagoda, Subana Shanmuganathan
School of Computer and Mathematical Sciences, Auckland University of Technology, Auckland, New Zealand

Abstract: Free texts, such as comments by customers or readers, contain not only sentiments about the topic being discussed but also temporal trends in those sentiments. Sentiment detection automatically estimates the polarity of a comment as positive, negative or sometimes neutral. Temporal sentiment analysis, on the other hand, is an investigation of the sentiment pattern within a given time period. We propose a method for investigating these temporal patterns using keywords in the comments. Using sentiment classification techniques and keyword clustering, we were able to relate the patterns to a few major events that occurred during the period of investigation (19 November – 20 December 2014). The results show how temporal sentiment analysis can be used to establish changes in public opinion relating to issues and events in a historically important election campaign in a developing country. They also reveal information on the change of opinions during this election campaign that is impossible to learn by other means.

Keywords: Sentiment Analysis; Opinion Mining; Temporal Analysis

I. INTRODUCTION

A given opinion can be classified as positive (P), negative (N) or neutral (O), depending on its polarity towards or against the topic being discussed. On some occasions an opinion says nothing about the topic; such opinions can be considered objective (or neutral), and the procedure of identifying them is known as subjectivity classification [1]. Despite the increased use of opinion mining and sentiment analysis for finding views on products and services, and even in politics to find out public interests, studies on opinion-based temporal investigation are limited [2].

Temporal opinion mining is a relatively new field, known by many names such as time-aware opinion mining, temporal sentiment analysis, opinion change mining and opinion tracking. It is similar to traditional time series analysis in that it studies and detects possible changes in specific opinions and sentiments relating to a theme or event over given periods of time [3]. If a pattern continues over time, the sentiment intensity can be forecast. In the context of temporal sentiment analysis, however, the process is about determining the change in sentiment on a given topic over two or more distinct time intervals within the total period being studied. It also involves relating the change in opinion over time to some interesting event in order to compare or discuss the respective changes. The benefit of temporal sentiment analysis therefore becomes more apparent when it is combined with an understanding of why the underlying changes occurred.

In view of the above, in this paper we look at how public opinion changed over a period of time (19 November – 20 December 2014) prior to a historically important presidential election in Sri Lanka, a developing country that is still considered a pivotal country in the South Asian region.

II. RELATED WORK

Opinion mining and sentiment analysis studies began in the early part of this decade, but relatively few works have addressed temporal variation in opinion mining. Nonetheless, the temporal aspect of sentiment analysis is currently among the most attractive and challenging research fields in the NLP community [1].

Temporal opinion mining adds time awareness to opinion mining techniques in general. In a study by Fukuhara et al. [4], the authors produced two sets of graphs by analysing news articles written in Japanese. The first, named the “topic graph”, showed the trends in topics associated with a sentiment. The second, the “sentiment graph”, showed the trends of sentiments and their correlation to the topics. In both cases they used sentiment phrases, described as patterns of sentiment expressions. The authors extracted 383 Japanese sentiment expressions and classified them into 8 categories. The topic graph plotted the Dice coefficient against time, where the Dice coefficient is the correlation between keywords extracted from articles that contained sentiment phrases of a defined category. The graph for the sentiment category “happy” showed clear peaks at different time intervals, which were associated with the topics studied in that work. The sentiment graph plotted the sum of the frequencies of all sentiment phrases against time. The graph associated with “earthquake” for the fourth quarter of 2004 was found to be correlated with bursts of “shock” and “anxiety”.

In another study, Alves et al. [2] discussed the distribution of positive and negative tweets during FIFA's Confederations Cup, which took place in Brazil in 2013. Tweets written in Portuguese were extracted using Twitter's API and then classified into three categories using Naive Bayes classification. In that supervised classification, the manually compiled training data set also included the emoticons in the tweets. The temporal patterns of the number of positive and negative tweets during the period from April 12th to August 12 were presented graphically and discussed. The polarity of the positive comments was also presented graphically, but the method of calculating the polarity score was not explained in the paper. The word clouds of the most frequent words in positive and negative tweets were each found to be dominated by some unique words.

The study “From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series” by O'Connor et al. [5] compared a time series approach with independent polls. In that study, the results of surveys on consumer confidence and political opinion over 2008 and 2009 were found to correlate with the sentiment word frequencies of Twitter messages published in the same period. First, messages containing manually specified keywords relating to the topics were retrieved from Twitter. Then the sentiment score for a given day was calculated, with the aid of a subjectivity lexicon, as the ratio of positive to negative messages on the topic. A message was defined as positive if it contained any positive word and negative if it contained any negative word. However, on a casual inspection using the calculated sentiment score, the authors found many examples of misclassified comments. They explained that this may have been caused by wrong matching of the part of speech of some words. They also mentioned that recall was very low because of the mismatch between the well-defined words in the lexicon and the informal language used in Twitter messages. The sentiment ratio for keywords was compared with different measures of consumer confidence. The graphical presentation showed that the sentiment ratio captured the trend in the survey data.
The survey data was found to be reasonably correlated with the sentiment score, with a correlation coefficient of 0.73.

III. TEXT DATA

The text data considered for this study comes from comments given by online readers of news articles written in Sinhala in a popular newspaper called “Lankadeepa” (http://www.lankadeepa.lk/). The comments were written on articles related to recent political changes. The time span of the study is approximately one month, from 19th November to 20th December 2014. During this period, when the presidential election campaign was taking place, readers were more inclined to write their views on political topics than on any other news. Several candidates contested the presidential election, but the candidates from the two major parties were at the forefront of the campaign. Therefore, the articles selected from the political domain, which had the highest number of comments, were related to these two main candidates. Although the comments were made at different times of the day, for the purpose of this temporal sentiment analysis they were grouped into 24-hour periods.
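The grouping of comments into 24-hour periods can be illustrated with a minimal sketch. The paper does not say where the bin boundaries fall, so calendar days (midnight to midnight) are assumed here; the function name `comments_per_day`, the timestamp format and the sample timestamps are all hypothetical and not taken from the study's data.

```python
from collections import Counter
from datetime import datetime

def comments_per_day(timestamps, fmt="%Y-%m-%d %H:%M"):
    """Count comments per calendar day (assumed 24-hour bin)."""
    days = [datetime.strptime(ts, fmt).date() for ts in timestamps]
    # Sort by date so the result reads as a time series.
    return dict(sorted(Counter(days).items()))

# Hypothetical timestamps for illustration only.
sample = ["2014-11-19 08:15", "2014-11-19 21:40", "2014-11-20 00:05"]
daily_counts = comments_per_day(sample)
for day, n in daily_counts.items():
    print(day.isoformat(), n)
```

A different bin origin (for example, 24 hours counted from the publication time of each article) would only change how the day key is derived.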

Figure 1. (a) Number of comments for each article selected; (b) Number of comments during the period

These comments were collected using a web crawler targeting the cascading style sheet (CSS) selectors that tag the comments on the page. Using this method, 635 comments were extracted, with an average length of 16 words per comment and lengths ranging between 2 and 295 words.

IV. DATA PRE-PROCESSING

A. Data Cleaning

Pre-processing describes any type of processing performed on raw data before it is input into the main processing procedure. Commonly used pre-processing methods are data transformation, noise removal and normalization. In this algorithm, the most important pre-processing task is cleaning the comments. The cleaning process involves removing punctuation marks, correcting spelling mistakes and removing gibberish. Punctuation marks do not carry any meaning in unstructured text such as reader comments, so their removal prior to further analysis can be justified. The algorithm uses the bag of words as its key feature; hence, the unique word list was created after applying the spelling correction process. Words without any meaning (gibberish) were removed next to complete the cleaning of the comments. In addition, comments written in transliterated form were converted to the native language.

B. Coding the Comments into Sentiment Categories

With the aim of applying supervised classification techniques to the comments, the collected comments were manually classified by native speakers into three sentiment categories: positive, negative and neutral. If a comment was supportive of the news article, it was classified as positive; if it opposed the contents of the article, it was classified as negative. If the comment took the form of an objective statement, it was coded as neutral.

C. Eliminating Function/Stop Words

Function (grammatical) words are words that have little meaning but are essential for maintaining grammatical relationships with other words [6]. Function words, also known as stop words, include prepositions, pronouns, auxiliary verbs, conjunctions, grammatical articles and particles. For a given language, the set of function words is closed and freely available. In text analysis, these words are dropped in order to reduce the dimension of the feature vector. Moreover, as function words contribute little to the meaning, it is reasonable to remove them. In this research, however, stop words relating to negation, such as (no, not) and (can't), were not removed, because the sense of these words strongly affects the overall meaning and the sentiment of a comment.

V. EXPERIMENTS FOR UNDERSTANDING THE COMMENTS

The experiments conducted on the comments can be divided into two sections. Firstly, we investigated the data in general by understanding the temporal distribution of the comments and presenting the findings graphically. Secondly, more sophisticated text analysis techniques were carried out on the data set. This involved some machine learning methods mainly aimed at clustering and classification of the opinions.

A. Keyword Distribution

Keywords are non-grammatical terms in a written document that explain the main concept of the text. In order to understand a set of documents or comments, it is essential to select the keywords.

Definition 1. Keyword selection: If D is a collection of documents and K is a set of keywords, D/K is the subset of documents in D that are labelled with all of the keywords in K [7].

The frequencies of all the words in the comments, after removing the stop words, were calculated to generate the list of keywords. A minimum frequency of 1 indicates that the word is less important to the concept being discussed in the comments. On the other hand, a word with maximum frequency is highly correlated with the concept. It is also noted that the distribution follows Zipf's law, in that 80% of the words have low frequencies and 20% show the highest frequency counts [8].

Figure 2: Keywords versus their Frequencies (plot labels: 17%, 83%)

The list of keywords for further analysis was selected by removing the words with fewer occurrences. The threshold for this selection was decided by considering Zipf's law and the graph of keywords versus frequencies (figure 2). It was decided to use the words with frequencies greater than 3 as the keywords for the rest of the analysis. This selection accounts for 17% of the total keywords.

B. Temporal Keyword Distribution

The keyword list selected for this work and the word distribution are presented in Section V-A. With this list of words, keyword frequencies at different timestamps were investigated to identify temporal variations. First, the keyword count was calculated per comment, at the time the comment was recorded. Next, time was normalized to a day by grouping the comments into 24-hour periods. The keyword frequencies in both cases are presented in the following graphs (figures 3 and 4).
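As a rough illustration of the keyword selection and the per-comment keyword count, the sketch below keeps words that occur more than 3 times after stop-word removal. It assumes whitespace tokenisation; the function names and the English placeholder tokens are hypothetical stand-ins for the Sinhala word list used in the study.

```python
from collections import Counter

def select_keywords(comments, stop_words, min_count=4):
    """Return {word: frequency} for non-stop words seen at least min_count
    times, i.e. with frequency greater than 3 for the default value."""
    freq = Counter(w for c in comments for w in c.split() if w not in stop_words)
    return {w: n for w, n in freq.items() if n >= min_count}

def keywords_per_comment(comments, keywords):
    """Count keyword occurrences in each comment."""
    return [sum(1 for w in c.split() if w in keywords) for c in comments]

# Hypothetical placeholder comments for illustration only.
comments = ["vote new vote", "new vote the way", "vote new", "new vote clear"]
keywords = select_keywords(comments, stop_words={"the"})
counts = keywords_per_comment(comments, keywords)
print(keywords)  # vote occurs 5 times, new occurs 4 times
print(counts)    # keyword occurrences per comment
```

Summing the per-comment counts within each 24-hour period would give the per-day keyword frequency described in the text.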

Figure 3: Keywords per comment

Based on figure 3, the average number of keywords per comment is 17, and only 3 comments have no keywords. The majority of comments have between 2 and 20 keywords. The distribution of keywords over the normalized timestamps is given in figure 4.

Figure 4: Keywords per day

Figure 4 clearly indicates the declining trend of keyword densities through the period studied, 19 November to 13 December 2014.

C. Sentiment Distribution

Next, we investigated the temporal patterns in polarity: the counts of positive (P), negative (N) and neutral (O) comments on each day.

Figure 5: Sentiment Distribution on each day

As can be seen from the histograms showing the sentiment distribution between 19 November and 13 December 2014 (figure 5), on most days the number of positive (P) comments is higher than that of the other two classes (N and O). Next, we compared the number of comments of each polarity per day with the keyword densities over the same period. The polarity counts are approximately inversely proportional to the keyword density (figure 6); i.e., on a day when the keyword density is higher, the frequency of positive, negative or neutral comments is lower, and vice versa. From this pattern, it is reasonable to conclude that the polarity of a comment, whether positive, negative or neutral, does not depend on the number of keywords it contains.

Figure 6: Keyword density and Sentiment on each day (N-Negative, P-Positive, O-Neutral)

With the idea in mind that the quantity of keywords in a comment does not affect its polarity, we continued the experiment to test how the nature of the keywords correlates with polarity and with the time the comment was posted. Initially, we investigated named entities by counting how often the names of the main candidates appeared in the comments. Figure 7 shows how often the names of the two main candidates (Mahinda Rajapaksha, candidate 1, and Maithiripala Srisena, candidate 2) were included in the comments on each day. The two candidate names follow similar patterns, except that candidate 2 appeared more often than candidate 1 throughout the study period, apart from 8, 11 and 13 December.

Figure 7: Fraction of comments containing candidate name
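The per-day fraction of comments mentioning each candidate can be sketched as below. Simple substring matching is assumed, which may differ from the name matching actually used in the study; the function name and the sample comments (with placeholder names `candidate1` and `candidate2`) are hypothetical.

```python
from collections import defaultdict

def name_fraction_per_day(comments, names):
    """comments: list of (day, text) pairs.
    Returns {day: {name: fraction of that day's comments containing name}}."""
    totals = defaultdict(int)
    hits = defaultdict(lambda: defaultdict(int))
    for day, text in comments:
        totals[day] += 1
        for name in names:
            if name in text:  # assumed: plain substring match
                hits[day][name] += 1
    return {
        day: {name: hits[day][name] / totals[day] for name in names}
        for day in sorted(totals)
    }

# Hypothetical comments for illustration only.
sample = [
    ("2014-12-08", "candidate1 spoke today"),
    ("2014-12-08", "candidate2 and candidate1 debate"),
    ("2014-12-09", "no names here"),
    ("2014-12-09", "candidate2 rally"),
]
fractions = name_fraction_per_day(sample, ["candidate1", "candidate2"])
for day, row in fractions.items():
    print(day, row)
```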

VI. TEXT ANALYSIS

In this section, we carry out an in-depth analysis of the collected comments using more advanced text analysis methodologies. The analysis began by calculating the sentiment score using a sentiment lexicon developed by the authors in an earlier study [9] using SentiWordNet 3.0 [10]. The bag-of-words approach was then applied using tf/idf features to verify the classification accuracy. In the final step, keyword clusters were identified by applying standard clustering techniques such as K-means and the self-organizing map (SOM).

A. Sentiment Score Investigation Using a Sentiment Lexicon

The sentiment score for each comment was calculated using the sentiment lexicon developed in [9]. A sentiment lexicon usually contains a special set of words for a language, each with a positive or negative polarity score. The polarity score, also known as valence, is a scale used to determine the polarity strength of a word present in a comment. Hence, a comment (a bunch of words) can be categorized as positive, negative or objective by adding the polarity scores of the sentiment words in it. In this experiment, the sentiment score of all the adjectives and adverbs in each comment was calculated using the scores in the lexicon. We considered only adjectives and adverbs because these two are the most important parts of speech when analyzing sentiment in any language [11].

The change in the total (algebraic sum) score over the period concerned is shown as a trajectory in figure 8. A significant drop in the total score was observed on 10th December, when an event of crucial political change happened. This event was against candidate 2, and the day was also observed to have the highest number of positive comments during the whole period. Furthermore, on the same day the keyword density reached its second maximum for the period studied.
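The lexicon-based scoring described in this subsection can be sketched as follows. The lexicon entries, the part-of-speech tags and the rule that maps the sign of the sum to P, N or O are assumptions made for illustration; the paper does not state the exact decision rule, and the real lexicon is the Sinhala resource from [9].

```python
def comment_score(tagged_words, lexicon):
    """Algebraic sum of lexicon scores over the adjectives and adverbs
    of one comment. tagged_words is a list of (word, pos) pairs."""
    return sum(
        lexicon.get(word, 0.0)
        for word, pos in tagged_words
        if pos in ("ADJ", "ADV")
    )

def label(score):
    """Assumed rule: the sign of the summed score gives the category."""
    if score > 0:
        return "P"
    if score < 0:
        return "N"
    return "O"

# Hypothetical lexicon and tagged comments for illustration only.
lexicon = {"good": 0.5, "clearly": 0.25, "bad": -0.75}
comment_a = [("good", "ADJ"), ("vote", "NOUN"), ("clearly", "ADV")]
comment_b = [("bad", "ADJ"), ("good", "NOUN")]  # "good" tagged as a noun is ignored

score_a = comment_score(comment_a, lexicon)
score_b = comment_score(comment_b, lexicon)
daily_total = score_a + score_b  # total score for a day with these two comments
print(score_a, label(score_a))
print(score_b, label(score_b))
print(daily_total)
```

Computing `daily_total` for every day in the study period would give a trajectory of the kind plotted in figure 8.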

Figure 8: Total sentiment score over the time period

The sentiment scores for the positive and negative adjectives were calculated and examined in detail using graphs. On 22nd November, the positive and negative graphs reached, respectively, the highest and lowest scores observed over the study period (figure 9).

Figure 9: (a) Total sentiment score for positive adjectives; (b) Total sentiment score for negative adjectives

On 22nd November there were 92 opinions recorded and, according to figure 5, the negative comments outnumbered the positive and neutral ones. Furthermore, the keyword density was at its maximum on this day. From this observation we can conclude that the positive adjective score dominates the polarity of the negative comments. By studying the same behavior on 10th December, it is clear that the positive polarity is controlled by the negative adjective score. A similar investigation was done on adverbs, and the graphs for both positive and negative adverbs are given in figure 10.

Figure 10: (a) Total sentiment score for positive adverbs; (b) Total sentiment score for negative adverbs

The sum of the positive adverb scores in negative comments is generally higher than the same sum in positive and neutral comments (see figure 10). A similar pattern is shown by some of the negative sentiment scores (in absolute value) for negative comments.

B. Classification Using the Bag-of-Words Method

The bag-of-words method is the most popular approach to classification and clustering in text mining [12]. In this method, a set of keywords is selected as a bag of words, and the feature vector is then constructed using these words. A feature is a quantitative measure derived from the data. In this experiment, tf/idf is used as the feature weighting. Here tf denotes the term frequency, which refers to the number of times a given term appears in a comment. This value is normalized to avoid a bias towards long comments and to reflect the importance of the word within the comment, and it is calculated using the following equation:

$$tf_{ij} = \frac{n_{ij}}{\sum_{k} n_{kj}}$$

where $n_{ij}$ is the number of times the term $t_i$ appears in the comment $C_j$, and the denominator is the total number of words in the comment $C_j$.

The inverse document frequency (idf) is a measure of the general importance of the term. It is obtained by dividing the total number of comments by the number of comments containing the term, and then taking the logarithm of the quotient:

$$idf_i = \log \frac{|C|}{|\{j : t_i \in C_j\}|}$$

where $|C|$ is the total number of comments considered and $|\{j : t_i \in C_j\}|$ is the number of comments in which the term $t_i$ appears. A division by zero occurs when the term $t_i$ is not present in any comment. To avoid this, one can change the denominator to $1 + |\{j : t_i \in C_j\}|$. Then the tf/idf weight is

$$tfidf_{ij} = tf_{ij} \times idf_i$$

Hierarchical clusters were then generated for the full data set, and the dendrogram for the 88 keywords is given in figure 13.
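The tf/idf weighting defined in this subsection can be sketched in a few lines. The sketch assumes the natural logarithm (the paper does not state the base) and uses the unsmoothed denominator, which is safe here because only terms that occur in at least one comment are scored. The function name and the placeholder tokens are hypothetical.

```python
import math

def tf_idf(comments):
    """comments: list of token lists.
    Returns one {term: tf * idf} dict per comment."""
    n_comments = len(comments)
    # Number of comments in which each term appears.
    doc_freq = {}
    for tokens in comments:
        for term in set(tokens):
            doc_freq[term] = doc_freq.get(term, 0) + 1
    weights = []
    for tokens in comments:
        row = {}
        for term in set(tokens):
            tf = tokens.count(term) / len(tokens)          # normalized term frequency
            idf = math.log(n_comments / doc_freq[term])    # assumed natural log
            row[term] = tf * idf
        weights.append(row)
    return weights

# Hypothetical placeholder tokens for illustration only.
docs = [["new", "vote", "vote"], ["new", "way"]]
weights = tf_idf(docs)
for row in weights:
    print(row)
```

A term that appears in every comment (here `new`) gets idf 0 and therefore weight 0, which is the intended effect of the idf factor.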

In this experiment, word clusters were generated using the bag-of-words method explained above. Initially, the keywords with frequency greater than 3 were selected as the word list for the algorithm. This list contained 712 keywords, which is a significantly large list. Hence, the dimension of the keyword list was reduced prior to applying the clustering methods. As adjectives and adverbs are the most crucial for sentiment analysis, we decided to reduce the keyword list by filtering it to adjectives and adverbs alone [11]. There were 88 adverbs and adjectives in the 635 comments; however, only 316 comments included at least one adjective or adverb. The tf/idf value of each keyword in each comment was then calculated. Clustering using the k-means algorithm was first tested on the data set (bag of words versus normalized frequency) to establish the ideal number of clusters. To determine the number of clusters, the within-group sum of squares for different numbers of clusters was plotted (figure 11).
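The within-group sum of squares for a range of cluster counts can be sketched with a small pure-Python k-means. This is only an illustration under stated assumptions: the paper does not name the software or the initialisation it used, so the first k points are taken as initial centres purely to keep the run deterministic, and the two-dimensional sample points are hypothetical rather than tf/idf vectors from the study.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_wss(points, k, iters=50):
    """Run Lloyd's k-means (assumed init: first k points) and return the
    within-group sum of squares of the final partition."""
    centres = [list(p) for p in points[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centres[c]))
            groups[nearest].append(p)
        # Recompute each centre as the mean of its group; keep it if the group is empty.
        new_centres = [
            [sum(col) / len(g) for col in zip(*g)] if g else centres[i]
            for i, g in enumerate(groups)
        ]
        if new_centres == centres:
            break
        centres = new_centres
    return sum(min(sq_dist(p, c) for c in centres) for p in points)

# Hypothetical points forming two well-separated groups.
points = [(0.0, 0.0), (10.0, 0.0), (0.0, 1.0), (10.0, 1.0)]
wss = [kmeans_wss(points, k) for k in range(1, 5)]
print(wss)
```

In this toy data the value falls sharply from k = 1 to k = 2 and then flattens, which is the "elbow" one looks for in such a plot; the absence of such a drop is what the text reports for the comment data.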

Figure 11: Within group sum of squares

There was no significant drop in the within-group sum of squares (figure 11), which indicates that the clusters are more hierarchical than distinct in nature.

Figure 12. Main three clusters

Keywords of the 316 comments were grouped into 4 distinct clusters (figure 12). Group two contains 38 keywords, most of which are qualitative adjectives and adverbs. Similarly, most of the words in group three were found to be quantitative in nature, and 39 keywords were negative adjectives.

Figure 13. Nature of the clusters (labels: Future Oriented, Quantitative, Qualitative)

C. Temporal Cluster Analysis

The clustering process was continued with the aim of investigating the clusters for each day. The tf/idf values calculated for the comments (see section 6.1) and the adjective/adverb clusters generated for each day were used in this clustering. The keyword clusters on day 1, 19th November 2014, are shown in figure 14.

Figure 14: Keyword cluster-Day1

An exceptional cluster is observed on day 1 (figure 14) among the 88 keywords (W10, W51, W72, W81, W84 and W88). This cluster consists mostly of the negative words in the adjective and adverb keyword list. Interestingly, the majority of the comments on this day related to the news about the new contender for the presidential election (candidate 2). The experimental clustering of the keywords for the other days was carried out in a similar manner. On day 3, words with positive quantities were grouped into a cluster (figure 15). The words represented in this cluster were W6, W7, W26, W27, W28, W59, W79 and W85. The contextual meanings of these words are (new), (direction), (collection), (way), (vote), (large), (correct) and (clear). Except for the words (direction) and (vote), all the others correlate with positive concepts. The word (vote) can be a noun or an adjective in this context.

Figure 15: Keyword cluster-Day 3

Another two significant clusters were observed on day 6: one was a grouping of keywords with a positive sense and the other was of negative adjectives. No significant grouping was observed on the rest of the days within the period studied. Based on the above observations and interpretations of the SOM clustering performed on the complete comment data set, the following can be summarized:

• Out of 635 comments, only 316 included at least one adjective or adverb.
• In the clustering of the keywords in these 316 comments, different groupings of the adjectives and adverbs were formed. The keywords were clearly separated into adjectives and adverbs, and further separated based on the qualities of both. These qualities explain the polarity, comparative and qualitative forms of the adjectives and adverbs.
• As an example, the adjectives of positive polarity (real) and (respectable or important) form a cluster. Similarly, another grouping was found with the positive adverbs (correctly) and (largely). The adjective (collection) is also in the same group, in the sense of positive polarity.

VII. CONCLUSIONS

The aim of this initial study was to investigate the temporal variations of free-text comments written by readers on particular news articles. The investigation was carried out using standard methods of text mining and sentiment classification, namely clustering and classification. As anticipated, the results show that the number of readers and comments decreased towards the end of the period studied. In parallel with this reduction, the keyword densities also showed a downward trend. Although the number of positive comments per day was high, the keyword density was found to be inversely proportional to the sentiment counts, i.e. the numbers of positive, negative and neutral comments. From this observation, it can be concluded that the sentiment of a comment is independent of the number of keywords. From the calculation of the sentiment score for adjectives and adverbs, it was observed that the total sentiment score decreased significantly when an important event happened during the period studied (day 6, 10th December). Hence, a maximum or minimum in the sentiment score relates to an important event. Meanwhile, a reverse behavior was observed: the positive adjective sentiment score determines the sentiment of negative comments, and vice versa.

The major difficulty of temporal studies in opinion mining is that the number of comments or opinions made by readers decreases as time passes; as the news gets older, the number of comments drops significantly. However, the events happening during the period can be identified through temporal studies of the opinions. Through temporal analysis such as the one explained in this paper, valuable and vital information on issues that matter to the general public can be extracted from the opinions, and how they change over time can be established.

REFERENCES

[1] B. Liu. (2010). Handbook of Natural Language Processing. CRC Press, Taylor and Francis Group.
[2] A. Alves, C. Baptista, A. Firmino, M. Oliveira, H. Figueiredo. (2014). Temporal Analysis of Sentiment in Tweets: A Case Study with FIFA Confederations Cup in Brazil. Database and Expert Systems Applications, 8644, 81-88.
[3] J. Yang and J. Leskovec. (2011). Patterns of Temporal Variation in Online Media. Fourth ACM International Conference on Web Search and Data Mining (pp. 177-186). New York: ACM.
[4] T. Fukuhara, H. Nakagawa, T. Nishida. (2007). Understanding Sentiment of People from News Articles: Temporal Sentiment Analysis of Social Events. International Conference on Weblogs and Social Media (ICWSM). Colorado.
[5] B. O'Connor, R. Balasubramanyan, B. Routledge, N. Smith. (2010). From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series. Proceedings of the International AAAI Conference on Weblogs and Social Media (pp. 1-8). Washington DC.
[6] J. Zhang and H. Zhao. (2013). Improving Function Word Alignment with Frequency and Syntactic Information. Twenty-Third International Joint Conference on, (pp. 2211-2217). Beijing.
[7] R. Feldman and I. Dagan. (1998). Mining Text Using Keyword Distributions. Journal of Intelligent Information Systems, 10, 281-300.
[8] S. Piantadosi. (2014). Zipf's word frequency law in natural language: A critical review and future directions. Psychonomic Bulletin & Review, 1112-1130.
[9] N. Medagoda, S. Shanmuganathan. (2014, November). A framework for Opinion Mining and Sentiment Classification for morphologically rich languages. Auckland, New Zealand. Unpublished.
[10] F. Sebastiani and A. Esuli. (2006). SENTIWORDNET: A Publicly Available Lexical Resource for Opinion Mining. In Proceedings of the 5th Conference on Language Resources and Evaluation (pp. 417-422). Genoa, Italy.
[11] F. Benamara, C. Cesarano, D. Reforgiato. (2007). Sentiment Analysis: Adjectives and Adverbs are better than Adjectives Alone. International Conference on Weblogs and Social Media (ICWSM). Boulder, Colorado.
[12] C. Manning and H. Schütze. (1999). Text Categorization. In Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press.
