Monday, February 25, 2019

Emotions 10000001


When I started out looking around for a sentiment analysis package in R, I was at once impressed by sentimentr. According to its creator Tyler Rinker:
sentimentr is a response to my own needs with sentiment detection that were not addressed by the current R tools. My own polarity function in the qdap package is slower on larger data sets. It is a dictionary lookup approach that tries to incorporate weighting for valence shifters (negation and amplifiers/deamplifiers). Matthew Jockers created the syuzhet package that utilizes dictionary lookups for the Bing, NRC, and Afinn methods as well as a custom dictionary. He also utilizes a wrapper for the Stanford coreNLP which uses much more sophisticated analysis. Jocker’s dictionary methods are fast but are more prone to error in the case of valence shifters. …
And he went on to explain what valence shifters are and why they are important. He gave an example:
mytext <- c(
    'do you like it?  But I hate really bad dogs',
    'I am the best friend.',
    'Do you really like it?  I\'m not a fan'
)
mytext <- get_sentences(mytext)
sentiment(mytext)
##    element_id sentence_id word_count  sentiment
## 1:          1           1          4  0.2500000
## 2:          1           2          6 -1.8677359
## 3:          2           1          5  0.5813777
## 4:          3           1          5  0.4024922
## 5:          3           2          4  0.0000000
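To get a feel for what a valence shifter does, here is a toy sketch in base R. It is only an illustration: the four-word `lexicon`, the `negators` list and the `toy_polarity()` function are all made up for this example, and this is not the dictionary or the weighting scheme that sentimentr actually uses. It simply flips the sign of a polarized word when the word right before it is a negator.

```r
# Toy illustration only: a tiny made-up lexicon and negator list,
# NOT the dictionaries or the algorithm used by sentimentr.
lexicon  <- c(like = 1, best = 1, hate = -1, bad = -1)
negators <- c("not", "never", "no")

toy_polarity <- function(sentence) {
  # lower-case, then split on anything that is not a letter or apostrophe
  words <- strsplit(tolower(sentence), "[^a-z']+")[[1]]
  words <- words[nzchar(words)]
  score <- 0
  for (i in seq_along(words)) {
    value <- lexicon[words[i]]
    if (is.na(value)) next              # word is not in the lexicon
    # a negator directly before the word flips its polarity
    if (i > 1 && words[i - 1] %in% negators) value <- -value
    score <- score + value
  }
  unname(score)
}

toy_polarity("I like it")
toy_polarity("I do not like it")
toy_polarity("Sentiment analysis is not for dummies.")
```

A plain lookup without the negator check would give the first two sentences the same score. The third sentence contains a negator but no word from the toy lexicon, so there is nothing to flip and the score stays at zero.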
Inspired, I also devised some sentences to see how sentiment analysis works:
Sentiment analysis is for the fools.
Sentiment analysis is not for dummies.
Sentiment analysis is not for the masses.
Sentiment analysis is mediocre science.
Sentiment analysis is not the right tool.
They were saved to a text file, “sentiSen.txt”. Actually, I tested a number of sentiment analysis packages on them before I tried the emotion detection software reported in my last post.
txt <- readLines("sentiSen.txt")
# using sentimentr package
library(sentimentr)
s <- get_sentences(txt)
x <- sentiment(s)
Here I am learning to improve the look of my posts as well. In the last post, I was just using the default format of results (tables) given by R Markdown and tried to change the font size in the Blogger compose page. The results were disastrous, with missing columns and so on. I discovered that too late to do anything. Here I tried the “xtable” as well as the “stargazer” packages to format tables, but finally I settled on the “kableExtra” package described here.
library(knitr)
library(kableExtra)
Here is the result from the sentimentr package. To better understand the formatting options of the kableExtra package, you may like to consult the document mentioned above.
kable(x) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE, position = "left")
element_id  sentence_id  word_count  sentiment
1           1            6           -0.2041241
2           1            6            0.0000000
3           1            7            0.0000000
4           1            5           -0.3354102
5           1            7           -0.3023716
It is really nice to be able to add scrolling for a wide table.
# using SentimentAnalysis package
library(SentimentAnalysis)
y <- analyzeSentiment(txt)
kable(y) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                font_size = 11) %>%
  scroll_box(width = "600px")
WordCount SentimentGI NegativityGI PositivityGI SentimentHE NegativityHE PositivityHE SentimentLM NegativityLM PositivityLM RatioUncertaintyLM SentimentQDAP NegativityQDAP PositivityQDAP
3 -0.3333333 0.3333333 0.0000000 0 0 -0.3333333 0.3333333 0.00
3 0.0000000 0.0000000 0.0000000 0 0 0.0000000 0.0000000 0.00
3 0.0000000 0.0000000 0.0000000 0 0 0.0000000 0.0000000 0.00
4 -0.2500000 0.2500000 0.0000000 0 0 -0.2500000 0.2500000 0.00
4 0.2500000 0.0000000 0.2500000 0 0 0.2500000 0.0000000 0.25
# using syuzhet package
library(syuzhet)
nrc_data <- get_nrc_sentiment(txt)
kable(nrc_data) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                font_size = 11, full_width = FALSE, position = "left")
anger  anticipation  disgust  fear  joy  sadness  surprise  trust  negative  positive
0      0             0        0     0    0        0         0      0         0
0      0             0        0     0    0        0         0      0         0
0      0             0        0     0    0        0         0      0         0
0      0             0        0     0    0        0         0      1         0
0      0             0        0     0    0        0         0      0         0
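# using coreNLP package
library(coreNLP)
# build the path in pieces so the string does not contain a line break
libLoc <- file.path("C:/Users/MTNN/Documents/R/win-library/3.5",
                    "coreNLP/extdata/stanford-corenlp-full-2015-12-09")
initCoreNLP(libLoc)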
Get sentiments with coreNLP:
atxt <- annotateString(txt)
kable(getSentiment(atxt)) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE, position = "left")
id  sentimentValue  sentiment
1   NA              NA
2   NA              NA
3   NA              NA
4   NA              NA
5   NA              NA
View the annotation of the text by coreNLP:
print(atxt)
A CoreNLP Annotation: num. sentences: 5 num. tokens: 36
z <- getToken(atxt)
kable(z) %>%
  kable_styling(bootstrap_options = c("striped", "hover", 
                  "condensed"), font_size = 11) %>%
  scroll_box(height = "230px")
sentence  id  token      lemma      CharacterOffsetBegin  CharacterOffsetEnd  POS  NER  Speaker
1         1   Sentiment  sentiment  0                     9                   NN   O    PER0
1         2   analysis   analysis   10                    18                  NN   O    PER0
1         3   is         be         19                    21                  VBZ  O    PER0
1         4   for        for        22                    25                  IN   O    PER0
1         5   the        the        26                    29                  DT   O    PER0
1         6   fools      fool       30                    35                  NNS  O    PER0
1         7   .          .          35                    36                  .    O    PER0
2         1   Sentiment  sentiment  37                    46                  NN   O    PER0
2         2   analysis   analysis   47                    55                  NN   O    PER0
2         3   is         be         56                    58                  VBZ  O    PER0
2         4   not        not        59                    62                  RB   O    PER0
2         5   for        for        63                    66                  IN   O    PER0
2         6   dummies    dummy      67                    74                  NNS  O    PER0
2         7   .          .          74                    75                  .    O    PER0
3         1   Sentiment  sentiment  76                    85                  NN   O    PER0
3         2   analysis   analysis   86                    94                  NN   O    PER0
3         3   is         be         95                    97                  VBZ  O    PER0
3         4   not        not        98                    101                 RB   O    PER0
3         5   for        for        102                   105                 IN   O    PER0
3         6   the        the        106                   109                 DT   O    PER0
3         7   masses     mass       110                   116                 NNS  O    PER0
3         8   .          .          116                   117                 .    O    PER0
4         1   Sentiment  sentiment  118                   127                 NN   O    PER0
4         2   analysis   analysis   128                   136                 NN   O    PER0
4         3   is         be         137                   139                 VBZ  O    PER0
4         4   mediocre   mediocre   140                   148                 JJ   O    PER0
4         5   science    science    149                   156                 NN   O    PER0
4         6   .          .          156                   157                 .    O    PER0
5         1   Sentiment  sentiment  158                   167                 NN   O    PER0
5         2   analysis   analysis   168                   176                 NN   O    PER0
5         3   is         be         177                   179                 VBZ  O    PER0
5         4   not        not        180                   183                 RB   O    PER0
5         5   the        the        184                   187                 DT   O    PER0
5         6   right      right      188                   193                 JJ   O    PER0
5         7   tool       tool       194                   198                 NN   O    PER0
5         8   .          .          198                   199                 .    O    PER0
I found the results interesting, but quite surprising. I wrote all five of my sentences to express negative sentiment toward “Sentiment analysis”. Consider two of them:
Sentiment analysis is not for dummies.
Sentiment analysis is not for the masses.
I am for the dummies and the masses, because I identify myself with them. Therefore I would condemn sentiment analysis if it were not for the dummies or the masses. That was the sense I intended. But if it were a sentiment analysis guru uttering these two sentences, he would in fact be expressing positive sentiment for his favorite subject, am I correct? Well, I am enjoying all this.
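To see at a glance which of my sentences came out the way I intended, the scores can be put next to the sentences. The sketch below uses base R only; the scores are copied by hand from the sentimentr table earlier in this post, and the `results` data frame and its `verdict` column are names I made up for this illustration.

```r
# Sentences from sentiSen.txt, with the sentimentr scores copied
# by hand from the table earlier in this post.
results <- data.frame(
  sentence = c(
    "Sentiment analysis is for the fools.",
    "Sentiment analysis is not for dummies.",
    "Sentiment analysis is not for the masses.",
    "Sentiment analysis is mediocre science.",
    "Sentiment analysis is not the right tool."
  ),
  sentimentr = c(-0.2041241, 0, 0, -0.3354102, -0.3023716),
  stringsAsFactors = FALSE
)

# sign() gives -1, 0 or 1; adding 2 turns that into an index 1, 2 or 3
results$verdict <- c("negative", "neutral", "positive")[sign(results$sentimentr) + 2]
print(results)

# sentences I meant as negative that were not scored as negative
results$sentence[results$verdict != "negative"]
```

The last line picks out exactly the two "not for" sentences, which sentimentr scored as neutral.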
