Perform Sentiment Analysis on Short Comments

The standard sentiment analysis grammars are designed for high precision. As a result, for some sources of short comments, such as YouTube comments, they find no positive or negative matches in some documents even though sentiment is clearly expressed.

If recall with the full sentiment_eng.ecr grammar file is too low, and your documents are generally short comments, use sentiment_basic_eng.ecr to extract additional matches. This grammar contains carefully selected lists of positive and negative terms that help determine the sentiment of a document in which sentiment_eng.ecr found no matches.
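The fallback described above can be sketched in a few lines of Python. This is only an illustration of the decision logic, not part of the product: the function name, the representation of matches as lists of polarity strings, and the majority-vote rule are all assumptions made for the example.

```python
from collections import Counter


def document_polarity(full_matches, basic_matches):
    """Return 'positive', 'negative', 'mixed', or 'none' for one document.

    Each argument is a list of polarity strings ('positive' or 'negative').
    This is a hypothetical simplification of the matches that
    sentiment_eng.ecr and sentiment_basic_eng.ecr return.
    """
    # Prefer the high-precision grammar; use the basic term lists only
    # when the full grammar found no matches at all.
    chosen = full_matches if full_matches else basic_matches
    if not chosen:
        return "none"
    # Majority vote is an assumption made for this sketch.
    counts = Counter(chosen)
    if counts["positive"] > counts["negative"]:
        return "positive"
    if counts["negative"] > counts["positive"]:
        return "negative"
    return "mixed"
```

With this sketch, a comment in which only the basic grammar matched a positive term is classified as positive, while any match from the full grammar takes priority over the basic matches.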

sentiment_basic_eng.ecr contains terms in title case, but research shows that for most data these impair precision, so they are given a lower score. OpenText recommends that you set EntityMinScoreN to 0.4 to filter out these terms unless you need them.
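The effect of the minimum score can be pictured with a short Python sketch. It assumes that matches scoring below the threshold are discarded. The function name, the (text, score) tuple shape, and the example scores are hypothetical; the actual scores that the grammar assigns to title-case terms are not stated here.

```python
def filter_by_min_score(matches, min_score=0.4):
    """Keep only matches whose score reaches min_score.

    Each match is a (text, score) tuple; this shape is hypothetical.
    """
    return [(text, score) for text, score in matches if score >= min_score]


# Hypothetical scores: the real values assigned by the grammar may differ.
matches = [("awesome", 1.0), ("Awesome", 0.3)]
kept = filter_by_min_score(matches)
```

In this example only the lowercase match survives the default threshold. Passing a lower min_score keeps the title-case match as well.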

sentiment_basic_eng.ecr does not expose TOPIC or SENTIMENT components, and does not use scores to reflect strength or reliability of polarity. The following example shows the recommended settings to add to a standard sentiment configuration:

[Eduction]
ResourceFiles=grammars/sentiment_eng.ecr,grammars/sentiment_basic_eng.ecr
// optional further layer of analysis for very short documents:
Entity2=sentiment/basic_positive/eng
Entity3=sentiment/basic_negative/eng
EntityField2=BASIC_POSITIVE_VIBE
EntityField3=BASIC_NEGATIVE_VIBE
// remove these settings to include basic matches in title case. This is not recommended because on most data it decreases precision:
EntityMinScore2=0.4
EntityMinScore3=0.4