Eduction Sentiment Grammar Files

Sentiment Analysis grammars are available in the following languages:

  • Arabic
  • Chinese
  • Czech
  • Dutch
  • English
  • French
  • German
  • Italian
  • Portuguese
  • Russian
  • Spanish
  • Turkish

Each of these languages also supports component extraction and user modification.

The Sentiment Analysis Grammars

Eduction matches input data against patterns, called grammars, that are defined with regular expressions.

The sentiment analysis grammar first defines dictionaries for the parts of speech. There are separate dictionaries for positive and negative words and, where appropriate, for other categories of words that affect the sentiment in different ways.

These dictionaries are combined to form simple phrases that convey positive or negative sentiments. Finally, these phrases are padded, usually with other phrases, to form various patterns for the final entities, which match strings from the text that express positive or negative sentiment.
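The layering described above (dictionaries, then simple phrases, then padded entity patterns) can be sketched in a few lines of Python. This is only an illustration of the idea using Python's re module. It is not Eduction grammar syntax, and the word lists, the padding verbs, and the names alternation, simple_phrase, entity_pattern, and find_sentiment are all assumptions made for this example.

```python
import re

# Step 1: dictionaries. These word lists are hypothetical stand-ins for the
# dictionaries that a real grammar file defines.
POSITIVE_WORDS = ["good", "great", "excellent"]
NEGATIVE_WORDS = ["bad", "poor", "terrible"]
INTENSIFIERS = ["very", "really", "extremely"]


def alternation(words):
    """Build a non-capturing regex alternation from a word list, longest first."""
    ordered = sorted(words, key=len, reverse=True)
    return "(?:" + "|".join(re.escape(w) for w in ordered) + ")"


def simple_phrase(sentiment_words):
    """Step 2: a simple phrase is an optional intensifier plus a sentiment word."""
    return rf"(?:{alternation(INTENSIFIERS)}\s+)?{alternation(sentiment_words)}"


def entity_pattern(sentiment_words):
    """Step 3: pad the phrase with context (here, a form of 'to be')."""
    return re.compile(
        rf"\b(?:is|was|are|were)\s+{simple_phrase(sentiment_words)}\b",
        re.IGNORECASE,
    )


# The final entities, one per sentiment.
ENTITIES = {
    "positive": entity_pattern(POSITIVE_WORDS),
    "negative": entity_pattern(NEGATIVE_WORDS),
}


def find_sentiment(text):
    """Return (entity name, matched string) pairs, grouped by entity."""
    results = []
    for name, pattern in ENTITIES.items():
        for match in pattern.finditer(text):
            results.append((name, match.group(0)))
    return results
```

For example, find_sentiment("The food was really good but the service was terrible.") returns one positive match ("was really good") and one negative match ("was terrible").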

The grammar files are designed to be used out of the box. You only need to load the appropriate grammar file and, optionally, choose the entities (usually positive or negative) to match.

Extend the Sentiment Analysis Grammar

The grammar files generally contain sufficient information to work with a wide range of data, from formal reports to user reviews and social media feeds. However, the recall (the percentage of the matches that should in theory be returned that are actually returned) can be low for some input data. Also, some examples might convey a different sentiment depending on your viewpoint.

For example, the sentence “Company A is much better than Company B” might convey a positive or negative sentiment depending on whether you side with Company A or Company B.

In these situations, you can improve the recall or adjust the sentiment analysis by extending the grammar.
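To judge whether an extension is worthwhile, it can help to measure recall on a small sample that you have labeled by hand. The following is a minimal sketch of that calculation. The function name recall and the sample strings are hypothetical, and the code assumes that a match counts as returned only if the returned string is identical to the expected string.

```python
def recall(returned, expected):
    """Percentage of the expected matches that were actually returned."""
    expected = set(expected)
    if not expected:
        raise ValueError("expected must contain at least one match")
    found = expected & set(returned)
    return 100.0 * len(found) / len(expected)


# Hand-labeled matches that should be returned in theory (hypothetical data).
expected_matches = ["great service", "slow delivery", "rude staff", "love it"]

# Matches that were actually returned for the same input (hypothetical data).
returned_matches = ["great service", "slow delivery"]

sample_recall = recall(returned_matches, expected_matches)
```

Here two of the four expected matches are returned, so sample_recall is 50.0.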

You can extend the grammar by adding entries to the appropriate dictionaries in the sentiment grammar file. For example, if you side with Company A, you can add Company A to the positive list (this is possible only for some of the languages).

NOTE: There are slight variations in the grammar files of different languages, so this method of extending the grammar does not apply to all languages.
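The effect of adding an entry to a dictionary can be illustrated with a small Python sketch. It does not use the Eduction grammar file format. The dictionary contents and the names build_pattern, extend, and classify are assumptions made for this example only.

```python
import re

# Hypothetical dictionaries; the real grammar files define their own.
BASE_DICTIONARIES = {
    "positive": ["better", "excellent"],
    "negative": ["worse", "terrible"],
}


def build_pattern(words):
    """Compile a whole-word, case-insensitive pattern from a word list."""
    ordered = sorted(words, key=len, reverse=True)
    alternatives = "|".join(re.escape(w) for w in ordered)
    return re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)


def extend(dictionaries, name, new_words):
    """Return a copy of the dictionaries with new_words added to one list."""
    extended = {key: list(words) for key, words in dictionaries.items()}
    extended[name].extend(new_words)
    return extended


def classify(text, dictionaries):
    """Return the sorted names of the dictionaries that match the text."""
    return sorted(
        name
        for name, words in dictionaries.items()
        if build_pattern(words).search(text)
    )


text = "We switched to Company A last year."
before = classify(text, BASE_DICTIONARIES)

# Take the viewpoint of Company A: treat its name as a positive term.
user_dictionaries = extend(BASE_DICTIONARIES, "positive", ["Company A"])
after = classify(text, user_dictionaries)
```

With the base dictionaries the sample text produces no match, so before is an empty list. After the extension, after is ["positive"]. Working on a copy keeps the original dictionaries unchanged, which mirrors keeping user modifications separate from the shipped word lists.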