Eduction Sentiment Grammar Files

Eduction Sentiment Analysis allows you to find whether text has positive, negative, or neutral sentiment. For example, you can use it to determine whether users of a particular product or service are satisfied, based on an automated analysis of reviews.

The following table lists the languages that support sentiment analysis, with the name of the standard sentiment grammar and the user modification file for each. Each of these languages also supports component extraction and user modification.

Language    Sentiment Grammar  User Modification File
Arabic      sentiment_ara.ecr  sentiment_user_ara.xml
Chinese     sentiment_chi.ecr  sentiment_user_chi.xml
Czech       sentiment_cze.ecr  sentiment_user_cze.xml
Dutch       sentiment_dut.ecr  sentiment_user_dutch.xml
English     sentiment_eng.ecr  sentiment_user_eng.xml
French      sentiment_fre.ecr  sentiment_user_fre.xml
German      sentiment_ger.ecr  sentiment_user_ger.xml
Italian     sentiment_ita.ecr  sentiment_user_ita.xml
Polish      sentiment_pol.ecr  sentiment_user_pol.xml
Portuguese  sentiment_por.ecr  sentiment_user_por.xml
Russian     sentiment_rus.ecr  sentiment_user_rus.xml
Spanish     sentiment_spa.ecr  sentiment_user_spa.xml
Turkish     sentiment_tur.ecr  sentiment_user_tur.xml

The Sentiment Analysis Grammars

Eduction matches input data against patterns, called grammars, that are defined with regular expressions.

The sentiment analysis grammar first defines dictionaries with the parts of speech. There are different dictionaries for positive and negative words, and other categories that describe different effects on the sentiment, where appropriate.

These dictionaries combine to form simple phrases that convey positive or negative sentiments. Finally, these phrases are padded, usually with other phrases, to form various patterns for the final entities, which match strings from the text that express positive or negative sentiment.
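The layering described above (dictionaries, then simple phrases, then padded entity patterns) can be illustrated with a minimal Python sketch. This is not the Eduction grammar format and does not use any Eduction API; the word lists, the padding pattern, and the function names are all hypothetical and chosen only to show the idea.

```python
import re

# Hypothetical toy dictionaries; real grammars are far larger and language-specific.
POSITIVE_WORDS = ["good", "great", "excellent"]
NEGATIVE_WORDS = ["bad", "poor", "terrible"]
INTENSIFIERS = ["very", "really", "extremely"]


def alternation(words):
    """Build a non-capturing regex alternation from a word list."""
    return "(?:" + "|".join(re.escape(w) for w in words) + ")"


# Simple phrases: an optional intensifier followed by a sentiment word.
INTENSIFIER = alternation(INTENSIFIERS)
POSITIVE_PHRASE = rf"(?:{INTENSIFIER}\s+)?{alternation(POSITIVE_WORDS)}"
NEGATIVE_PHRASE = rf"(?:{INTENSIFIER}\s+)?{alternation(NEGATIVE_WORDS)}"

# Final entities: phrases padded with context (here, a short subject and verb).
PADDING = r"\b(?:the\s+)?\w+\s+(?:is|was)\s+"
ENTITIES = {
    "positive": re.compile(PADDING + POSITIVE_PHRASE + r"\b", re.IGNORECASE),
    "negative": re.compile(PADDING + NEGATIVE_PHRASE + r"\b", re.IGNORECASE),
}


def find_sentiment(text):
    """Return (entity name, matched text) pairs in order of appearance."""
    matches = []
    for name, pattern in ENTITIES.items():
        for m in pattern.finditer(text):
            matches.append((m.start(), name, m.group(0)))
    return [(name, matched) for _, name, matched in sorted(matches)]
```

For example, find_sentiment("The service was really good but the food was terrible.") returns one positive match followed by one negative match, each covering the padded phrase rather than only the sentiment word.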

The grammar files are designed to be used out of the box. You just need to load the appropriate grammar file, and optionally choose the entities (usually positive or negative) to match.

The sentiment grammar files have lite versions. The lite versions are identical to the full versions in most respects, but they do not support components or user modification. They can process data up to twice as fast as the full versions, depending on the language.

OpenText recommends that you use the lite versions except when you need to use components or modify the built-in dictionaries.

The lite grammars have the same name as the full version, with _lite after the language. For example, the file name of the Chinese sentiment grammar file is sentiment_chi.ecr, and the file name of the lite version is sentiment_chi_lite.ecr.
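The naming rule and the recommendation above can be sketched as a small Python helper. The function names and the boolean parameter are hypothetical and are not part of Eduction; the sketch only derives file names as strings.

```python
from pathlib import PurePath


def lite_grammar_name(full_name: str) -> str:
    """Derive the lite grammar file name from the full grammar file name."""
    path = PurePath(full_name)
    if path.suffix != ".ecr":
        raise ValueError(f"expected an .ecr grammar file, got {full_name!r}")
    if path.stem.endswith("_lite"):
        return path.name  # already a lite grammar
    # Insert _lite after the language code, before the extension.
    return f"{path.stem}_lite{path.suffix}"


def choose_grammar(full_name: str, needs_components_or_user_mods: bool) -> str:
    """Pick the full grammar only when components or user modification are needed."""
    if needs_components_or_user_mods:
        return full_name
    return lite_grammar_name(full_name)
```

Calling lite_grammar_name("sentiment_chi.ecr") gives sentiment_chi_lite.ecr, which matches the Chinese example in the text.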

Extend the Sentiment Analysis Grammar

The grammar files generally contain sufficient information to work with a wide range of data, from formal reports to user reviews and social media feeds. However, the recall (the percentage of the matches that should be returned in theory that are actually returned) can be low for some input data. Also, some examples might convey a different sentiment depending on your viewpoint. For example:

The phrase "Company A is much better than Company B" might convey a positive or negative sentiment depending on whether you are with Company A or Company B.

In these situations, you can improve the recall or adjust the sentiment analysis by extending the grammar.
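To judge whether an extension has helped, you can measure recall on a sample that you have labeled by hand. The following is a minimal Python sketch of that calculation; the function name and the sample phrases are hypothetical, and the sets stand in for whatever match output you collect.

```python
def recall(returned_matches: set, expected_matches: set) -> float:
    """Fraction of the expected matches that were actually returned."""
    if not expected_matches:
        raise ValueError("expected_matches must not be empty")
    true_positives = returned_matches & expected_matches
    return len(true_positives) / len(expected_matches)


# Hypothetical hand-labeled sample: four phrases should match, two were returned.
expected = {"great value", "poor battery life", "love the screen", "too slow"}
returned = {"great value", "too slow"}
sample_recall = recall(returned, expected)  # 2 of 4 expected matches, so 0.5
```

Running the same measurement before and after you extend the grammar shows whether the change improved recall on your data.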

You can extend the grammar by adding to the appropriate dictionaries in the sentiment grammar file. For example, if you are on the side of Company A, you can add Company A to the positive list.

NOTE: There are slight variations in the grammar files of different languages, so this does not apply to all languages.
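The effect of adding a term to the positive list can be illustrated with a toy Python sketch. It does not read or write the grammar or user modification files, whose format is not shown here; the classifier, the term sets, and the sample sentence are all hypothetical.

```python
def classify(text: str, positive: set, negative: set) -> str:
    """Toy classifier: compare how many positive and negative terms occur in the text."""
    lowered = text.lower()
    pos_hits = sum(term.lower() in lowered for term in positive)
    neg_hits = sum(term.lower() in lowered for term in negative)
    if pos_hits > neg_hits:
        return "positive"
    if neg_hits > pos_hits:
        return "negative"
    return "neutral"


# Hypothetical stand-ins for the built-in dictionaries.
positive_terms = {"excellent", "reliable"}
negative_terms = {"faulty", "slow"}
review = "Company A won the contract."

before = classify(review, positive_terms, negative_terms)

# Extend the positive list from Company A's point of view.
extended_positive = positive_terms | {"Company A"}
after = classify(review, extended_positive, negative_terms)
```

Here the sentence is neutral with the original term sets and positive once "Company A" is in the positive list, which is the kind of viewpoint-dependent adjustment described above.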

For more information, see Extend Grammars.