User Guide
 

Verbatims - Text Analysis

 
Text Analysis in Halo Reports is powered by R.
 
There are two types of text analysis that can be used with verbatim questions: word analysis and sentiment analysis.
 
 
All text analysis options are accessible in the verbatim question menu.
 
There are 3 word options and 2 sentiment options.
 
 
After selecting one of the text analysis options, Halo Reports will automatically create a User-Defined Question (UDQ) that consists of the selected type of text analysis.
The UDQ will appear in the same section as the verbatim question that it was created from.
 
With Sentiment Category analysis, a category question is created that consists of 3 responses:
 
In the example above, a sentiment category count is shown for each make.
 
With Sentiment Score analysis, a numeric question is created that will run as a mean.
This will provide the sentiment score as a numeric value.
 
In the example above, a sentiment score is shown for each make.
 
Interpreting a Sentiment Score:
It is rare for the sentiment score to equal 0.
A score that falls between -0.05 and 0.05 can be considered neutral.
Sentiment scores below -0.05 are considered negative.
Sentiment scores above 0.05 are considered positive.
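 
The thresholds above can be expressed as a small function. The following is a minimal Python sketch for illustration only; it is not how Halo Reports computes or categorizes sentiment (that is done in R). The function name classify_sentiment and the sample scores are hypothetical, and treating scores exactly equal to -0.05 or 0.05 as neutral is an assumption.

```python
from statistics import mean

NEUTRAL_BAND = 0.05  # threshold taken from the guide


def classify_sentiment(score: float) -> str:
    """Map a sentiment score to a category using the guide's thresholds."""
    if score > NEUTRAL_BAND:
        return "positive"
    if score < -NEUTRAL_BAND:
        return "negative"
    return "neutral"  # assumption: boundary values count as neutral


# Hypothetical per-response scores for one make; a Sentiment Score
# question runs as a mean, so the scores are averaged first.
scores = [0.42, -0.10, 0.03, 0.25]
average = mean(scores)
print(round(average, 2), classify_sentiment(average))
```

Averaging first and classifying second mirrors how a Sentiment Score question is read: a single mean value per group, interpreted against the neutral band.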
 
 
Text Analysis using Extract Words
 
This option performs a series of text-processing steps before the words are counted.
 
The stages are:
 
 
Text Analysis using Extract Words (Converted into Common Core)
 
When the “Exact Words (Converted into Common Core)” option is used, the same text processing is performed as described for the Exact Words option, and in addition the suffixes (endings) of the words are removed.
 
As a result, the words are reduced into their “common core” form.
 
“Common Core” is mTab Halo’s own term, used in place of the technical term ‘stemming.’
 
Stemming is the process of reducing words to their stem, base, or root form.
A stemming algorithm might reduce the words fishing, fished, and fisher to the stem fish.
The reduced stem is not required to be an actual word.
For example, during the stemming process, argue, argued, argues, and arguing may all be reduced to the stem argu.
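 
The idea can be sketched with a toy suffix stripper. This is a deliberately simplified Python illustration, not the stemming algorithm Halo Reports uses and not a standard algorithm such as Porter's; the rule list SUFFIXES, the minimum stem length, and the function name toy_stem are all hypothetical and chosen only to reproduce the examples above.

```python
# Toy suffix-stripping stemmer (illustration only).
SUFFIXES = ("ing", "ed", "er", "es", "e", "s")  # hypothetical rules, longest first
MIN_STEM_LENGTH = 3  # hypothetical guard against over-stripping short words


def toy_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least MIN_STEM_LENGTH letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LENGTH:
            return word[: -len(suffix)]
    return word


for w in ["fishing", "fished", "fisher", "argue", "argued", "argues", "arguing"]:
    print(w, "->", toy_stem(w))
```

A real stemmer applies many more rules and exceptions, but the effect is the same: several surface forms collapse into one stem, which need not be a dictionary word.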
 
The term “Common Core” is used instead of stemming because not all users may be familiar with the technical term, so more user-friendly terminology was chosen.
 
 
Text Analysis using Extract Words (Converted into Dictionary Form)
 
“Dictionary Form” is mTab Halo’s own term for the technical term ‘lemmatization.’
 
With this option, words are converted into their dictionary forms (or lemmas), so that the different forms of a word can be grouped and analyzed as a single item.
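 
A minimal Python sketch of the idea follows. It assumes a tiny hand-written lookup table; real lemmatizers rely on full dictionaries and grammatical context, and this is not the method Halo Reports uses. The table LEMMAS, the function to_lemma, and the sample words are hypothetical.

```python
from collections import Counter

# Hypothetical mini-dictionary mapping inflected forms to their lemmas.
LEMMAS = {
    "families": "family",
    "cars": "car",
    "drove": "drive",
    "driving": "drive",
    "drives": "drive",
}


def to_lemma(word: str) -> str:
    """Return the dictionary form of a word, or the word itself if unknown."""
    word = word.lower()
    return LEMMAS.get(word, word)


# Hypothetical words extracted from verbatim responses.
words = ["Family", "families", "drove", "driving", "car", "cars"]
counts = Counter(to_lemma(w) for w in words)
print(counts)
```

Because the lookup returns real words, an irregular form such as "drove" is grouped with "driving" under "drive", which suffix stripping alone could not do.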
 
Using Common Core and/or Dictionary Form
 
There is always a trade-off when choosing between stemming (Common Core) and lemmatization (Dictionary Form).
With Common Core, more word variants tend to be grouped together, so the final number of items is usually smaller, but the resulting stems may not always be actual words (e.g., “famil” is a stem for “family” and “families”).
 
With Dictionary Form (lemmatization), the groupings are more likely to be proper, semantically meaningful words, but the final number of words may be larger.
Using the Dictionary Form option versus the Common Core option often results in a longer list of words in the Halo Reports output.