Text Analytics involves the examination of unstructured and semi-structured text data to extract meaningful insights that inform decision-making and enhance customer satisfaction. By discerning hidden patterns, sentiments, and correlations within extensive textual data sets, Text Analytics empowers businesses to make well-informed choices.
In Do You Know, Text Analytics employs a blend of Natural Language Processing (NLP) techniques and Machine Learning to derive insights. The resulting analyses can be visualized through automatically generated charts in the Do You Know interface.
Four distinct types of analysis are supported within the Text Analytics module to suit diverse user needs:
- N-gram Analysis: This feature identifies the most frequent words within the text, offering a broad overview. It also counts how often sequences of consecutive words occur, such as bi-grams (two consecutive words) or tri-grams (three consecutive words).
- Sentiment Analysis: This analysis determines whether each text item expresses a positive, neutral, or negative sentiment. Each text item is tagged with its respective sentiment.
- Topic Modeling: Utilizing sophisticated Natural Language Processing, this technique automatically groups words and similar expressions that best define sets of documents. It uncovers various topics or ideas within single or multiple text documents.
- Text Classification: This feature facilitates the development of machine learning models using text data. Subsequently, these models can be leveraged to make predictions based on new textual inputs.
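To make the N-gram idea from the list above concrete, here is a minimal Python sketch of how uni-gram, bi-gram, and tri-gram frequencies can be counted. It is an illustration only and does not reflect how Do You Know implements the feature; the `tokenize` and `ngram_counts` helpers and the sample `reviews` are assumptions made for this example.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def ngram_counts(texts, n):
    """Count every run of n consecutive words across all text items."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        # Zipping n staggered views of the token list yields the n-grams.
        # N-grams never span two different text items.
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts


# Hypothetical sample data
reviews = [
    "Great service and fast delivery",
    "Fast delivery but poor packaging",
    "Great service, great price",
]

unigrams = ngram_counts(reviews, 1)   # single words
bigrams = ngram_counts(reviews, 2)    # pairs of consecutive words
trigrams = ngram_counts(reviews, 3)   # triplets of consecutive words

print(bigrams.most_common(2))
```

Each key is a tuple of words, so `bigrams[("fast", "delivery")]` gives the number of times that pair appears in the sample.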
Steps to Use Text Analytics
Step 1: After accessing “Do You Know”, select “Classification”.
Step 2: Now, click on “Create New Insight-Trend”.
Step 3: Choose a Schema and proceed by clicking on “Next.”
Note: The schema is the dataset to be analyzed. If no schema exists, create one, making sure the prerequisites (a KPI, a Date, and an Attribute) are met.
Step 4: Select the following:
- Analysis: Allows users to define the desired text analysis type. Let’s opt for “N-gram Analysis.” Next, in the second drop-down menu, selecting “Uni-gram” will display the most common single words. Choosing “Bi-gram” will present the most common pairs of consecutive words, while “Tri-gram” will exhibit the most frequent triplets of consecutive words, and so forth.
- Select Group: Next, users need to choose the grouping criteria, which can be a categorical column in the dataset holding a few distinct values.
When selected, the N-gram analysis will be conducted for each category or value within the chosen column. If this field remains unspecified, the N-gram analysis will be carried out on the complete text dataset.
For the time being, let’s leave this field blank.
- Select unique identifier(s): The chosen variable(s) will not be utilized for text analysis; rather, they will serve to uniquely identify rows within the dataset.
- Select input attribute: This is the most important input, because the column selected here contains the text data used for the analysis.
- Do you want to add filters?: Optionally add filters.
Step 5: Users can tailor the insights narrative, outlining all the variables utilized in crafting the insight. Then, click on “Save”.
Step 6: Name the insight for future access (default suggestion provided) and save it.
Step 7: A new window appears; click “Execute Now” to generate insights.
If “Bi-gram” was selected in Step 4, executing the configuration displays a horizontal bar chart of the top five most frequent bi-grams (two consecutive words) present within the dataset.
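The configuration in Step 4 can be approximated outside the tool with the sketch below, which computes the five most frequent bi-grams either over the whole dataset or per group. This is a rough analogue, not the product's own logic; the column names `ticket_id`, `region`, and `comment`, the sample rows, and the `top_bigrams` helper are all hypothetical.

```python
from collections import Counter, defaultdict
import re

# Hypothetical rows: "ticket_id" plays the unique identifier,
# "region" the optional group, and "comment" the input attribute.
rows = [
    {"ticket_id": 1, "region": "North", "comment": "Late delivery and damaged box"},
    {"ticket_id": 2, "region": "North", "comment": "Late delivery again"},
    {"ticket_id": 3, "region": "South", "comment": "Friendly support team"},
    {"ticket_id": 4, "region": "South", "comment": "Support team solved it fast"},
]


def bigrams(text):
    """Return the pairs of consecutive words in one text item."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return list(zip(tokens, tokens[1:]))


def top_bigrams(rows, text_col, group_col=None, k=5):
    """Return {group: [(bigram, count), ...]} with the k most frequent bi-grams.

    When group_col is None, every row falls into a single group keyed "all",
    which mirrors leaving the group field blank.
    """
    counts = defaultdict(Counter)
    for row in rows:
        key = row[group_col] if group_col else "all"
        counts[key].update(bigrams(row[text_col]))
    return {key: counter.most_common(k) for key, counter in counts.items()}


overall = top_bigrams(rows, "comment")                        # group left blank
by_region = top_bigrams(rows, "comment", group_col="region")  # one result per region

print(overall["all"])
print(by_region["North"])
```

The unique identifier is deliberately not passed to the counting step: as in the tool, it only identifies rows and plays no part in the text analysis.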
Steps to Perform Sentiment Analysis
Step 1: After choosing the schema, select “Sentiment Analysis” and fill in all the required details. Click “Next” to proceed further.
Step 2: Users can tailor the insights narrative, outlining all the variables utilized in crafting the insight. Then, click on “Save”.
Step 3: Name the insight for future access (default suggestion provided) and save it.
Step 4: A new window appears; click “Execute Now” to generate insights.
After execution of the configuration, a bar chart will appear with bars representing the count of records or text items carrying positive, neutral, and negative sentiments.
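The chart described above is a count of text items per sentiment tag. The sketch below reproduces that aggregation with a deliberately simple word-list scorer standing in for the sentiment model. The actual model used by Do You Know is not described here, so the `POSITIVE` and `NEGATIVE` word lists, the `tag_sentiment` function, and the sample texts are illustrative assumptions only.

```python
from collections import Counter
import re

# Tiny hand-made word lists, used only for illustration.
POSITIVE = {"good", "great", "excellent", "love", "fast", "friendly"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "slow", "broken"}


def tag_sentiment(text):
    """Tag one text item as positive, neutral, or negative."""
    tokens = re.findall(r"[a-z']+", text.lower())
    # Score = number of positive words minus number of negative words.
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical sample data
texts = [
    "Great product and fast shipping",
    "Terrible support, slow response",
    "The parcel arrived on Tuesday",
    "Love the design",
]

tags = [tag_sentiment(t) for t in texts]   # one tag per text item
sentiment_counts = Counter(tags)           # the values behind the bar chart

print(sentiment_counts)
```

A word-list scorer like this misses negation and context ("not good" would count as positive), which is why production systems typically rely on trained models instead.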