What is Sentiment Analysis? Definition, Types, Algorithms

10 min read · May 5, 2020

Every person has some kind of attitude towards the things they experience. We may like the “handwritten notes” feature in a smartphone but can’t stand the whole noise meter shebang. And then there is the “facelock” thing that really puzzles us. It is a natural thing…

All this says something about the object in question. And since a product is used by many people, there are many such opinions. Combined, these opinions paint a distinct picture of how a particular product is perceived.

That’s sentiment analysis in a nutshell.

In this article, we will look at what sentiment analysis is and how it can be used for the benefit of your company.

What is Sentiment Analysis? / Sentiment Analysis Definition

The term “sentiment analysis” refers to a field of Natural Language Processing dedicated to the exploration of the subjective opinions expressed in different information sources regarding a specific subject.

In more strict business terms, it can be summarized as:

Sentiment Analysis is a set of tools to identify and extract opinions and use them for the benefit of the business operation.

Such algorithms dig deep into the text and find the parts that point to the attitude towards the product in general or one of its specific elements.

In other words, opinion mining and sentiment analysis is an opportunity to explore the mindset of the audience and study the state of the product from the opposite point of view. This makes sentiment analysis a great tool for:

expanded product analytics,
market research,
reputation management,
precision targeting,
marketing analysis,
public relations,
product reviews,
net promoter scoring,
product feedback,
customer service.

How does Sentiment Analysis work?

Sentiment analysis is predominantly a classification task: it aims to find an opinionated point of view, determine its disposition, and highlight the information of particular interest in the process.

What is an opinion in sentiment analysis? You all know the general definition of opinion: “a view or judgment formed about something, not necessarily based on fact or knowledge.”

Well, from the data science standpoint, an opinion is much more than this:

On one hand, it is a unique subjective assessment of something based on personal empirical experience. It is partially rooted in objective facts and partially ruled by subjective emotions.
On the other hand, an opinion can be interpreted as a sort of dimension in the data regarding a certain subject. It is a set of signifiers that in combination present a point of view, i.e., a dimension of the particular subject. Think of it as one of the rings of Saturn.

With that in mind, Sentiment Analysis is applied for the following operations:

Find and extract the opinionated data (AKA sentiment data) on a specific platform (customer support, reviews, etc.)
Determine its polarity (positive or negative)
Define the subject matter (what is being talked about in general and specifically)
Identify the opinion holder (on its own and in correlation with the existing audience segments)

Depending on the purpose, a sentiment analysis algorithm can be applied at the following scopes:

Document level: obtains the sentiment of the entire text.
Sentence level: obtains the sentiment of a single sentence.
Sub-sentence level: obtains the sentiment of sub-expressions within a sentence.
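
To make the three scopes more tangible, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the two tiny word sets, the toy scorer `score_text`, and the naive splitting rules (sentences end at ".", "!" or "?"; sub-sentences are split on commas, semicolons and the word "but"). A production system would use a proper sentence splitter and a trained model.

```python
import re

# Hypothetical toy lexicons, for illustration only.
POSITIVE = {"great", "love", "fast", "good"}
NEGATIVE = {"poor", "slow", "bad", "hate"}

def score_text(text: str) -> int:
    """Positive-word count minus negative-word count."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def split_sentences(document: str) -> list[str]:
    """Naive sentence splitter: break after '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]

def split_clauses(sentence: str) -> list[str]:
    """Naive sub-sentence splitter: break on commas, semicolons and 'but'."""
    parts = re.split(r",|;|\bbut\b", sentence)
    return [p.strip(" .!?") for p in parts if p.strip(" .!?")]

review = "I love the camera, but the battery is poor. The screen is great."

# Document level: one score for the whole text.
document_score = score_text(review)

# Sentence level: one score per sentence.
sentence_scores = [(s, score_text(s)) for s in split_sentences(review)]

# Sub-sentence level: one score per clause of the first sentence.
first_sentence = split_sentences(review)[0]
clause_scores = [(c, score_text(c)) for c in split_clauses(first_sentence)]
```

In this toy run the first sentence scores 0 as a whole, while its two clauses score 1 and -1. The mixed opinion only becomes visible at the sub-sentence level.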

Given its subjective nature, mining an opinion is a rather tricky affair. Opinions differ. Some are more valuable than others. There are four subcategories that further characterize an opinion.

Direct opinion: one that states something directly. For example, “the responsiveness of the buttons in application X is poor”. Here you have a legit point.
Comparative opinion: one where X is compared with Y on certain criteria. For example, “the responsiveness of the button in application X is worse than in application Y”. In addition to being an insight into your product, it also serves as micro competitive research.
Explicit opinion: one where everything is clearly stated. For example, “this chair is rocking”.
Implicit opinion: a trickier case, where an opinion is implied but not clearly stated. For example, “the app started lagging in two days”. Note that implicit opinions may also contain idioms and metaphors, which complicates the sentiment analysis process.

Why Does Sentiment Analysis Matter?

Sentiment Analysis deals with the perception of the product and understanding of the market through the lens of sentiment data.

The thing is, there are many sources of public and private information from which you can gain insight into the customer’s perception of the product and the general market situation. To name a few:

Customer support correspondence (regarding your product)
User-generated product reviews
Professional product reviews (as in The Verge or Wired)
Social media activity
General and special-purpose forums

Customer Sentiment Analysis can help make sense of these hoards of data and transform them into:

A clearly defined view of what certain customer segments think, about the product or in general.
A deep dive into the state of the market from the consumer’s standpoint.

In both cases, it is an influential factor in formulating and elaborating the value proposition for a specific audience segment.

Let’s go back to the beginning of the section and take a closer look at how sentiment analysis helps with understanding the product and the market:

Product perception is one of the key performance indicators, and the right kind of perception is strategically important for the further evolution of the product. Sentiment tracking is often a decisive factor in choosing the direction of marketing efforts and business development, so it is crucial to know for sure what the score is. Sentiment analysis in marketing lets you pinpoint the strong and weak points of the product from the consumer’s point of view.
In the case of market research, the role of sentiment analysis is less integral but influential nonetheless. It gives another perspective, adds colors to the picture of the market, and lets you look at the situation from the ground level. This can reveal an untapped opportunity or two that helps you find a niche and establish the product on the market.

While in the initial stages these activities are relatively easy to handle with basic solutions, at some point it starts to make sense to use more elaborate tools and extract more sophisticated insights.

Types of Sentiment Analysis

To understand how to apply sentiment analysis in the context of your business operation, you need to understand its different types.

In this section, we will look at the major types of sentiment analysis.

Fine-grained Sentiment Analysis involves determining the polarity of the opinion. It can be a simple binary positive/negative differentiation. Depending on the use case, it can also use a finer scale (for example, very positive, positive, neutral, negative, very negative, as in five-star Amazon reviews).
Emotion detection is used to identify signs of specific emotional states in the text. Usually, a combination of lexicons and machine learning algorithms is used to determine which emotion is expressed.
Aspect-based sentiment analysis goes deeper. Its purpose is to identify an opinion regarding a specific element of the product, for example, the brightness of the flashlight in a smartphone. Aspect-based analysis is commonly used in product analytics to keep an eye on how the product is perceived and what its strong and weak points are from the customer’s point of view.
Intent Analysis is all about the action. Its purpose is to determine what kind of intention is expressed in the message. It is commonly used in customer support systems to streamline the workflow.
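
As a small illustration of the fine-grained type, the sketch below maps a numeric polarity score to the five labels mentioned above. It assumes the score comes from some upstream model and lies between -1.0 and 1.0. The function name and the cut-off values (0.2 and 0.6) are arbitrary choices for this example, not a standard.

```python
def fine_grained_label(score: float) -> str:
    """Map a polarity score in [-1.0, 1.0] to one of five labels.

    The cut-off values are illustrative assumptions, not a standard.
    """
    if not -1.0 <= score <= 1.0:
        raise ValueError("score must be between -1.0 and 1.0")
    if score >= 0.6:
        return "very positive"
    if score >= 0.2:
        return "positive"
    if score > -0.2:
        return "neutral"
    if score > -0.6:
        return "negative"
    return "very negative"

# One example score per bucket.
labels = [fine_grained_label(s) for s in (0.9, 0.3, 0.0, -0.4, -0.8)]
```

In practice the thresholds would be tuned on labeled data, for instance on reviews that already carry a star rating.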

Sentiment Analysis Algorithms Explained

There are two major Sentiment Analysis methods. Let’s look at both.

Rule-based approach

Rule-based sentiment analysis relies on an algorithm with a clearly defined description of the opinion to identify. This includes identifying subjectivity, polarity, or the subject of the opinion.

The rule-based approach involves a basic Natural Language Processing routine, with the following operations on the text corpus:

Stemming
Tokenization
Part of speech tagging
Parsing
Lexicon analysis (depending on the relevant context)
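
Two of these steps, tokenization and stemming, can be sketched in a few lines of plain Python. This is a deliberately naive illustration: the suffix list and the `naive_stem` helper are made up for this example, and a real stemmer (such as the Porter algorithm) has far more rules. Part-of-speech tagging and parsing are left out because they need a dedicated NLP library.

```python
import re

# Illustrative suffix list; a real stemmer has many more rules and exceptions.
SUFFIXES = ("ing", "ed", "ly", "es", "s")

def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def naive_stem(token: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

tokens = tokenize("The app started lagging in two days")
stems = [naive_stem(t) for t in tokens]
```

Here the text is tokenized first and stemmed second, since the stemmer works on individual tokens. Note that "lagging" becomes "lagg" rather than "lag", which shows how crude suffix stripping can be.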

Here’s how it works:

There are two lists of words. One includes only positive words, the other only negative ones.
The algorithm goes through the text and finds the words that appear in either list.
After that, the algorithm calculates which type of word is more prevalent in the text. If there are more positive words, the text is deemed to have a positive polarity; if there are more negative words, a negative one.
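
The three steps above can be sketched in Python roughly like this. The word lists are hypothetical and tiny (real lexicons contain thousands of entries), and returning "neutral" on a tie is an assumption of this sketch, since the description does not cover that case.

```python
import re

# Hypothetical word lists for illustration; real lexicons are far larger.
POSITIVE_WORDS = {"good", "great", "nice", "love", "excellent"}
NEGATIVE_WORDS = {"bad", "poor", "ugly", "hate", "terrible"}

def rule_based_polarity(text: str) -> str:
    """Count lexicon matches and return the prevailing polarity."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(1 for w in words if w in POSITIVE_WORDS)
    negatives = sum(1 for w in words if w in NEGATIVE_WORDS)
    if positives > negatives:
        return "positive"
    if negatives > positives:
        return "negative"
    return "neutral"  # tie or no matches: an assumption of this sketch

result_plain = rule_based_polarity("Great screen and a nice camera, but poor battery")
result_negated = rule_based_polarity("The camera is not good")
```

The first call returns "positive" (two positive words against one negative). The second call also returns "positive", although the sentence is clearly negative: a plain word count ignores the "not" in front of "good".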

The thing with rule-based algorithms is that while they deliver some sort of result, they lack the flexibility and precision that would make them truly usable. For instance, a rule-based approach doesn’t take context into account. However, it can be used for the general purpose of determining the tone of messages, which may come in handy for customer support.

These days, rule-based sentiment analysis is commonly used to lay the groundwork for the subsequent implementation and training of a machine learning solution.

Speaking of which…

Automatic Sentiment Analysis

While the rule-based approach is more of a toy than a real tool, automatic sentiment analysis is the real deal. It is the approach that truly digs into the text and delivers the goods. Instead of clearly defined rules, this type of sentiment analysis uses machine learning to figure out the gist of the message.

Because of that, the precision and accuracy of the operation can increase considerably, and you can process the information on numerous criteria without the setup getting too complicated.

In essence, the automatic approach involves supervised machine learning classification algorithms. In fact, sentiment analysis is one of the more sophisticated examples of how to use classification to maximum effect. In addition, unsupervised machine learning algorithms are used to explore the data.

Overall, sentiment analysis may involve the following types of classification algorithms:

Logistic Regression
Naive Bayes
Support Vector Machines
RNN derivatives such as LSTM and GRU
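
To give a feel for the supervised approach, here is a minimal from-scratch sketch of one algorithm from the list, Naive Bayes (the multinomial variant with add-one smoothing). The class name `NaiveBayesSentiment` and the four training sentences are invented for this example. A real project would use a library implementation and far more labeled data.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

class NaiveBayesSentiment:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # word frequencies per class
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocabulary = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            score = math.log(doc_count / total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                if word not in self.vocabulary:
                    continue  # skip words never seen in training
                count = self.word_counts[label][word]
                # smoothed log likelihood of the word given the class
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Invented toy training set, for illustration only.
train_texts = [
    "I love this phone, the screen is great",
    "great battery and a nice camera",
    "the app is slow and the buttons are poor",
    "terrible support, I hate the update",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = NaiveBayesSentiment().fit(train_texts, train_labels)
prediction_pos = model.predict("nice screen and great camera")
prediction_neg = model.predict("the update is slow and terrible")
```

Unlike the word-list approach, nothing here is hand-written about which words are positive or negative: the model learns it from the labeled examples.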

Sentiment Analysis Challenges

If there is one thing for sure, it is that sentiments are tricky beasts.

On the surface, it seems like a routine extraction of a particular insight. But in reality, sentiment extraction requires a bit of heavy lifting to really get the gist of it.

In this section, we will discuss the most common challenges that occur during the sentiment analysis operation.

Context and Polarity definition

Context is the thing that often stings a perfectly fine sentiment mining operation right in the eye. While a human being can grasp the context without much effort, things are very different from the algorithm’s perspective.

The thing is, algorithms can’t guess what they need to do to get the right results. They need to be configured for it.

Because of that, the sentiment analysis model must contain an additional component that would tackle the context of the message.

The key is text vectorization, which maps out the connections between the words in the text and their relations to each other.

This adds another dimension to text sentiment analysis and paves the way for a proper understanding of the tone and mode of the message. Tools like word2vec and doc2vec are commonly used for this.
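
The idea behind vectorization can be illustrated without any library. The sketch below is not word2vec: it swaps in plain co-occurrence counts, a much simpler technique built on the same intuition that words used in similar contexts get similar vectors. The function names, the window size and the four-sentence corpus are assumptions made for this example.

```python
import math
import re
from collections import Counter, defaultdict

def cooccurrence_vectors(sentences, window=2):
    """Build a sparse context-count vector for every word."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = re.findall(r"[a-z']+", sentence.lower())
        for i, word in enumerate(tokens):
            start, end = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Invented toy corpus, for illustration only.
corpus = [
    "the screen is great",
    "the screen is nice",
    "the battery is poor",
    "the battery is bad",
]
vectors = cooccurrence_vectors(corpus)
sim_great_nice = cosine(vectors["great"], vectors["nice"])
sim_great_battery = cosine(vectors["great"], vectors["battery"])
```

In this toy corpus "great" and "nice" appear in identical contexts, so their similarity is higher than that of "great" and "battery". Tools like word2vec learn dense vectors from large corpora instead of counting, but the vectors are compared in the same way.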

Subjectivity and Tone determination

The identification of the tone of the message is one of the fundamental features of sentiment analysis.

Overall, the tonality is relatively easy to derive from the wording of the message. Words like “nice” and “ugly” directly state the score.

The harder task is to determine whether the message is objective or subjective.

Here’s the thing. People formulate messages in a variety of ways. Sometimes the message contains no explicit sentiment; sometimes the implicit sentiment is not what it seems.

The only solution for that is a deeper and more varied vocabulary in the NLP model applied for sentiment analysis.

You need to take into account various options regarding the characterization of the product and group them into relevant categories. This way, the algorithm would be able to correctly determine subjectivity and its correlation with the tone.

Irony and Sarcasm identification

Among all the things sentiment analysis algorithms have trouble with, detecting irony and sarcasm is probably the most troublesome.

Why? Let’s call it The Treachery of Language.

You see, the way we use language is often subtly subversive. The words on their own might be a bunch of teddy bears, but the context they are used in can turn them into pink elephants on parade.

The algorithm does not get it. It takes jokes and snark at face value and classifies them as moderately negative or overwhelmingly positive sentiment. Even messier is the differentiation between irony and sarcasm.

The secret of successfully tackling this issue is in deep context analysis and a diverse corpus used to train the NLP sentiment analysis model. There must be

Defining a Neutral Tone

Determining tonality can be hard enough due to contextual peculiarities and irony/sarcasm contamination.

And then there is a neutral tone.

What is a neutral tone? It is a type of tone that doesn’t contain any signifiers that can be classified as either positive or negative. Instead, a neutral message just states some facts.

How to deal with neutral messages? There are two ways.

First, you need to take a look at the context and see which facts are stated. That makes all the difference and takes the lid off the unexpressed opinion. But this approach is manual and can be applied in special cases only.

The second is for the algorithm itself. A neutral tone can be defined by what it is not, i.e., a polar message. Basically, you tag as neutral everything that cannot be identified as positive, negative, or their variations.
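
This "neutral by exclusion" idea can be sketched as follows. The sketch assumes that some upstream classifier already produces a positive and a negative confidence score between 0 and 1. The function name and the 0.6 threshold are hypothetical and would need tuning on real data.

```python
def label_with_neutral(positive_score: float, negative_score: float,
                       threshold: float = 0.6) -> str:
    """Return a polar label only when one score clearly reaches the threshold."""
    if positive_score >= threshold and positive_score > negative_score:
        return "positive"
    if negative_score >= threshold and negative_score > positive_score:
        return "negative"
    return "neutral"  # everything that is not clearly polar

examples = [
    label_with_neutral(0.85, 0.10),  # clearly positive
    label_with_neutral(0.05, 0.90),  # clearly negative
    label_with_neutral(0.45, 0.40),  # neither score is convincing
]
```

The higher the threshold, the more messages end up in the neutral bucket, so the value controls how cautious the system is about calling something an opinion.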

In Conclusion

Sentiment Analysis is one of those technologies whose usefulness wholly depends on understanding its capabilities.

It can be extremely useful if you know how to use it, and completely useless if you apply it to something it is not supposed to do.