What is sentiment analysis?

Posted on: May 24, 2022

Sentiment analysis is a key tool for online companies to learn about their audience by understanding how people feel about their products or services. It is also sometimes known as opinion mining and combines elements of natural language processing (NLP), computational linguistics, and text analysis with deep learning.

This analysis of customer sentiment draws on subjective information from social media posts and comments, reviews, and survey responses. The thinking behind sentiment analysis is that emotion drives decision-making. By recognising patterns of positive or negative sentiment expressed by customers, companies can learn how to improve everything from customer experience to customer support and, ultimately, drive more sales.

How does sentiment analysis work?

Sentiment analysis primarily works through machine learning. A lexicon is created by collating and analysing data containing words that describe people’s feelings. Words like “impressed” are labelled as a positive sentiment, while words like “disappointed” are labelled as a negative sentiment. Sentiment can also be categorised as neutral. Each word is allocated a sentiment score by bots that apply a set of scoring rules. An online review or piece of customer feedback can contain both negative and positive points, so the scores are combined to give an overall sentiment for any given text.
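
The scoring idea described above can be sketched in a few lines of Python. This is a minimal illustration rather than how any particular tool works: the lexicon, its scores, and the function names `score_text` and `label` are all assumptions made up for the example.

```python
import re

# Hypothetical toy lexicon: word -> sentiment score (assumed values, for illustration only).
LEXICON = {
    "impressed": 1.0,
    "love": 1.0,
    "great": 1.0,
    "disappointed": -1.0,
    "slow": -0.5,
    "poor": -1.0,
}


def score_text(text: str) -> float:
    """Sum the lexicon scores of every word in the text; unknown words count as 0."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(word, 0.0) for word in words)


def label(score: float) -> str:
    """Map an overall score to a positive, negative, or neutral label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


review = "I was impressed by the design but disappointed by the slow delivery."
overall = score_text(review)
print(overall, label(overall))
```

Here the review mixes one positive word with two negative ones, so the overall score comes out below zero and the text is labelled negative. Real systems typically use far larger lexicons or learned models, but the principle of combining word-level scores into one overall sentiment is the same.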

Bots can also scour text on social media, such as the content of tweets, or use sentiment analysis to understand how certain demographics are feeling by looking at discussions on particular forums. In a way, this application of data science is just another form of market research. People are used to publicly sharing their opinions about brands they love or about poor service they feel they have received, so social media monitoring offers businesses ongoing insights into people’s feelings about what they offer or about recent marketing campaigns, for example.

The machine learning models that power sentiment analysis tools can be created in Python using NLTK, the Natural Language Toolkit. The toolkit is free and open source, providing text processing libraries along with the test datasets that programmers need when applying sentiment analysis. NLTK allows a variety of tasks to be carried out, such as parse tree visualisation and tokenisation. Parse tree visualisation displays the grammatical structure of a sentence as a tree, breaking down its syntax. Tokenisation breaks text data down into smaller units called tokens, such as words and numbers. These tokens are then collected into lists which become part of the vocabulary in a sentiment analysis system.
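
To make tokenisation concrete, here is a small sketch that uses only the Python standard library. It is not NLTK's own tokeniser, which handles many more cases; the regular expression, the sample reviews, and the names `tokenise` and `build_vocabulary` are assumptions chosen for illustration.

```python
import re
from collections import Counter


def tokenise(text: str) -> list[str]:
    """Split text into lowercase word and number tokens (a simple regex rule, not NLTK's tokeniser)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?|\d+(?:\.\d+)?", text.lower())


def build_vocabulary(documents: list[str]) -> Counter:
    """Count how often each token appears across all documents."""
    vocabulary = Counter()
    for document in documents:
        vocabulary.update(tokenise(document))
    return vocabulary


# Hypothetical sample reviews, invented for this example.
reviews = [
    "Delivery took 3 days, but I'm impressed!",
    "Impressed? No. Disappointed after 3 days.",
]
tokens = tokenise(reviews[0])
vocabulary = build_vocabulary(reviews)
print(tokens)
print(vocabulary.most_common(3))
```

Punctuation is dropped and case is folded, so "Impressed?" and "impressed!" count as the same token. The resulting vocabulary is the kind of word list that a sentiment analysis system can then score or learn from.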

One of the most popular pre-trained models for NLP is BERT (Bidirectional Encoder Representations from Transformers). There are lots of tutorials and case studies available online which show how sentiment analysis is put into practice, for example by building a sentiment classifier using the IMDB movie reviews dataset. Sentiment analysis can also be applied to news articles to gauge how the general public feels about world events, as digital agency The DataFace did for the close-run 2016 Trump-Clinton election.

How accurate is sentiment analysis?

Because subjectivity plays a big part in sentiment analysis, it is still difficult for machines to successfully understand customer sentiment. Even though there have been huge leaps in the science behind NLP, human language is complex, and the nuances of sarcasm or slang, for instance, are difficult for neural networks to grasp in the way that the human brain can. Of course, even the human brain can fail to pick up on sarcasm or slang! For example, without an understanding of context or knowledge of slang, a bot could construe words or phrases like “sick” or “smashing it” in customer reviews as negative rather than positive.
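
The slang problem is easy to reproduce with a toy scorer. The sketch below is purely illustrative: both dictionaries, their scores, and the function names are assumptions, and a hand-written override table is just one simple way to show the effect, not a recommended solution.

```python
import re

# Hypothetical general-purpose lexicon (assumed scores, for illustration only).
GENERAL_LEXICON = {"sick": -1.0, "smashing": -0.5, "awful": -1.0, "brilliant": 1.0}

# Hypothetical slang overrides: phrases whose informal meaning flips the polarity.
SLANG_OVERRIDES = {"sick": 1.0, "smashing it": 1.0}


def naive_score(text: str) -> float:
    """Score each word on its own, with no knowledge of slang or context."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum(GENERAL_LEXICON.get(word, 0.0) for word in words)


def slang_aware_score(text: str) -> float:
    """Score slang phrases first, remove them, then score the remaining words."""
    remaining = text.lower()
    total = 0.0
    # Check longer phrases before shorter ones so "smashing it" wins over "smashing".
    for phrase in sorted(SLANG_OVERRIDES, key=len, reverse=True):
        pattern = r"\b" + re.escape(phrase) + r"\b"
        matches = re.findall(pattern, remaining)
        total += SLANG_OVERRIDES[phrase] * len(matches)
        remaining = re.sub(pattern, " ", remaining)
    return total + naive_score(remaining)


review = "These trainers are sick, the brand is smashing it."
print(naive_score(review))        # word-by-word reading comes out negative
print(slang_aware_score(review))  # slang-aware reading comes out positive
```

Note that the override table is still blind to context: it would also treat "I felt sick after the meal" as praise. That is exactly why slang and sarcasm remain hard, and why context-sensitive models are needed rather than ever-longer word lists.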

Artificial intelligence relies upon accurate labelling of data in the supervised learning stage. However, humans can feel conflicting emotions at the same time, and the meaning can be ambiguous, leading to a “neutral” label which doesn’t offer much insight. Even with unambiguous labelling, when various words or phrases are put together in a sentence, the result can be as difficult for humans to parse as it is for machines. The polarity inherent in choosing between positive and negative labels limits machine learning techniques, which can become more accurate with fine-tuning, but the process remains complex.
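
To show how much a supervised model depends on its labels, here is a minimal sketch of a naive Bayes classifier written in plain Python. Naive Bayes is chosen here only because it is compact enough to read in one go; the training sentences, their labels, and the function names are all invented for the example.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical hand-labelled training data (invented for illustration).
TRAINING_DATA = [
    ("great service and fast delivery", "positive"),
    ("really impressed with the quality", "positive"),
    ("love this brand", "positive"),
    ("disappointed with the poor service", "negative"),
    ("slow delivery and poor quality", "negative"),
    ("terrible support", "negative"),
]


def tokenise(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Count words per label and documents per label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenise(text))
    return word_counts, label_counts


def predict(text, word_counts, label_counts):
    """Return the label with the highest log-probability under naive Bayes."""
    vocabulary = {word for counts in word_counts.values() for word in counts}
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenise(text):
            # Add-one (Laplace) smoothing so unseen words do not zero out the probability.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


word_counts, label_counts = train(TRAINING_DATA)
print(predict("impressed with the fast delivery", word_counts, label_counts))
print(predict("poor support and slow delivery", word_counts, label_counts))
```

Everything the model knows comes from the word counts under each label, so a mislabelled or ambiguous training example shifts those counts directly. With only two labels available, a sentence that carries mixed feelings still has to be forced into one of them.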

In speech, emotions are communicated through more than just words. Intonation, facial expression, body language, posture, and biophysical signals like sweating or blushing all convey meaning. In the written word, punctuation, capitalisation, emojis, and other creative expressions can alter the meaning of a sentence. And that’s all in addition to someone’s choice of words, articulation, and grammar. Another consideration is the cognitive gap: the difference between what people say or write and what they actually think.

If and when AI can put together all of these pieces of information to accurately interpret them, we may well have reached the singularity.

Biometrics and data privacy

In the healthcare sector, there is increasing interest in how biometrics measured by wearables could be used alongside patient and customer feedback to gauge wellbeing and health. Wearable technology could then be upgraded to help patients to counteract the negative effects they may be experiencing.

MIT Media Lab has developed an app that can detect whether a person is experiencing emotions such as stress, pain, or frustration by monitoring their heartbeat. It then releases a scent to help the person slow their heart rate and feel better equipped to cope.

As iris recognition, fingerprints, and facial recognition are now more commonly used for account logins, these biometrics can be used for sentiment analysis, but there are of course implications regarding data privacy. Zoom recently introduced new features to analyse customer sentiment using conversation transcripts from sales or business meetings. While many of the most popular streaming platforms have used APIs for some time now to build profiles of users’ preferences for a better experience, other use cases similar to Zoom’s are raising issues around privacy.

In fact, there is some debate over whether sentiment analysis is the same thing as emotion AI. Since the dawn of social media, sentiment analysis has been used to categorise text in everything from product reviews to blogs. For some, sentiment analysis is solely about language and tone, whether written or spoken. Emotion AI relates more specifically to reading the face and facial expressions, and with biometric authentication becoming more popular, this data is more readily available.

Nandita Sampath, a policy analyst with Consumer Reports, has defined emotion recognition as AI that attempts to predict emotions in real time based on someone’s faceprint and that may also draw on other biometric data, such as a person’s gait. With the potential for biometric information to be extracted via virtual assistants, cars, games, mobile phones, wearables, toys, virtual school software, and border controls, questions around privacy and ownership of information remain pertinent.

Find your niche with a master’s degree

Learn about analysing sentiment in this expanding field of data science with an online MSc Computer Science with Data Analytics. From text analytics to semantic knowledge bases, tech companies such as Amazon and Spotify have optimised customer sentiment monitoring to strengthen their offering and become leaders in data capability.

Discover how you can expand your career options by specialising with a master’s from Keele University today.
