Sentiment Analysis 101

By the time you finish reading this sentence, roughly 3,000 terabytes of new data will have been uploaded to the internet. Take a minute to reflect on that, and then consider that during your minute of reflection the number increased by another 18,000 terabytes.

The amount of data on the internet, even just on a single social media website like Facebook or Instagram, is simply unfathomable. Case in point: since the start of this paragraph, over 300 hours of YouTube videos have been uploaded. You get the idea.

Data is the oil of the internet: everybody wants it, and it pays the bills for many businesses. It is quite literally why credit card companies offer you rewards for spending through them. You don’t just get those extra air miles or cashback for free. Credit card companies sell your data and then give you a small cut of what they make in the form of these rewards programs. The real kickbacks ultimately go to their partners and investors.

If you didn’t know this and you’re feeling a little cheated, you should be. It is in fact why emerging alternatives such as the Brave browser are emphasizing control of this data you unknowingly give away for free or at a very small price. 

Privacy is of course another issue, and people are becoming increasingly aware of how carelessly large social media platforms and service providers are handling their data. Sometimes it can be hard to criticize them, though, because there is so much of it to keep track of! It’s almost impossible to contain it all.

This overload has also made it impractical for these companies to analyze individual pieces of data. This is why they are instead turning to methods and programs which extract the “main ideas” from a particular bundle of data, rather than meticulously combing through a relentlessly growing number of specific pieces of information. One of the rapidly growing technologies which does this is known as sentiment analysis.

What is sentiment analysis? 

[Image: CryptoMood Sentiment Generator, opinions about cryptocurrency]

Sentiment analysis is quite a broad field. Although it sounds technical (and certainly parts of it are), it is quite easy to understand. Sentiment analysis is the process of identifying whether a given text is communicating an opinion which is positive, negative, or neutral about any specific topic. This involves something called natural language processing.

There are many ways to analyze language and none of these approaches is perfect. However, this broader method of analyzing data is clearly preferable to meticulously picking through each little component. As the technologies used in sentiment analysis become more refined and efficient, its adoption is becoming more common across every industry which is heavily reliant on data, which is basically all of them.

How does sentiment analysis work? 

Imagine someone came up to you and asked you to create something like sentiment analysis. After giving it some thought, you’d figure out that the best way to start would be to go through every single word in the English language and determine whether it is positive, negative, or neutral. Then, you’d simply add or subtract the score of each word in a sentence to get its overall sentiment. However, you’d soon realize that this isn’t good enough since there are other things like expressions (bad ass) or double negatives (can’t not) which give you some really funky results with this basic approach.
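To make that first attempt concrete, here is a minimal sketch in Python. The tiny word list and its scores are made up for illustration (a real lexicon would cover thousands of words), and the function name is hypothetical.

```python
# Hypothetical mini-lexicon: +1 is positive, -1 is negative.
# Any word that is not listed counts as 0 (neutral).
WORD_SCORES = {
    "good": 1, "great": 1, "love": 1,
    "bad": -1, "awful": -1, "hate": -1,
}

def naive_sentiment(text: str) -> int:
    """Sum the per-word scores of a text, ignoring case and edge punctuation."""
    words = (w.strip(".,!?") for w in text.lower().split())
    return sum(WORD_SCORES.get(w, 0) for w in words)

print(naive_sentiment("I love this great phone"))       # 2
print(naive_sentiment("That guitar solo was bad ass"))  # -1
```

The second sentence is a compliment, yet it scores -1 because "bad" is counted on its own. That is exactly the kind of funky result the word-by-word approach produces.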

So, you’d start going through all of the common expressions you can think of, along with all the short phrases which don’t jibe with the simple word-by-word algorithm, and give them the same positive/negative/neutral categorization. Finally, you’d remember that the internet has a whole bunch of special characters like emojis which might also be important to consider, so you’d create a list of those too and give each one its appropriate sentiment score. You’d then add and subtract the scores of all of these words, expressions, and emojis, and that would give you a standardized score on a fixed scale (many tools use a range such as +1 to -1) indicating positive or negative sentiment.
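The three lists described above can be combined roughly as follows. This is only a sketch: the entries, the scores, and the choice to average the matches into a -1 to +1 range are assumptions for illustration, and other tools scale their scores differently.

```python
import re

# All three lexicons below are hypothetical samples, not real datasets.
WORDS = {"good": 1.0, "love": 1.0, "bad": -1.0, "hate": -1.0}
PHRASES = {"bad ass": 1.0, "not bad": 0.5}
EMOJIS = {"😀": 1.0, "😡": -1.0}

def score_text(text: str) -> float:
    """Average the scores of all matched phrases, emojis, and words.

    The result lies between -1.0 and +1.0; 0.0 means neutral or no match.
    """
    remaining = text.lower()
    total, hits = 0.0, 0

    # Phrases go first and are removed once matched, so that the "bad"
    # in "bad ass" is not counted again as a negative word.
    for phrase in sorted(PHRASES, key=len, reverse=True):
        pattern = r"\b" + re.escape(phrase) + r"\b"
        remaining, count = re.subn(pattern, " ", remaining)
        total += PHRASES[phrase] * count
        hits += count

    for emoji, value in EMOJIS.items():
        count = remaining.count(emoji)
        total += value * count
        hits += count

    for word in re.findall(r"[a-z]+", remaining):
        if word in WORDS:
            total += WORDS[word]
            hits += 1

    return total / hits if hits else 0.0

print(score_text("That solo was bad ass 😀"))   # 1.0
print(score_text("I hate this, it is bad 😡"))  # -1.0
print(score_text("Not bad at all"))             # 0.5
```

Averaging is just one way to standardize the total. Dividing by the number of matches keeps long and short texts on the same scale.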

What has just been described are three different sets of “lexicons”: large datasets which are used by sentiment analysis APIs (application programming interfaces) so that they can accurately analyze a given text. They basically use these lexicons as a template. This is where the floodgates open and the critics come rushing out: “How can you trust that these templates are accurate enough? They can’t possibly account for every nuance. Only humans can do that.”

While critics certainly have a point, the power of these lexicons is that they provide a standardized point of reference for sentiment analysis technologies. There are in fact multiple popular lexicons which are widely used by sentiment analysis programs, such as the Sentiment Strength Twitter Dataset and the Amazon Reviews for Sentiment Analysis. Although they may not be perfect or entirely accurate, at least you can trust that the results of the sentiment analysis tests you run will be consistent, even between different sentiment analysis platforms that rely on the same lexicon.

What is sentiment analysis used for? 

The use cases for sentiment analysis fall into two broad categories: voice of customer and voice of employee. As you might have guessed from their names, the first focuses on customers and the second concerns employees. Both involve the same fundamental problem, which is sorting through the large volume of feedback a company receives both externally from clientele and internally from its own workforce. Logically, as a company continues to grow, a way to effectively sift through the increasing amount of information it has to deal with on a daily basis becomes that much more valuable.

In the context of customers, sentiment analysis can be used to evaluate how well a particular initiative is working. Suppose you are a large company and you start promoting a new product on your various social media channels. Obviously, it’s impossible for anyone to go through every single comment, like, and retweet you’re receiving on a daily basis about that new product. However, you’d certainly like to know whether that high volume of interaction is positive or negative. Using sentiment analysis technologies can help you get a sense of whether your new product is the next big thing or simply a viral joke.

Similarly, when it comes to your own employees there is no shortage of things you would like to know. Many large companies tend to have disclaimers in their work contracts highlighting the fact that they can access things like your work computer and your work email. In some cases these things are quite literally considered their physical and intellectual property.

Although these clauses were initially put in place to protect against things like unsubstantiated lawsuits, employee theft, and the theft of intellectual property, these same binding agreements can help businesses be better employers. They can, for example, analyze the emails circulating between employees about a new manager or a change in workflow to get a sense of how their employees really feel about these changes. When it comes to large companies with tens of thousands of employees, HR departments can use sentiment analysis to route complaints and feedback to the appropriate personnel.

[Image: CryptoMood Desktop Terminal, cryptocurrency trading tool]

A third use case for sentiment analysis has recently emerged: finance. The effect sentiment analysis will have on things like investing will arguably be more profound than in any other context. When news about a certain asset is broadcast, it sways the opinions of traders who then rush to buy or sell that asset based on that information. Not only that, but price trends are sustained so long as the positive or negative sentiment of traders continues.

There is of course a bit of a feedback loop there, wherein the news can and often does amplify existing sentiments once a stark price trend has taken hold. Now, imagine you could theoretically analyze all reputable news sites about an asset you were interested in. Without having to read all of the articles individually, you would be able to know what the general opinion about that asset is from the perspective of these often-authoritative news sources. Then, once you detect a sudden spike or decrease in sentiment, you could position yourself accordingly and ride the social sentiment wave of traders.

If you graphed this, you could even have a sense of when that sentiment would peak or bottom, allowing you to theoretically buy the dip and sell the top. Applications such as CryptoMood are working hard to provide a sentiment analysis tool that does exactly that for cryptocurrency traders. 
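As a rough illustration of spotting such a shift, the sketch below smooths a series of sentiment scores with a rolling average and flags the points where the smoothed value jumps or drops sharply. The hourly scores, the window size, and the threshold are all made-up assumptions. This is not how CryptoMood or any other product necessarily works, and it is certainly not trading advice.

```python
from statistics import mean

def rolling_mean(values, window):
    """Average each point with up to window - 1 of the points before it."""
    return [
        mean(values[max(0, i - window + 1): i + 1])
        for i in range(len(values))
    ]

def find_shifts(scores, window=3, threshold=0.25):
    """Return (index, change) pairs where smoothed sentiment moves sharply."""
    smooth = rolling_mean(scores, window)
    shifts = []
    for i in range(1, len(smooth)):
        change = smooth[i] - smooth[i - 1]
        if abs(change) > threshold:
            shifts.append((i, round(change, 2)))
    return shifts

# Hypothetical average news sentiment per hour for one asset.
hourly = [0.0, 0.0, 0.1, 0.0, 0.8, 0.9, 0.9, 0.2, -0.7, -0.8]
for index, change in find_shifts(hourly):
    direction = "up" if change > 0 else "down"
    print(f"hour {index}: sentiment moving {direction} ({change:+.2f})")
```

With this sample data, the sketch should flag the rise that starts around hour 4 and the drop that starts around hour 8. A larger window reacts more slowly but is less easily fooled by a single noisy reading.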

The future of sentiment analysis

In the previous section about how sentiment analysis works, what we reviewed was something called “rule-based” sentiment analysis. It is rule-based because it works off of a very simple protocol: add or subtract the sentiment values of the words, expressions, and emojis using the standardized lexicons for reference, and then use the total score to judge the sentiment of that text.

There is, however, another approach to sentiment analysis which is becoming increasingly popular, called AI-based sentiment analysis. Despite the obvious complexity of what’s going on under the hood in terms of code, there is one fundamental principle at play: machine learning. Instead of simply referring to a bunch of pre-existing lexicons, machine learning involves feeding the AI samples of actual text and telling it what sentiment each one expresses, so that it learns which words or phrases to look for. After enough trials with these “real” text samples, the AI can do a pretty good job of identifying sentiment in other contexts, and these AI sentiment analyzers are only getting better by the day!
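To give a feel for what "learning from samples" means, here is a deliberately tiny sketch of one classic technique, a Naive Bayes classifier with add-one smoothing, written with only the Python standard library. The six training sentences are invented, and real AI-based analyzers use far larger datasets and often very different models.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    return text.lower().split()

def train(samples):
    """Count how often each word appears under each label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in samples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return word_counts, label_counts

def predict(text, word_counts, label_counts):
    """Return the label with the highest Naive Bayes log-probability."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            # Add-one smoothing so unseen words do not zero out the score.
            seen = word_counts[label][word]
            score += math.log((seen + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical labeled samples standing in for "real" text.
SAMPLES = [
    ("i love this product", "positive"),
    ("great service and great price", "positive"),
    ("what a wonderful experience", "positive"),
    ("i hate this product", "negative"),
    ("terrible service and awful price", "negative"),
    ("what a horrible experience", "negative"),
]

model = train(SAMPLES)
print(predict("great wonderful experience", *model))  # positive
print(predict("awful horrible price", *model))        # negative
```

Nobody told the program that "great" is positive or that "awful" is negative. It worked that out from which labeled samples those words appeared in, which is the key difference from the lexicon approach.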

Another notable development in sentiment analysis is something called intent analysis. Similarly to sentiment analysis, it identifies whether a given text belongs in a certain category. However, it is instead focused on identifying the intent of the text: is it a complaint, a question, a suggestion, or truly just an opinion? This sort of data is extremely valuable to marketers and even to the day-to-day operations of any business with a large volume of online interaction. It can, for example, make it easier for customer service representatives to address customer needs by automatically connecting customers with the appropriate department based on the intent analysis of their message to the company. Intent analysis, in combination with sentiment analysis, could significantly change the way businesses interact with consumers and even with their own employees. The sky is the limit as far as these technologies are concerned, and it is going to be very interesting to see what new developments and applications will emerge in the years to come!
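A bare-bones version of that routing idea might look like the sketch below. The cue phrases, intent names, and departments are all hypothetical, and production systems would more likely use a trained model than a handful of keywords.

```python
# Hypothetical cue phrases for each intent, checked in this order.
INTENT_CUES = {
    "complaint": ("refund", "broken", "disappointed", "not working"),
    "question": ("how do i", "where", "when", "?"),
    "suggestion": ("you should", "it would be nice", "please add"),
}

# Hypothetical mapping from intent to the team that should respond.
ROUTES = {
    "complaint": "customer support",
    "question": "help desk",
    "suggestion": "product team",
    "opinion": "marketing",
}

def detect_intent(message: str) -> str:
    """Return the first intent whose cue appears in the message."""
    text = message.lower()
    for intent, cues in INTENT_CUES.items():
        if any(cue in text for cue in cues):
            return intent
    return "opinion"  # Fallback when no cue matches.

def route(message: str) -> str:
    """Pick the department that should handle the message."""
    return ROUTES[detect_intent(message)]

print(route("My order arrived broken, I want a refund"))  # customer support
print(route("How do I reset my password?"))               # help desk
print(route("It would be nice to have a dark mode"))      # product team
print(route("I think the new logo looks fine"))           # marketing
```

Checking complaints first is a design choice: a message like "Why is my order broken?" is both a question and a complaint, and this sketch treats the complaint as more urgent.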