NLP, our area of expertise, employs artificial intelligence to simulate the human ability to read and understand text.
SentiSquare’s NLP technology is based on Distributional Semantics. This approach makes it possible to represent the meaning of text without any supervision. The principle goes, “You shall know a word by the company it keeps” (Firth 1957).
Essentially, words are presumed to have similar meanings if they occur in similar contexts. That opens an opportunity to quantify meaning: textual expressions can be represented as vectors in a high-dimensional semantic space, where each vector encodes the expression's distribution over contexts (that is where the name Distributional Semantics comes from).
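To make the idea concrete, here is a minimal sketch of distributional vectors in Python. It is an illustration under simplifying assumptions, not SentiSquare's implementation: the toy corpus, the window size, and the helper names `context_vectors` and `cosine` are all invented for this example. Each word is represented by the counts of the words that appear near it, and two words are compared by the cosine of the angle between their vectors.

```python
from collections import Counter
from math import sqrt

# Toy corpus, invented for illustration only.
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the mouse",
    "the dog chases the ball",
    "the car needs fuel",
]

def context_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = {}
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            context = tokens[lo:i] + tokens[i + 1:hi]
            vectors.setdefault(word, Counter()).update(context)
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

vectors = context_vectors(corpus)
print("cat vs dog: ", cosine(vectors["cat"], vectors["dog"]))
print("cat vs fuel:", cosine(vectors["cat"], vectors["fuel"]))
```

In this toy corpus, "cat" and "dog" share most of their contexts and score much higher than "cat" and "fuel", which share none. No labels were needed to obtain that result.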
This way, our algorithm learns the meaning of any text. Excitingly, it means that SentiSquare AI is language-independent.
Once we have created such a meaning representation, we still need to interpret it in the desired way.
Take the word "loud", for instance. If you see it in the context of "music festival", you can assume it is meant positively. However, if you see it in the context of "noise", it is probably meant negatively. Still, "loud" means the same thing in both cases.
What to do? Supervision is required to distinguish such cases. That is why we add it to the mix and call the result "semi-supervised machine learning".
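One common way to combine a small amount of supervision with unlabelled data is self-training, sketched below in Python. This is a generic textbook technique used here purely for illustration; the source does not say which semi-supervised method SentiSquare uses. The example texts, the 0.3 threshold, and the helper names (`bow`, `centroids`, `classify`, `self_train`) are assumptions made for this sketch.

```python
from collections import Counter
from math import sqrt

def bow(text):
    """Bag-of-words count vector for a text."""
    return Counter(text.lower().split())

def cosine(u, v):
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def centroids(examples):
    """Sum the word counts of all examples that share a label."""
    result = {}
    for text, label in examples:
        result.setdefault(label, Counter()).update(bow(text))
    return result

def classify(text, cents):
    """Return (label, similarity) for the centroid closest to the text."""
    vec = bow(text)
    return max(((label, cosine(vec, c)) for label, c in cents.items()),
               key=lambda pair: pair[1])

def self_train(labelled, unlabelled, threshold=0.3):
    """Pseudo-label confident unlabelled texts, then rebuild the centroids."""
    cents = centroids(labelled)
    pseudo = []
    for text in unlabelled:
        label, score = classify(text, cents)
        if score >= threshold:  # keep only confident guesses
            pseudo.append((text, label))
    return centroids(labelled + pseudo)

# Two hand-labelled examples (the supervised part) ...
labelled = [
    ("loud music festival was great fun", "positive"),
    ("loud noise kept me awake all night", "negative"),
]
# ... and unlabelled texts the model can learn extra context from.
unlabelled = [
    "the festival had great bands",
    "constant noise from the street all night",
]

baseline = centroids(labelled)
model = self_train(labelled, unlabelled)
print("labels only:       ", classify("loud street", baseline)[0])
print("with self-training:", classify("loud street", model)[0])
```

With the two labelled examples alone, "loud street" is misjudged as positive. Once the unlabelled text has taught the model that "street" appears alongside "noise" and "night", the same phrase is classified as negative.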
Our use of multiple machine learning methods is why our AI boasts exceptional accuracy. The secret sauce is knowing how to combine unsupervised and supervised learning in just the right way.
In the interpretation phase, we combine all patterns to interpret the meaning in the desired way. Millions of context-based rules are created for each of our models and tuned to its specific functions.
As our algorithms learn directly from our clients’ data, they become a perfect fit for their operational and business needs.
How does SentiSquare AI deliver superior results in real-life deployment?
These are the 3 key success factors.
In any task that concerns text classification, the rate at which the AI returns a correct answer is crucial to its operational impact. While AI accuracy can never reach 100%, SentiSquare AI comes close to human-level accuracy.
Not all NLP models can deal with difficult, messy text data – and very few can adapt to different languages. Call transcripts, for instance, are tricky because they contain a lot of errors. Emails are messy because they contain boilerplate such as footers. SentiSquare AI can deal with any text in any language because it learns patterns directly from data.
For a classification model to be successful, we need to know what it should look for. Otherwise, it will dump a lot of incoming communication into the “other” category. But there is no way to create meaningful rules without clarity about the content of the data. In big data, it is nearly impossible to create by hand a dictionary of rules that covers everything. SentiSquare AI discovers the topics in the data on its own, through clustering. The result is a thorough understanding of topics and patterns in the data.
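As an illustration of how clustering can surface topics without a predefined list of categories, the Python sketch below groups short messages with a simple one-pass "leader" clustering over bag-of-words vectors. The choice of algorithm, the sample messages, the 0.2 similarity threshold, and the function names are assumptions made for this example; the source does not specify which clustering method SentiSquare uses.

```python
from collections import Counter
from math import sqrt

def bow(text):
    return Counter(text.lower().split())

def cosine(u, v):
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def leader_clustering(texts, threshold=0.2):
    """Assign each text to the most similar existing cluster, or start a new one."""
    members, centroids = [], []  # parallel lists: text indices and summed counts
    for i, text in enumerate(texts):
        vec = bow(text)
        best, best_sim = None, 0.0
        for c, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = c, sim
        if best is not None and best_sim >= threshold:
            members[best].append(i)
            centroids[best].update(vec)
        else:
            members.append([i])
            centroids.append(Counter(vec))
    return members, centroids

# Sample messages, invented for illustration only.
messages = [
    "invoice payment overdue",
    "payment for invoice failed",
    "reset my password please",
    "forgot password cannot login",
    "invoice payment reminder",
]

members, centroids = leader_clustering(messages)
for idx, centroid in zip(members, centroids):
    top_terms = [word for word, _ in centroid.most_common(2)]
    print(top_terms, [messages[i] for i in idx])
```

Here the billing messages and the password messages end up in two separate groups, and the most frequent terms in each group act as a rough topic label. A production system would need better text normalization and a more robust algorithm, but the principle is the same: the categories come from the data rather than from a hand-written rulebook.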