Applications as you want


In recent years, the development and popularity of multitask NLP benchmarks such as GLUE or SuperGLUE has channeled research toward a set of predefined tasks: Sentiment Analysis, Machine Translation, Question Answering, Textual Entailment, and so on. While this is beneficial for NLP research, since it makes it possible to compare the output of different systems and to assess progress in the state of the art, it hides the fact that real-life, language-based challenges are rarely comparable to these standardized tasks. In more than a decade of NLP work, we have never found a usable dataset readily available, whether for Suggestion Mining, Political Debate Analysis, or Social Media Listening. This individuality and uniqueness of each industrial NLP application represents a real challenge: underestimating it, and demoting the application to some "comparable" standard task, dramatically sacrifices quality.

Our cognitive approach to semantics allows us to capture the peculiarity of each NLP application without devoting time and money to the creation of a specific dataset. By generalizing over abstract language structures and exploiting domain-specific lexical semantics (mostly learned automatically), a few days or weeks are usually enough to deliver a specific, goal-oriented system able to emulate the capacities of a human reader.

Deep Learning and Applications

Even with pretrained language models such as those of the BERT family (see the article "10 Leading Language Models For NLP In 2021" for a quick introduction), plugging a deep learning model into a real-life application is always problematic because of the lack of application-specific and domain-specific datasets. On the other hand, what can be learned automatically, without any specific training (i.e. without any annotation), are domain-specific lexical semantic properties. Our approach to application parametrization therefore consists in creating unannotated corpora and using deep learning techniques to understand the properties of words in the specific context. Without any configuration cost, our system (ASI in this specific case) learns lexical properties of both a paradigmatic and a syntagmatic nature. These range from semantic similarity to more complex phenomena such as role attribution and inference. For example, a verb such as "seize" has a legal sense in an internal-security context ("The customs officer seized the undeclared goods.") but a different meaning in a military context ("The invaders seized the castle after a tough battle."); similarly, think of "kill" in the informatics and military domains.
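As an illustrative sketch (not the actual ASI pipeline), the idea of learning domain-specific lexical properties from unannotated text can be shown with simple distributional co-occurrence vectors: the same verb, "seize", keeps different lexical company in each domain corpus. The corpora, stopword list, and window size below are invented for the example.

```python
# Toy sketch: build per-domain co-occurrence vectors for "seized" from
# two tiny unannotated corpora and compare them. All data is invented.
from collections import Counter
import math

STOPWORDS = {"the", "a", "at", "after"}

def cooccurrence(sentences, target, window=2):
    """Count content words appearing within `window` tokens of `target`."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, tok in enumerate(tokens):
            if tok == target:
                nearby = tokens[max(0, i - window):i + window + 1]
                counts.update(t for t in nearby
                              if t != target and t not in STOPWORDS)
    return counts

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b.get(k, 0) for k in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

legal = ["the customs officer seized the undeclared goods",
         "police seized the counterfeit shipment at the border"]
military = ["the invaders seized the castle after a tough battle",
            "rebel forces seized the airfield at dawn"]

legal_vec = cooccurrence(legal, "seized")
military_vec = cooccurrence(military, "seized")

# The two domain contexts of "seized" share no content words here
print(cosine(legal_vec, military_vec))   # 0.0
```

A production system would of course use dense embeddings trained on much larger corpora, but the principle is the same: no annotation is needed, only raw domain text.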


And the Human?

With domain-specific lexical semantics and abstract general semantic structures, customizing an NLP application to meet new requirements becomes an easy and time-effective task. This application layer consists of powerful context-aware rules conforming to a classical IF-THEN format. Their power lies in the high flexibility of the IF part, which can set constraints on any aspect of the input linguistic structures.
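A minimal sketch of such a context-aware IF-THEN layer might look as follows. The feature names (lemma, role, domain) and the event labels are illustrative assumptions, not the actual rule vocabulary.

```python
# Hedged sketch of an IF-THEN application rule layer: each rule's IF
# part is an arbitrary predicate over linguistic features of the input.
# Feature names and event labels are invented for illustration.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Token:
    lemma: str
    role: str      # e.g. "predicate", "object"
    domain: str    # e.g. "legal", "military"

@dataclass
class Rule:
    name: str
    condition: Callable[[Token], bool]  # the flexible IF part
    action: Callable[[Token], str]      # the THEN part

rules = [
    Rule("legal-seizure",
         lambda t: t.lemma == "seize" and t.domain == "legal",
         lambda t: "event:CONFISCATION"),
    Rule("military-seizure",
         lambda t: t.lemma == "seize" and t.domain == "military",
         lambda t: "event:CAPTURE"),
]

def apply_rules(token: Token, rules: List[Rule]) -> List[str]:
    """Fire every rule whose IF part matches the input token."""
    return [rule.action(token) for rule in rules if rule.condition(token)]

print(apply_rules(Token("seize", "predicate", "legal"), rules))
```

The same token triggers different actions depending on its domain, which is exactly what constraint-rich IF parts make possible.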

Currently, our application rule layer is implemented as a parametrized Drools Knowledge Base. This both complies with common industry standards and facilitates further modification, thanks to the use of a well-known rule language.