Syntax as it is
Since Chomsky and transformational grammar, linguists have formalized the structure of human language in a number of ways. These formalizations can be grouped under the name of formal cognitive grammars: formal, because they can be interpreted by a computer; cognitive, because many studies suggest that this is how our brain works, for instance when learning a language.
We believe that any artificial intelligence performing natural-language-understanding tasks should be based on this cognitive modelling of language.
Deep Learning and Syntax
Most black-box deep-learning models go directly from raw input to the solution of the problem. For instance, in a sentiment-extraction application they start from a sentence, transform it into a sequence of numbers, and output a label such as POSITIVE, NEGATIVE, or NEUTRAL.
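As a minimal sketch of this opacity (hypothetical code, not any real model: the vocabulary, weights, and thresholds are all made up for illustration), a black-box pipeline reduces a sentence to a sequence of numbers, scores it, and emits a bare label with no rationale attached:

```python
# Hypothetical sketch of a black-box sentiment pipeline:
# sentence -> sequence of numbers -> opaque label.

VOCAB = {"this": 1, "is": 2, "the": 3, "movie": 4, "that": 5,
         "mary": 6, "dislikes": 7}

# Toy "learned" weights: one number per vocabulary id, with no
# human-readable meaning -- which is exactly why the resulting
# label is hard to explain.
WEIGHTS = {1: 0.0, 2: 0.0, 3: 0.0, 4: 0.1, 5: 0.0, 6: 0.0, 7: -0.9}

def encode(sentence):
    """Turn raw text into the number sequence the model consumes."""
    return [VOCAB.get(tok, 0) for tok in sentence.lower().rstrip(".").split()]

def classify(sentence):
    """Score the id sequence and emit a bare label, with no rationale."""
    score = sum(WEIGHTS.get(i, 0.0) for i in encode(sentence))
    if score > 0.05:
        return "POSITIVE"
    if score < -0.05:
        return "NEGATIVE"
    return "NEUTRAL"

print(classify("This is the movie that Mary dislikes"))  # NEGATIVE
```

Nothing in the number sequence or the summed score tells a user *why* the label came out NEGATIVE; that gap is what the layered approach below addresses.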
The problem is that, however good such an approach is, it is intrinsically unable to explain why a certain label was predicted. Our layered approach, on the contrary, uses intermediate, intelligible structures, including syntactic representations. In particular, it makes use of syntactic representations inspired by Dependency Grammar: oversimplifying somewhat, we could say that the system recognizes familiar notions such as subject, direct object, temporal complement, etc. These intermediate representations are crucial to interpretability: for instance, in "This is the movie that Mary dislikes" we can justify a NEGATIVE label assigned to "movie" by showing that the AI understood that "that" is the direct object of "dislikes" and that "that" refers to "movie".
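The justification above can be sketched in a few lines. The parse triples, the relation names, and the coreference link here are illustrative assumptions (not the system's actual output format), but they show how a Dependency-Grammar-style structure lets the label be traced back through intelligible steps:

```python
# Sketch of an intelligible intermediate structure: a dependency-style
# parse of "This is the movie that Mary dislikes" as (token, head, relation)
# triples, plus a trace justifying a sentiment label. Relation names and
# the parse itself are illustrative assumptions.

# Heads are token indices; -1 marks the root.
PARSE = [
    ("This",     1, "subject"),        # 0
    ("is",      -1, "root"),           # 1
    ("the",      3, "determiner"),     # 2
    ("movie",    1, "predicate"),      # 3
    ("that",     6, "direct_object"),  # 4: object of "dislikes"
    ("Mary",     6, "subject"),        # 5
    ("dislikes", 3, "relative_clause") # 6: modifies "movie"
]

COREFERENCE = {4: 3}  # the relativizer "that" refers back to "movie"

def justify_label(parse, verb_index, label):
    """Explain a sentiment label by walking the parse edges."""
    verb = parse[verb_index][0]
    steps = [f"'{verb}' carries the {label} sentiment"]
    for i, (tok, head, rel) in enumerate(parse):
        if head == verb_index and rel == "direct_object":
            steps.append(f"'{tok}' is the direct object of '{verb}'")
            if i in COREFERENCE:
                target = parse[COREFERENCE[i]][0]
                steps.append(f"'{tok}' refers to '{target}'")
    return steps

for step in justify_label(PARSE, 6, "NEGATIVE"):
    print(step)
```

Each step in the returned list is a human-readable claim about the parse, which is what makes the final label auditable rather than a bare prediction.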
In our syntactic AI, deep learning is just a way to produce intelligible representations. In this sense our approach is inspired by Mrini et al. 2020 ("Rethinking Self-Attention: Towards Interpretability in Neural Parsing").
And the Human?
According to our philosophy, it should always be possible to modify the output of an AI in a symbolic (rule-based) way. So, on top of our syntactic layer, we have a layer of rules written by our linguists, used, for instance, to correct certain errors or to enhance certain representations.
These rules are also fundamental for guaranteeing the robustness of our solutions, which must also be able to cope with noisy text (social networks, SMS, speech transcriptions).
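A rule in this layer can be sketched as a function that inspects the parse (or the raw tokens) and rewrites it. Both rules below are made-up examples under our assumptions, not the actual rule language: one normalizes noisy SMS-style spelling before parsing, the other repairs a hypothetical systematic parser error:

```python
# Illustrative sketch of a symbolic rule layer on top of the neural parse:
# each rule inspects the data and may rewrite it. Both rules are invented
# examples, not the authors' actual rule set.

import re

def normalize_noisy_token(token):
    """Noisy-text rule: collapse runs of 3+ repeated characters
    ('soooo' -> 'so', 'baaad' -> 'bad')."""
    return re.sub(r"(.)\1{2,}", r"\1", token)

def fix_mislabelled_object(parse):
    """Parse-repair rule: if 'that' is tagged as a second 'subject' of a
    verb that already has another subject, relabel it 'direct_object'
    (a made-up systematic error, for illustration)."""
    fixed = []
    for tok, head, rel in parse:
        subjects = [t for t, h, r in parse if h == head and r == "subject"]
        if tok.lower() == "that" and rel == "subject" and len(subjects) > 1:
            rel = "direct_object"
        fixed.append((tok, head, rel))
    return fixed

# Noisy-text rule applied token by token:
print([normalize_noisy_token(t) for t in "soooo baaad".split()])

# Parse-repair rule applied to a parse where the neural layer wrongly
# tagged "that" as a second subject of "dislikes":
noisy_parse = [("that", 2, "subject"), ("Mary", 2, "subject"),
               ("dislikes", -1, "root")]
print(fix_mislabelled_object(noisy_parse))
```

Because the rules operate on the same intelligible parse structures, a linguist can add or adjust them without retraining the neural layer, which is the point of keeping the symbolic layer on top.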