With the emergence of Deep Learning and its subsequent evolution in the field of NLP, the most common development pattern for any language-aware application is:

  • Build an annotated corpus.
  • Run some AI algorithm to learn a model.
  • Use the model in real life applications.
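The three steps above can be sketched with a toy example. This is a minimal, purely illustrative sketch: the "corpus" is four hand-annotated sentences, and the "AI algorithm" is a simple perceptron over bag-of-words features rather than a deep network, but the annotate-train-use structure is the same.

```python
from collections import defaultdict

# Step 1: build an annotated corpus (here, a toy sentiment dataset).
# In practice this step is manual and usually expensive.
corpus = [
    ("great product works well", 1),
    ("terrible quality broke fast", 0),
    ("really great and well made", 1),
    ("broke after one day terrible", 0),
]

# Step 2: run a learning algorithm to produce a model.
# The "model" that comes out is just a set of numeric weights.
def train(corpus, epochs=10):
    weights = defaultdict(float)
    for _ in range(epochs):
        for text, label in corpus:
            score = sum(weights[w] for w in text.split())
            pred = 1 if score > 0 else 0
            if pred != label:
                update = 1 if label == 1 else -1
                for w in text.split():
                    weights[w] += update
    return weights

# Step 3: use the model in an application.
def predict(weights, text):
    return 1 if sum(weights[w] for w in text.split()) > 0 else 0

model = train(corpus)
print(predict(model, "great quality"))  # 1 (positive)
```

Note that once trained, the model's behavior is encoded entirely in `weights`: to change it, one must edit the corpus and retrain, which is exactly the modifiability problem discussed below.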

The problem with this pattern is the modifiability of the model. The model is created in an "artificial" context, with a selected corpus and high-level (usually expensive) annotations. When the model is deployed in a "real-life" context, it is likely to need corrections and additions. It is also likely that the business context will evolve, and so should the model.

However, manually modifying and evolving a trained model are difficult tasks.

Manual modification is made impossible by the fact that, in the end, what is learned is a set of weights over arrays of numbers: no operator, not even the most expert one, would ever be able to intervene on these weights to change the model's behavior. The only possibility is to change the annotated corpus, which implies a new, potentially expensive, annotate-train cycle with sometimes unpredictable results.

Evolution is also problematic, as it implies building a new annotated corpus that incorporates the new business context and running the learning pattern again. Basically, it means redoing everything from scratch.

Our researchers are studying more user- and business-friendly models that couple the power of Deep Learning with the needs arising from real-life NLP applications. The internal research project addressing this issue is ADA (Automatic domain adaptivity), and examples of how to couple Deep Learning with flexible rule systems can be found in the ASI section.