RCLN: Knowledge Representation and Natural Language
Head of team: Nathalie Pernelle
The RCLN team brings together expertise in Natural Language Processing (NLP), corpus linguistics, the Semantic Web, data mining and machine learning. This complementary expertise gives the team a distinctive position and enables it to carry out innovative work on the analysis and exploration of text corpora, as well as on the acquisition of knowledge from texts and knowledge graphs.
We develop approaches for tackling complex problems over texts and knowledge graphs, such as deep syntactic parsing, joint extraction of named entities and semantic relations, and graph completion, using methods drawn from deep learning, combinatorial optimization, data mining and inductive logic programming. We are particularly active in the processing of under-resourced and specialized languages, the detection of neologisms, and the analysis of microblogs.
The RCLN team plays an active role in the direction and work of the Labex “Empirical Foundations of Linguistics” (EFL), for which we coordinate the Computational Semantic Analysis axis. At the local level, we are members of the MathSTIC federation, where we co-lead the “Optimization and Learning Applied to Digital Content” axis.
The team is structured around three closely related research areas:
- Algorithms for syntactic and semantic analysis of texts
- Exploration and generation of texts
- Knowledge acquisition
These axes are complementary: syntactic-semantic analysis of corpora can serve as a basis for text mining and knowledge acquisition, and conversely, analysis algorithms can benefit from the knowledge acquired.
Scientific literature mining is a transversal axis of the team. Research publications and shared datasets could be better exploited by intelligent systems to support and accelerate scientific work, by facilitating expert identification, the generation of state-of-the-art reviews, and the generation of scientific hypotheses, as well as the search for evidence that supports or contradicts a given hypothesis.