equipes:rcln:ancien_wiki:documents:dynamic_semantic_annotation [2019/04/04 15:15] rosse [Abstract] |
equipes:rcln:ancien_wiki:documents:dynamic_semantic_annotation [2020/11/23 18:42] (current version) garciaflores ↷ Page moved from equipes:rcln:documents:dynamic_semantic_annotation to equipes:rcln:ancien_wiki:documents:dynamic_semantic_annotation |
| ====== Dynamic semantic annotation ====== |
| |
===== Keywords ===== | ===== Keywords ===== |
Natural Language Engineering, Semantic Annotation, Content Management, Knowledge Engineering, Semantic Web | Natural Language Engineering, Semantic Annotation, Content Management, Knowledge Engineering, Semantic Web |
| |
===== Abstract ===== | ===== Abstract ===== |
The semantic annotation of documents plays a key role in many applications of textual content management (e.g. navigation, semantic information retrieval, publication). Semantic annotation consists in enriching a text with metadata whose semantics is given by a formal semantic model (e.g. indexing language, thesaurus, ontology) [ B. Popov, A. Kiryakov, D. Ognyanoff, D. Manov, A. Kirilov. « Kim – a semantic platform for information extraction and retrieval ». //Natural Language Engineering//, 10(3-4):375–392, 2004. ] <ref name="kirkayov"> A. Kiryakov, B. Popov, I. Terziev, D. Manov, and D. Ognyanoff. « Semantic annotation, indexing, and retrieval ». //Journal of Web Semantics//, 2(1):49–79, 2004.</ref> <ref name="uren"> V. Uren, P. Cimiano, J. Iria, S. Handschuh, M. Vargas-Vera, E. Motta, F. Ciravegna. « Semantic annotation for knowledge management: Requirements and a survey of the state of the art ». //Journal of Web Semantics//, 4, 2006.</ref>. A formal semantic representation is thus associated with the text so that search engines or software agents can jointly exploit the textual content (plain text search, distributional measures) and the formal semantics associated with it. | The semantic annotation of documents plays a key role in many applications of textual content management (e.g. navigation, semantic information retrieval, publication). Semantic annotation consists in enriching a text with metadata whose semantics is given by a formal semantic model (e.g. indexing language, thesaurus, ontology) [(B. Popov, A. Kiryakov, D. Ognyanoff, D. Manov, A. Kirilov. « Kim – a semantic platform for information extraction and retrieval ». //Natural Language Engineering//, 10(3-4):375–392, 2004.)] [(kirkayov>A. Kiryakov, B. Popov, I. Terziev, D. Manov, and D. Ognyanoff. « Semantic annotation, indexing, and retrieval ». //Journal of Web Semantics//, 2(1):49–79, 2004.)] [(uren>V. Uren, P. Cimiano, J. Iria, S. Handschuh, M. Vargas-Vera, E. Motta, F. Ciravegna. 
« Semantic annotation for knowledge management: Requirements and a survey of the state of the art ». //Journal of Web Semantics//, 4, 2006.)]. A formal semantic representation is thus associated with the text so that search engines or software agents can jointly exploit the textual content (plain text search, distributional measures) and the formal semantics associated with it. |
| |
The first-generation annotation tools are quite simple. They often merely bind references to named entities identified in the texts to existing or new instances of concepts in an ontology <ref name="magnini">B. Magnini, E. Pianta, O. Popescu, M. Speranza. « Ontology population from textual mentions: Task definition and benchmark ». In //Proceedings of the OLP2 workshop on Ontology Population and Learning//, Sydney, Australia, 2006.</ref> <ref name="giuliano">C. Giuliano, A. Gliozzo. « Instance-based ontology population exploiting named-entity substitution ». In //Proceedings of the 22nd International Conference on Computational Linguistics// (Coling 2008), pages 265–272, Manchester, August 2008.</ref>. However, the development of specialized applications of content management and linked data calls for renewed methods of semantic annotation: we need methods and tools that provide richer annotation expressiveness (e.g. annotation with respect to concepts and relations and not only instances) while being robust, generic and adaptable to different domains and use cases. | The first-generation annotation tools are quite simple. They often merely bind references to named entities identified in the texts to existing or new instances of concepts in an ontology [(magnini>B. Magnini, E. Pianta, O. Popescu, M. Speranza. « Ontology population from textual mentions: Task definition and benchmark ». In //Proceedings of the OLP2 workshop on Ontology Population and Learning//, Sydney, Australia, 2006.)] [(giuliano>C. Giuliano, A. Gliozzo. « Instance-based ontology population exploiting named-entity substitution ». In //Proceedings of the 22nd International Conference on Computational Linguistics// (Coling 2008), pages 265–272, Manchester, August 2008.)]. 
However, the development of specialized applications of content management and linked data calls for renewed methods of semantic annotation: we need methods and tools that provide richer annotation expressiveness (e.g. annotation with respect to concepts and relations and not only instances) while being robust, generic and adaptable to different domains and use cases. |
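To make the instance-level annotation described above concrete, here is a minimal sketch, not taken from any of the cited tools. It assumes a toy ontology given as a plain dictionary; the `ex:` identifiers, the `Annotation` class and the `annotate` function are all hypothetical names introduced for illustration. Mentions are found by exact string matching, which is far simpler than real named-entity recognition.

```python
from dataclasses import dataclass

# Hypothetical toy ontology: instance identifier -> (concept, surface labels).
ONTOLOGY = {
    "ex:Paris": ("ex:City", ["Paris"]),
    "ex:CNRS": ("ex:Organization", ["CNRS"]),
}


@dataclass(frozen=True)
class Annotation:
    start: int      # character offset where the mention begins
    end: int        # character offset just past the mention
    instance: str   # ontology instance the mention is bound to
    concept: str    # concept that the instance belongs to


def annotate(text, ontology=ONTOLOGY):
    """Bind every exact occurrence of a label to its ontology instance."""
    annotations = []
    for instance, (concept, labels) in ontology.items():
        for label in labels:
            pos = text.find(label)
            while pos != -1:
                annotations.append(
                    Annotation(pos, pos + len(label), instance, concept)
                )
                pos = text.find(label, pos + 1)
    # Return annotations in reading order.
    return sorted(annotations, key=lambda a: a.start)


if __name__ == "__main__":
    sample = "The LIPN is a CNRS laboratory near Paris."
    for a in annotate(sample):
        print(sample[a.start:a.end], "->", a.instance, f"({a.concept})")
```

Each annotation only links a text span to an instance. Annotating with respect to concepts and relations, as called for above, would need a richer structure than this flat list.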
| |
===== Goal ===== | ===== Goal ===== |
- To extend the scope of the update to the case where aligned semantic resources are being used in parallel for the annotation process. | - To extend the scope of the update to the case where aligned semantic resources are being used in parallel for the annotation process. |
| |
While the PhD student can rely on existing work on ontology population <ref name="magnini"/> <ref name="giuliano"/>, semantic annotation <ref>Borislav Popov, Atanas Kiryakov, Damyan Ognyanoff, Dimitar Manov, and Angel Kirilov. //Kim – a semantic platform for information extraction and retrieval//. | While the PhD student can rely on existing work on ontology population [(magnini)] [(giuliano)], semantic annotation [(Borislav Popov, Atanas Kiryakov, Damyan Ognyanoff, Dimitar Manov, and Angel Kirilov. //Kim – a semantic platform for information extraction and retrieval//. |
Nat. Lang. Eng., 10(3-4):375–392, 2004</ref> <ref name="kirkayov"/> <ref name="uren"/> and the evolution of semantic reference models, especially ontologies <ref>Pieter De Leenheer and Tom Mens. //Ontology evolution : State of the art | Nat. Lang. Eng., 10(3-4):375–392, 2004)] [(kirkayov)] [(uren)] and the evolution of semantic reference models, especially ontologies [(Pieter De Leenheer and Tom Mens. //Ontology evolution : State of the art |
and future directions//. In Martin Hepp, Pieter De Leenheer, Aldo de Moor, | and future directions//. In Martin Hepp, Pieter De Leenheer, Aldo de Moor, |
and York Sure, editors, | and York Sure, editors, |
Ontology Management : Semantic Web, Semantic | Ontology Management : Semantic Web, Semantic |
Web Services, and Business Applications | Web Services, and Business Applications |
, pages 131–176. Springer, 2007</ref> <ref>Zied Sellami, Valérie Camps, and Nathalie Aussenac-Gilles. //Dynamo-mas : | , pages 131–176. Springer, 2007)] [(Zied Sellami, Valérie Camps, and Nathalie Aussenac-Gilles. //Dynamo-mas : |
a multi-agent system for ontology evolution from text.// | a multi-agent system for ontology evolution from text.// |
J. Data Semantics, | J. Data Semantics, |
2(2-3) :145–161, 2013.</ref> <ref>Rim Djedidi and Marie-Aude Aufaure. //Ontology change management.// In | 2(2-3) :145–161, 2013.)] [(Rim Djedidi and Marie-Aude Aufaure. //Ontology change management.// In |
A. Paschke, H. Weigand, W. Behrendt, K. Tochtermann, and T. Pellegrini, | A. Paschke, H. Weigand, W. Behrendt, K. Tochtermann, and T. Pellegrini, |
editors, | editors, |
Proceedings of I-KNOW'09 and I-SEMANTICS'09 | Proceedings of I-KNOW'09 and I-SEMANTICS'09 |
, pages 611–621, | , pages 611–621, |
Graz, Austria, September 2009. Verlag der Technischen Universität Graz</ref>, she/he should extend and structure them in accordance with the project goals. | Graz, Austria, September 2009. Verlag der Technischen Universität Graz)], she/he should extend and structure them in accordance with the project goals. |
| |
The student will also rely on the text-based knowledge acquisition tools developed by the RCLN team (Terminae <ref name="terminae">Nathalie Aussenac-Gilles, Sylvie Despres, and Sylvie Szulman. //The TERMINAE Method and Platform for Ontology Engineering from texts.// In Paul | The student will also rely on the text-based knowledge acquisition tools developed by the RCLN team (Terminae [(terminae>Nathalie Aussenac-Gilles, Sylvie Despres, and Sylvie Szulman. //The TERMINAE Method and Platform for Ontology Engineering from texts.// In Paul |
Buitelaar and Philipp Cimiano, editors, | Buitelaar and Philipp Cimiano, editors, |
Bridging the Gap between Text and | Bridging the Gap between Text and |
Knowledge - Selected Contributions to Ontology Learning and Population | Knowledge - Selected Contributions to Ontology Learning and Population |
from Text | from Text |
, pages 199–223. IOS Press, January 2008.</ref> and SemEx <ref>François Lévy, Adeline Nazarenko, Abdoulaye Guissé, Nouha Omrane, and | , pages 199–223. IOS Press, January 2008.)] and SemEx [(François Lévy, Adeline Nazarenko, Abdoulaye Guissé, Nouha Omrane, and |
Sylvie Szulman. //An environment for the joint management of written policies and business rules.// In | Sylvie Szulman. //An environment for the joint management of written policies and business rules.// In |
Proceedings of the International Conference on | Proceedings of the International Conference on |
Tools with Artificial Intelligence (IEEE-ICTAI 10) | Tools with Artificial Intelligence (IEEE-ICTAI 10) |
, pages 142–149, 2010</ref>) and on the experience of semantic corpus building, whether automatic <ref name="yue2">Yue Ma, Adeline Nazarenko, and Laurent Audibert. //Formal description | , pages 142–149, 2010)]) and on the experience of semantic corpus building, whether automatic [(yue2>Yue Ma, Adeline Nazarenko, and Laurent Audibert. //Formal description |
of resources for ontology-based semantic annotation//. In | of resources for ontology-based semantic annotation//. In |
Proceedings of the | Proceedings of the |
International Conference on Language Resources and Evaluation (LREC | International Conference on Language Resources and Evaluation (LREC |
2010) | 2010) |
, Malta, May 2010. ELRA</ref><ref name="yue1">Yue Ma, François Lévy, and Sudeep Ghimire. //Reasoning with Annotations of Texts.// In | , Malta, May 2010. ELRA)] [(yue1>Yue Ma, François Lévy, and Sudeep Ghimire. //Reasoning with Annotations of Texts.// In |
The 24th Florida Artificial Intelligence Research Society | The 24th Florida Artificial Intelligence Research Society |
Conference (FLAIRS-24) | Conference (FLAIRS-24) |
, pages 192–197, USA, May 2011.</ref> or manual <ref name="fort1">Karën Fort, Adeline Nazarenko, and Sophie Rosset. //Modeling the Complexity of Manual Annotation Tasks : a Grid of Analysis.// In | , pages 192–197, USA, May 2011.)] or manual [(fort1>Karën Fort, Adeline Nazarenko, and Sophie Rosset. //Modeling the Complexity of Manual Annotation Tasks : a Grid of Analysis.// In |
Proceedings of | Proceedings of |
the 24th International Conference on Computational Linguistics (COLING | the 24th International Conference on Computational Linguistics (COLING |
2012) | 2012) |
, Mumbai, India, December 2012.</ref>. She/he could also make use of related work on semantic information retrieval <ref>Haïfa Zargayouna. | , Mumbai, India, December 2012.)]. She/he could also make use of related work on semantic information retrieval [(Haïfa Zargayouna. |
//Indexation sémantique de documents XML.// PhD | //Indexation sémantique de documents XML.// PhD |
thesis. Université Paris-Sud, Dec. 2005.</ref>. | thesis. Université Paris-Sud, Dec. 2005.)]. |
| |
At first, it will be necessary to work on classic ontology and thesaurus models with the traditional standards (SKOS, OWL-DL) and on well-established technologies, but other semantic models could eventually be proposed. | At first, it will be necessary to work on classic ontology and thesaurus models with the traditional standards (SKOS, OWL-DL) and on well-established technologies, but other semantic models could eventually be proposed. |
The work will be supervised by Pr. Adeline Nazarenko and Pr. François Lévy. | The work will be supervised by Pr. Adeline Nazarenko and Pr. François Lévy. |
| |
The student will be integrated in the RCLN team and benefit from its expertise in natural language processing, knowledge engineering and semantic web. In particular, RCLN has solid experience in semantic annotation (manual annotation <ref>K. Fort. //Les ressources annotées, un enjeu pour l’analyse de contenu : vers une méthodologie de l’annotation manuelle de corpus//. PhD thesis in computer science, Université Paris 13 – Sorbonne Paris Cité, Villetaneuse, France, 2012.</ref> <ref name="fort1"></ref> or based on machine learning <ref name="yue1"></ref>, formalisms and resources for annotation <ref>Y. Ma, A. Nazarenko, L. Audibert. //Formal description of resources for ontology-based semantic annotation//. In Proceedings of the International Conference on Language Resources and Evaluation (LREC 2010), Malta, May 2010. ELRA.</ref> <ref> N. Omrane, A. Nazarenko, P. Rosina, S. Szulman, C. Westphal. //Lexicalized ontology for a business rules management platform: An automotive use case//. In Proceedings of the 5th International Symposium on Rules, International Business Rules Forum (RuleMF@BRF), Ft Lauderdale, Florida, USA, November 2011.</ref>) and text-based ontology design <ref name="terminae"></ref>. It also knows how to integrate those methods of acquisition and annotation in content analysis tools <ref>A. Guissé, F. Lévy, A. Nazarenko. //Un moteur sémantique pour explorer des textes réglementaires//. In Actes des 22èmes journées francophones d'Ingénierie des Connaissances, Chambéry, 2011.</ref> <ref>F. Lévy, A. Nazarenko, A. Guissé. //Annotation, indexation et parcours de documents numériques//. Revue des Sciences et Technologies de l'Information, 13(3/2010):121–152, 2010.</ref> <ref>A. Nazarenko, A. Guissé, F. Lévy, N. Omrane, S. Szulman. //Integrating Written Policies in Business Rule Management Systems//. In Rule-Based reasoning, Programming, and Applications, volume 6826 of Lecture Notes in Computer Science, pages 99–113, Barcelona, Spain, 2011.</ref>. 
| The student will be integrated in the RCLN team and benefit from its expertise in natural language processing, knowledge engineering and semantic web. In particular, RCLN has solid experience in semantic annotation (manual annotation [(K. Fort. //Les ressources annotées, un enjeu pour l’analyse de contenu : vers une méthodologie de l’annotation manuelle de corpus//. PhD thesis in computer science, Université Paris 13 – Sorbonne Paris Cité, Villetaneuse, France, 2012.)] [(fort1)] or based on machine learning [(yue1)], formalisms and resources for annotation [(Y. Ma, A. Nazarenko, L. Audibert. //Formal description of resources for ontology-based semantic annotation//. In Proceedings of the International Conference on Language Resources and Evaluation (LREC 2010), Malta, May 2010. ELRA.)] [( N. Omrane, A. Nazarenko, P. Rosina, S. Szulman, C. Westphal. //Lexicalized ontology for a business rules management platform: An automotive use case//. In Proceedings of the 5th International Symposium on Rules, International Business Rules Forum (RuleMF@BRF), Ft Lauderdale, Florida, USA, November 2011.)]) and text-based ontology design [(terminae)]. It also knows how to integrate those methods of acquisition and annotation in content analysis tools [(A. Guissé, F. Lévy, A. Nazarenko. //Un moteur sémantique pour explorer des textes réglementaires//. In Actes des 22èmes journées francophones d'Ingénierie des Connaissances, Chambéry, 2011.)] [(F. Lévy, A. Nazarenko, A. Guissé. //Annotation, indexation et parcours de documents numériques//. Revue des Sciences et Technologies de l'Information, 13(3/2010):121–152, 2010.)] [(A. Nazarenko, A. Guissé, F. Lévy, N. Omrane, S. Szulman. //Integrating Written Policies in Business Rule Management Systems//. In Rule-Based reasoning, Programming, and Applications, volume 6826 of Lecture Notes in Computer Science, pages 99–113, Barcelona, Spain, 2011.)]. |
| |
The student will work at LIPN (University Paris 13 - Sorbonne Paris Cité & CNRS) where he/she will be assigned a desk. He/she will have access to local facilities and data resources. | The student will work at LIPN (University Paris 13 - Sorbonne Paris Cité & CNRS) where he/she will be assigned a desk. He/she will have access to local facilities and data resources. |
| |
===== References ===== | ===== References ===== |
<references/> | |
| |
| ~~REFNOTES~~ |
===== Conferences and summer schools ===== | ===== Conferences and summer schools ===== |
* [[http://ldq.semanticmultimedia.org/|LDQ 2015]]: 2nd Workshop on Linked Data Quality co-located with ESWC 2015, Portorož, Slovenia (Deadline: March 6, 2015) | * [[http://ldq.semanticmultimedia.org/|LDQ 2015]]: 2nd Workshop on Linked Data Quality co-located with ESWC 2015, Portorož, Slovenia (Deadline: March 6, 2015) |
| |
===== PhD topic (English) ===== | ===== PhD topic (English) ===== |
[[Media:SujetThese_RCLNdynamique_V4_en.pdf|Dynamic semantic annotation: analysis, modeling and implementation]] | {{ :equipes:rcln:documents:sujetthese_rclndynamique_v4_en.pdf | Dynamic semantic annotation: analysis, modeling and implementation}} |
===== PhD topic (French) ===== | ===== PhD topic (French) ===== |
[[Media:SujetThese_RCLNdynamique_V4.pdf|Dynamique de l’annotation sémantique : analyse, modélisation et mise en œuvre]] | |
| |
| {{ :equipes:rcln:documents:sujetthese_rclndynamique_v4.pdf |Dynamique de l’annotation sémantique : analyse, modélisation et mise en œuvre}} |