ACL-RelAcS



ACL-RelAcS is a corpus designed for semantic RELation ACquiSition (extraction and classification) in the scientific domain. It is composed of the abstracts and introductions of about 11,000 papers from the ACL Anthology Corpus, with automatically annotated domain concepts throughout the corpus and manually annotated semantic relations in 500 abstracts.
The corpus was developed at LIPN Université Paris 13 and at LATTICE, CNRS with funding from LABEX-EFL.

SemEval 2018 Dataset

Together with the ACL RD-TEC corpus, ACL-RelAcS was used in SemEval 2018 Task 7: Semantic Relation Extraction and Classification in Scientific Papers. Note that the SemEval task and dataset use the same abstracts but a different relation typology. If you are interested in the SemEval dataset, please go to the dedicated website.

Annotation

Concepts were identified and annotated fully automatically, based on a combination of terminology extraction and available ontological resources. The annotation relies on the Saffron Knowledge Extraction Framework (domain models for computer science and natural language processing), BabelNet and the terminology extractor TermSuite.
A typology of semantic relations between concepts is also proposed. This typology, consisting of 18 domain-specific and 3 generic relations, is the result of a corpus-based investigation of the text sequences occurring between concepts in sentences.
A sample of ~500 abstracts from the corpus has been manually annotated with semantic relations. Only explicit relations are taken into account, so that the data can serve to train or evaluate pattern-based semantic relation classification systems.
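
To illustrate what a pattern-based classifier over explicit relations might look like, here is a minimal sketch. It inspects the text sequence between two concept mentions in a sentence and maps it to a relation label. The patterns, the label names (USED_FOR, BASED_ON, IS_A) and the function names are assumptions made for this example; they are not taken from the ACL-RelAcS typology or from its file format.

```python
import re

# Hypothetical patterns and labels, for illustration only; they are NOT
# the relation types defined in the ACL-RelAcS typology.
PATTERNS = [
    (re.compile(r"^\s*(is|are) used (for|in)\s*$"), "USED_FOR"),
    (re.compile(r"^\s*(is|are) based on\s*$"), "BASED_ON"),
    (re.compile(r"^\s*(is|are) (a|an) (kind|type) of\s*$"), "IS_A"),
]


def between(sentence, concept_a, concept_b):
    """Return the text between the first mention of concept_a and the
    next mention of concept_b, or None if either mention is missing."""
    start = sentence.find(concept_a)
    if start == -1:
        return None
    start += len(concept_a)
    end = sentence.find(concept_b, start)
    if end == -1:
        return None
    return sentence[start:end]


def classify(sentence, concept_a, concept_b):
    """Return the label of the first pattern that matches the text
    between the two concepts, or None if no pattern matches."""
    gap = between(sentence, concept_a, concept_b)
    if gap is None:
        return None
    for pattern, label in PATTERNS:
        if pattern.match(gap.lower()):
            return label
    return None


if __name__ == "__main__":
    sentence = "Conditional random fields are used for named entity recognition."
    print(classify(sentence, "Conditional random fields", "named entity recognition"))
```

Because the sketch only looks at the surface string between two mentions, it can only capture relations that are stated explicitly, which is the kind of relation the manual annotation targets.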

If you are using ACL-RelAcS for academic research, please cite:

If you are using the SemEval Task 7 dataset for academic research, please cite:

  • Kata Gábor, Davide Buscaldi, Anne-Kathrin Schumann, Behrang QasemiZadeh, Haïfa Zargayouna, Thierry Charnois: SemEval-2018 Task 7: Semantic Relation Extraction and Classification in Scientific Papers. In Proceedings of the 12th International Workshop on Semantic Evaluation (SemEval-2018), New Orleans, LA, USA, June 2018.

Download

  • Full ACL-RelAcS corpus: 11,000 abstracts with automatic concept annotation can be downloaded here.
  • Manually annotated semantic relations: 350 (training) + 150 (test) abstracts can be downloaded here.
  • Fine-grained semantic relation typology used for ACL-RelAcS.
  • Creative Commons License
    This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License.