TakeFive is a new approach, leveraging Framester, that transforms a text into a frame-oriented knowledge graph.
TakeFive performs dependency parsing, identifies the words that evoke lexical frames, locates the roles and fillers for each frame, runs coercion techniques, and formalises the results as a knowledge graph. This formal representation complies with the frame semantics used in Framester.
Frame-semantic parsing refers to the combined tasks of frame detection and Semantic Role Labeling (SRL) on natural language text.
Let us consider the following sentence from the Wall Street Journal (WSJ) (available from https://catalog.ldc.upenn.edu/):
Despite recent declines in yields, investors continue to pour cash into money funds.
By performing frame-semantic parsing on this sentence, we recognize that the text fragment to pour evokes, for example, the frame Cause_motion from FrameNet, meaning that the sentence expresses an occurrence of this frame. We also recognize that the text fragments investors and cash denote, respectively, the argument of the Agent.cause_motion role and the argument of the Theme.cause_motion role, both of which are involved in the Cause_motion situation occurrence.
The most difficult part of frame-semantic parsing is SRL over the detected frames, which are drawn from a reference resource. SRL is a well-known NLP task, and well-established challenges, e.g. SemEval, regularly include it. There are different approaches to SRL, using different reference resources for frames and roles. The most widely used resources are FrameNet, VerbNet and PropBank.
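The outcome of frame-semantic parsing on this sentence can be pictured as a small record that is then flattened into graph triples. The following is only a minimal sketch: the class `FrameOccurrence`, the function `to_triples`, the node name `cause_motion_1` and the `rdf:type` edge label are illustrative assumptions, not the actual TakeFive or Framester vocabulary.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class FrameOccurrence:
    """One evoked frame together with its role fillers (hypothetical structure)."""
    frame: str                      # frame name, e.g. "Cause_motion"
    trigger: str                    # text fragment that evokes the frame
    roles: Dict[str, str] = field(default_factory=dict)  # role -> filler text


def to_triples(occ: FrameOccurrence, index: int = 1) -> List[Tuple[str, str, str]]:
    """Flatten a frame occurrence into (subject, predicate, object) triples.

    The node naming scheme and the "rdf:type" label are assumptions made
    for illustration only.
    """
    node = f"{occ.frame.lower()}_{index}"
    triples = [(node, "rdf:type", occ.frame)]
    triples.extend((node, role, filler) for role, filler in occ.roles.items())
    return triples


# The WSJ example from the text: "to pour" evokes Cause_motion.
occurrence = FrameOccurrence(
    frame="Cause_motion",
    trigger="to pour",
    roles={"Agent.cause_motion": "investors", "Theme.cause_motion": "cash"},
)

for triple in to_triples(occurrence):
    print(triple)
```

Each role becomes an edge from the frame occurrence to its filler, which is the sense in which the sentence "expresses an occurrence" of the frame.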
VerbNet is a broad-coverage verb lexicon for English, with links to other data sources such as WordNet and FrameNet. It contains semantic roles and verb classes that correspond to Levin's classes and include multiple verb senses. Verb classes can therefore be considered akin to word synsets: they generalise over verbs based on their shared syntactic behavior. These verb classes feature a simple two-layer hierarchy. For example, the verb conquer is a member of the class subjugate-42.3, and hence a sense Conquer_42030000 is created (the sense of conquer in that class).
VerbNet further contains semantic roles, which correspond to the relations between a verb sense and its arguments. Each class has multiple frames (either syntactic- or semantic-oriented), which define a list of predicates associated with their arguments. There is a (partial) morphism between syntactic and semantic frames, so that semantic roles ('arguments') are also associated with patterns that characterize the syntactic behavior of a verb in that class.
For example, the roles defined for the class subjugate-42.3 are Agent, Patient and Instrument, meaning that an agent subjugates a patient with some instrument. Here Agent and Patient are necessary roles, and Instrument is an optional role. Verb senses help in determining whether a particular verb instance conforms to the underlying semantics of the class. In the case of the verb conquer, its only sense is included in the class subjugate-42.3. VerbNet maps verbs to FrameNet frames, e.g. the verb sense Conquer_42030000 is mapped to the frame Conquering. The version of VerbNet used in the TakeFive evaluation is 3.1, and the data come from the RDF porting of VerbNet 3.1 that is included in Framester.
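The distinction between necessary and optional roles can be made concrete with a small check. This is a sketch under stated assumptions: `VerbClass` and its `accepts` method are hypothetical names introduced here, and only the role inventory of subjugate-42.3 is taken from the text.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class VerbClass:
    """A VerbNet-style class with necessary and optional roles (hypothetical structure)."""
    name: str
    required_roles: FrozenSet[str]
    optional_roles: FrozenSet[str]

    def accepts(self, roles: Iterable[str]) -> bool:
        """True if all necessary roles are present and no unknown role is used."""
        observed = set(roles)
        allowed = self.required_roles | self.optional_roles
        return self.required_roles <= observed <= allowed


# Role inventory of subjugate-42.3 as described in the text.
SUBJUGATE = VerbClass(
    name="subjugate-42.3",
    required_roles=frozenset({"Agent", "Patient"}),
    optional_roles=frozenset({"Instrument"}),
)

print(SUBJUGATE.accepts({"Agent", "Patient"}))                # necessary roles only
print(SUBJUGATE.accepts({"Agent", "Patient", "Instrument"}))  # with the optional role
print(SUBJUGATE.accepts({"Agent"}))                           # Patient is missing
```

A verb instance whose arguments fail this check would not conform to the semantics of the class, which is the kind of test that verb senses are said to support.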
FrameNet contains descriptions and annotations of English words using Frame Semantics. FrameNet contains frames, which describe a situation, state or action. Each frame has semantic roles ('frame elements') that are much more semantically detailed than VerbNet ones. FrameNet also defines a subsumption relation between either frames or roles. The subsumption relation can be used to create a hierarchy of classes. The FrameNet version used here (1.5) is the RDF-OWL porting originally designed in our recent work, and now integrated in Framester, after a substantial revision with respect to frame and role hierarchies that removed cycles and conflicting subsumptions.
Each frame can be evoked by lexical units (LUs) belonging to different parts of speech. In version 1.5, FrameNet covers about 10,000 lexical units and 1024 frames.
Let us consider the following sentence:
[The Spaniards]Conqueror [conquered]Lexical Unit [the Incas]Theme.
In the above example, The Spaniards is the argument (we will also refer to it as the filler) of the role Conqueror, the Incas is the argument (or filler) of the role Theme, and conquered is the lexical unit evoking the frame.
TakeFive addresses the problem of detecting the verb (lemma and VerbNet verb class), along with its arguments, and relating them to their corresponding VerbNet roles. In the example sentence shown above, TakeFive is able to detect the verb conquered, The Spaniards (i.e. the filler of the VerbNet role Conqueror), and the Incas (i.e. the filler of the VerbNet role Theme). Verbs, fillers and roles are therefore the entities we are looking for, and the ones we need to properly associate with the input sentence.
The backbone of TakeFive is a two-step approach:
In other words, we aim at using background knowledge and formal reasoning in order to associate semantic roles with syntactic dependencies.
For a given input sentence we collect semantic information from Framester and syntactic information from Stanford CoreNLP. Word Frame Disambiguation (WFD) detects the frames evoked by each verb when the verb is polysemous, whereas CoreNLP provides a dependency tree along with the POS tags. Verbs are located in the input sentence by looking at the POS tags contained in the CoreNLP output. The triple representation of the dependency tree returned by CoreNLP over the running example is the following:
In this example, the triple nsubj, conquered-3, Spaniards-2 relates the verb conquered to its argument Spaniards. Dependency types such as nsubj and dobj are generalized to interface roles through a set of heuristics that we have defined in order to add a semantic layer on top of the syntactic one. Interface roles include Agent, Undergoer, Recipient, Eventuality and Oblique. In the running example, we apply a simple nsubj → Agent heuristic. Framester contains a role taxonomy that allows us to (partly) map interface roles to the roles defined in existing resources. In the following we show a sample of the Framester role taxonomy, where the role agent subsumes entity.
The text is also tagged with the frames evoked by the verb in the sentence, and provided by the Word Frame Disambiguation API.
We have defined 23 simple heuristics to map CoreNLP dependency triples to interface roles. We refer to:
By applying our heuristic nsubj → Agent to the dependency triple nsubj, conquered-3, Spaniards-2, we assign the role Agent to the argument Spaniards. As the next step, we need to check whether the CoreNLP interface role is compatible with the VerbNet interface role of the underlying verb (conquered in our example).
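The heuristic step can be sketched as a lookup from dependency type to interface role. The sketch below is illustrative only: the names `Dependency`, `HEURISTICS` and `assign_interface_roles` are assumptions, the table contains just the two mappings mentioned for the running example (nsubj → Agent and dobj → Undergoer) rather than the full set of 23 heuristics, and the triples are written by hand instead of being read from actual CoreNLP output.

```python
import re
from typing import Dict, Iterable, NamedTuple, Tuple


class Dependency(NamedTuple):
    """A dependency triple: relation, governor token, dependent token."""
    relation: str
    governor: str
    dependent: str


# Only the two mappings named for the running example; the remaining
# heuristics are not reproduced here.
HEURISTICS: Dict[str, str] = {
    "nsubj": "Agent",
    "dobj": "Undergoer",
}


def strip_index(token: str) -> str:
    """Drop the positional suffix of an indexed token, e.g. 'Spaniards-2' -> 'Spaniards'."""
    return re.sub(r"-\d+$", "", token)


def assign_interface_roles(deps: Iterable[Dependency]) -> Dict[str, Tuple[str, str]]:
    """Map each dependent to (verb, interface role) when a heuristic applies."""
    assignments: Dict[str, Tuple[str, str]] = {}
    for dep in deps:
        role = HEURISTICS.get(dep.relation)
        if role is not None:  # dependency types without a heuristic are skipped
            assignments[strip_index(dep.dependent)] = (strip_index(dep.governor), role)
    return assignments


# Hand-written triples for "The Spaniards conquered the Incas".
triples = [
    Dependency("nsubj", "conquered-3", "Spaniards-2"),
    Dependency("dobj", "conquered-3", "Incas-5"),
    Dependency("det", "Spaniards-2", "The-1"),  # no heuristic in this sketch
]

print(assign_interface_roles(triples))
```

Keeping the heuristics in a plain table makes the syntactic-to-semantic layer easy to inspect; the compatibility check against VerbNet interface roles is a separate step and is not modelled here.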
The following algorithm computes VerbNet interface and specific roles of extracted verbs from an input sentence.
The algorithm takes as input a sentence, along with the CoreNLP and Framester information for that sentence, and generates pairs of VerbNet interface roles and VerbNet specific roles. It iterates over the verbs and checks whether each verb is monosemous, i.e. has only one meaning. If the verb is monosemous, the algorithm retrieves the verb senses associated with that verb (line 27). Otherwise, it searches for the frames evoked by the verb using the mappings defined in Framester (line 3). If no frames are evoked by the verb, all VerbNet verb senses are retrieved for it (lines 4-5). In case of multiple frame evocations, the most specific frame with respect to the subsumption hierarchy of frames is chosen (lines 8 to 17). If the evoked frames are not connected through this relation (meaning that they are independent), or if there is more than one most specific frame, the first VerbNet verb sense is selected. Finally, for each pair of a verb and its selected VerbNet sense, a pair of VerbNet interface and VerbNet specific roles is returned by using the Framester mappings (lines 30-34).
For our example sentence, the two dependency triples detected using CoreNLP are nsubj, conquered-3, Spaniards-2 and dobj, conquered-3, Incas-5. Through our heuristics, we assign the CoreNLP interface roles Agent and Undergoer to Spaniards and Incas, respectively. Following the algorithm, the detected verb of the sentence, conquered, is monosemous, since its only sense belongs to the class subjugate-42.3. Its VerbNet sense Conquer_42030000 is therefore extracted and associated with it.
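The sense-selection logic described above can be approximated as follows. This is a simplified sketch and not the published algorithm: the function names, the `parents` dictionary encoding frame subsumption, the `frame_to_sense` mapping and the toy frames `Frame_A`, `Frame_B`, `Frame_C` with senses `s1`, `s2` are all assumptions introduced for illustration. A verb with a single sense keeps it; otherwise the most specific evoked frame decides, with a fallback to the first sense.

```python
from typing import Dict, List, Optional, Sequence, Set


def ancestors(frame: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """All frames that subsume `frame`, following the (assumed) parent links."""
    seen: Set[str] = set()
    stack = list(parents.get(frame, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def most_specific_frame(frames: Sequence[str], parents: Dict[str, Set[str]]) -> Optional[str]:
    """Return the unique frame that subsumes no other candidate, else None.

    None covers both independent frames and several equally specific frames.
    """
    specific = [
        f for f in frames
        if not any(f in ancestors(g, parents) for g in frames if g != f)
    ]
    return specific[0] if len(specific) == 1 else None


def select_senses(
    verb_senses: List[str],
    evoked_frames: Sequence[str],
    parents: Dict[str, Set[str]],
    frame_to_sense: Dict[str, str],
) -> List[str]:
    """Pick the VerbNet sense(s) for one verb."""
    if len(verb_senses) == 1:      # single-sense verb: nothing to disambiguate
        return list(verb_senses)
    if not evoked_frames:          # no frame evoked: keep every sense
        return list(verb_senses)
    frame = most_specific_frame(evoked_frames, parents)
    if frame is not None and frame in frame_to_sense:
        return [frame_to_sense[frame]]
    return verb_senses[:1]         # fallback: first sense


# Toy hierarchy: Frame_A subsumes Frame_B; Frame_C is unrelated.
parents = {"Frame_B": {"Frame_A"}}

print(select_senses(["Conquer_42030000"], [], parents, {}))
print(select_senses(["s1", "s2"], ["Frame_A", "Frame_B"], parents, {"Frame_B": "s2"}))
print(select_senses(["s1", "s2"], ["Frame_A", "Frame_C"], parents, {}))
```

The mapping from the selected sense to VerbNet interface and specific roles relies on the Framester mappings and is left out of this sketch.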
Here we describe the procedure for checking the compatibility of the CoreNLP interface roles, detected using the 23 heuristics we have defined, with the VerbNet interface roles detected through Algorithm 1 above. The objective is to return all roles and fillers for each argument of the verbs in the input sentence. The process starts by considering the input sentence and the outputs of CoreNLP, Framester and Algorithm 1 applied to the input sentence. Let