Notes on the SemEval 2015 conference

Wednesday, June 3, 2015

  1. On the plane I read Eneko's paper on the STS task. I noted the following quotations…
    • The top 10 systems [English task] did not show statistically significant variation among them.
    • Aligning words between sentences has been the most popular approach for the top three participants (DLS@CU, ExBThemis, Samsung). They use WordNet (Miller, 1995), Mikolov Embeddings (Mikolov et al., 2013; Baroni et al., 2014) and PPDB (Ganitkevitch et al., 2013).
    • Most teams add a machine learning algorithm to learn the output scores, but note that Samsung team did not use it in their best run.
    • Only about one fifth of the systems were unsupervised, among which, the top performing system, UMDuluth-BlueTeam-run1, was able to come within 0.1 correlation points from the top performing system on Wikipedia and within 0.03 on the Newswire dataset. This relatively narrow gap suggests that unsupervised semantic textual similarity is a viable option for languages with limited resources.
  2. Systems worth studying, with papers worth reading
    • Samsung: 4th place (no statistically significant difference from 1st) without machine learning
    • ExBThemis: Best paper award; 2nd in English, 1st in Spanish (they'll give a presentation today)
    • DLS@CU: Our alignment master Sultan (1st in English two years in a row)

Thursday, June 4 2015

Marco Baroni keynote conference on distributional semantic models

  1. He starts with some words about Adam Kilgarriff:
    • “He was totally allergic to bullshit”
    • “He wrote a great paper in 1997: I don't believe in word senses”
    • “He never got a paper accepted in the main ACL conference (which is a scandal, given his contribution to the field)”
    • “Recently we were talking about all these young people doing deep learning: they don't make any distinction between language and vision”
  2. On the multimodal skip-gram model
    • Inspired by language learning by children
    • Using the Frank corpus
    • Training the model for standard distributional semantics with 20K words extracted from a baby language-learning corpus… they try to predict the objects the babies are learning (hat, ring) with a skip-gram model. The corpus is composed of words and object images (I don't totally understand it). The task is called matching words with objects
  3. “Look at the kitty! Look at the oink!”
    • The model tries to predict from the word kitty the right cute animal image
    • Concept learning, word learning, synonym learning inspired by the human cognition process
  4. Concentration lost
    • I lost concentration writing this chronicle while Marco speaks… I just picked up some isolated words, like…
    • How MMSkipGram visualizes new concepts
    • As far as I understand, he's trying to train an MMSkipGram model in order to learn unknown words and associate them with images, inspired by the baby language-learning process… looks interesting, but…
    • Models of language acquisition

SemEval-2015 Task 1: Paraphrase and Semantic Similarity in Twitter (PIT)

http://alt.qcri.org/semeval2015/cdrom/pdf/SemEval001.pdf

MITRE: Seven Systems for Semantic Similarity in Tweets

http://alt.qcri.org/semeval2015/cdrom/pdf/SemEval002.pdf

  1. 352 features combined with logistic regression
    • a lot of metrics, machine translation, biological metrics, etc!
    • word and phrase embeddings (unless you're under a rock you must have heard about word and phrase embeddings!)
    • Tweet alignments with embeddings (each atom aligned to a corresponding atom on the other side)
      • they compute cosine similarity between vectors to score a candidate aligned pair
      • Linear programming
      • Recurrent neural networks (RNN) gave the best score of all their systems
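The cosine-scoring step can be sketched in a few lines (a toy illustration; `emb`, `score_alignments` and the vectors are my own, not MITRE's implementation — their system then feeds such scores into a linear program to pick the alignment):

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0 or nv == 0:
        return 0.0
    return dot / (nu * nv)

def score_alignments(tweet1, tweet2, emb):
    """Score every candidate aligned pair (w1, w2) by embedding
    cosine; a linear program would then pick the best 1:1 alignment."""
    return {
        (w1, w2): cosine(emb[w1], emb[w2])
        for w1 in tweet1 for w2 in tweet2
        if w1 in emb and w2 in emb
    }
```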

Poster session

ExB Themis: Extensive Feature Extraction from Word Alignments for Semantic Textual Similarity

http://alt.qcri.org/semeval2015/cdrom/pdf/SemEval046.pdf

  1. Alignment, word embeddings, SVR
  2. Preprocessing
    • Tokenization, case correction, unsupervised POS tagging, lemmatization, detection of dataset-specific stop words, identification of measurement & temporal expressions, state-of-the-art NER (Hänig et al. 2014, winner of GermEval-2014)
  3. Non-alignment features
    • Character n-grams, pathlen similarity, number overlaps, word n-gram similarity, sentence length, average word length
  4. Alignment features
    • Direction-dependent m:n alignments of types EQUI, OPPO, SPE, SIMI, REL, NOALI
    • Align in strict order: NEs, normalized temporal expressions, normalized measurements, arbitrary token n-grams (1-5), negations, remaining content words
    • Proportion features for EQUI, OPPO, SPE, REL
    • Binned frequency features for OPPO, SPE, REL, NOALI
    • Han et al. 2013 align-and-penalize features (“good alignment vs bad alignment”)
  5. A robust system across all the corpora: 2nd in English, 1st in Spanish with a huge gap over the next-best system
  6. SVR using 40 alignment features and 51 non-alignment features
  7. Questions:
    • Eneko asks if there were ablation tests to estimate which features mattered most: they didn't run any.
    • “If two sentences have similar NE they should be aligned first”
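My reading of the proportion features, as a sketch (the data layout and function below are hypothetical, not ExB's code): for each alignment type, the fraction of tokens in the sentence pair covered by alignments of that type.

```python
def proportion_features(alignments, len1, len2,
                        types=("EQUI", "OPPO", "SPE", "REL")):
    """alignments: list of (type, tokens_in_s1, tokens_in_s2) triples
    (m:n alignments, so both sides are token lists).
    Returns, per type, the fraction of all tokens in the pair that
    are covered by alignments of that type."""
    total = len1 + len2
    feats = {}
    for t in types:
        covered = sum(len(a) + len(b)
                      for ty, a, b in alignments if ty == t)
        feats[t] = covered / total if total else 0.0
    return feats
```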

SemEval-2015 Task 3: Answer Selection in Community Question Answering

http://alt.qcri.org/semeval2015/cdrom/pdf/SemEval047.pdf

VectorSLU: A Continuous Word Vector Approach to Answer Selection in Community Question Answering Systems

http://alt.qcri.org/semeval2015/cdrom/pdf/SemEval048.pdf

SemEval-2015 Task 13: Multilingual All-Words Sense Disambiguation and Entity Linking

http://alt.qcri.org/semeval2015/cdrom/pdf/SemEval049.pdf

  1. BabelNet; tokenized, POS-tagged documents in four languages
  2. Example: the concept of medicine as a drug (with varying specificity according to the source: Wikipedia, WordNet, etc.)
  3. Very interesting dataset in English, Spanish and Italian with lots of ambiguous terms
  4. Resources used by the participants:
    • DBpedia Spotlight, Wikipedia Miner, evolutionary game theory using a non-cooperative multiplayer game setting, Tagme, EL services, BabelNet
    • optimizing multiple objective functions, document monosemy plus personalized PageRank
  5. The winning approach: content words tagged by exploiting their translations in other languages
    • The winning approach comes from the French lab LIMSI, and in particular from the charming Marianna Apidianaki

LIMSI: Translations as Source of Indirect Supervision for Multilingual All-Words Sense Disambiguation and Entity Linking

http://alt.qcri.org/semeval2015/cdrom/pdf/SemEval050.pdf

  1. The LIMSI system exploits the parallelism of the multilingual test data
  2. assumption of sense correspondence between a word and its translation in context (Diab and Resnik, 2002)
  3. sentence- and word (lemma)-level alignments (Hunalign, GIZA++)
  4. keep the Spanish translation for English words, the English translation for Spanish and Italian words
  5. sense selection for a word w in context
    • the synsets of w in BabelNet are found
    • the set is filtered to keep only synsets that contain both w and its aligned translation t in this context
    • if more than one sense remains, synsets are ranked using the default sense comparator in the BabelNet API and the highest-ranked synset is kept
  6. BFS also retrieves wrong senses…
  7. The LIMSI system needs no training; it relies only on alignment and sense ranking
  8. weaker performance for Spanish and Italian due to the problematic sense ranking in these languages (performed by BabelNet)
  9. when multiple senses are retained after filtering by alignment, BFS is needed (BFS = BabelNet First Sense)
  10. alignment-based filtering remains beneficial, as the translation might occur in only one synset
  11. BFS predictions are often wrong, especially in Spanish and Italian
  12. Perspectives: experiment with alignments provided by MT systems; train a WSD system on data annotated by the alignment-based method
    • check out the METEOR-WSD and RATATOUILLE metrics from the WMT shared task
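The sense-selection procedure can be sketched as follows (the data structures are my own stand-in for the BabelNet API, not LIMSI's code; I assume the synsets arrive pre-sorted by BabelNet's default sense comparator, so the first one is the BabelNet First Sense):

```python
def select_sense(word, translation, synsets):
    """Alignment-based sense selection, as I understood the talk.

    synsets: list of frozensets of multilingual lexicalizations for
    `word`, already in BabelNet's default sense order (synsets[0] is
    the BabelNet First Sense, BFS)."""
    # keep only synsets containing both the word and its aligned translation
    filtered = [s for s in synsets if word in s and translation in s]
    if filtered:
        return filtered[0]  # highest-ranked synset that survives filtering
    return synsets[0]       # nothing survives: back off to BFS
```

Toy example: for the English word "plant" aligned to Spanish "fábrica", only the factory synset survives the filter, so no sense ranking is needed; with an uninformative translation the system backs off to BFS.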

SemEval-2015 Task 14: Analysis of Clinical Text

http://alt.qcri.org/semeval2015/cdrom/pdf/SemEval051.pdf

UTH-CCB: The Participation of the SemEval 2015 Challenge – Task 14

http://alt.qcri.org/semeval2015/cdrom/pdf/SemEval052.pdf

  1. Disorder Entity recognition
    • Vector-space-model based; word embeddings (MIMIC II corpus), CRF, SSVM and MetaMap
  2. Disorder slot filling
    • SVM, n-gram features, lexicon features, dependency relation features

SemEval-2015 Task 15: A CPA dictionary-entry-building task

http://alt.qcri.org/semeval2015/cdrom/pdf/SemEval053.pdf

  1. CPA: Corpus pattern analysis.
    • corpus-driven technique for mapping meaning onto words in text
    • tools and resources to identify and represent unambiguously the main semantic patterns in which words are used
    • Sense Discriminative Patterns
  2. CPA Parsing, CPA Clustering, CPA lexicography
    • Input: plain text with the target verb highlighted; output: specific to each subtask, similar to the PDEV (Pattern Dictionary of English Verbs) entries
  3. Corpus
    • MICROCHECK (29 verbs, 378 patterns, 4529 annotated sentences)
    • WINGSPREAD (93 verbs, 856 patterns, 12440 annotated sentences: ~10K for learning, ~400 for testing)
  4. ACL-2015 tutorial: Patterns for semantic processing

BLCUNLP: Corpus Pattern Analysis for Verbs Based on Dependency Chain

http://alt.qcri.org/semeval2015/cdrom/pdf/SemEval054.pdf

SemEval-2015 Task 9: CLIPEval Implicit Polarity of Events

http://alt.qcri.org/semeval2015/cdrom/pdf/SemEval077.pdf

SemEval-2015 Task 10: Sentiment Analysis in Twitter

http://alt.qcri.org/semeval2015/cdrom/pdf/SemEval078.pdf

  1. Subtasks:
    • Phrase level sentiment
    • Message-level sentiment
    • Topic level sentiment

[…]

unknown

  1. input: a list of terms; output: the same list of terms with a polarity score.
  2. MaxDiff method of annotation: which term is the most positive and which is the least positive?
  3. rotten is less positive than …; #happiness is more positive than …
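A common way to turn MaxDiff judgments into per-term scores is counting: the fraction of trials in which a term was picked as most positive minus the fraction in which it was picked as least positive. A sketch (my own toy scorer under that assumption, not the task's official one):

```python
from collections import Counter

def maxdiff_scores(trials):
    """trials: list of (terms_shown, most_positive, least_positive).
    Returns term -> score in [-1, 1]:
    P(picked as most positive) - P(picked as least positive)."""
    shown, most, least = Counter(), Counter(), Counter()
    for terms, best, worst in trials:
        shown.update(terms)
        most[best] += 1
        least[worst] += 1
    return {t: (most[t] - least[t]) / shown[t] for t in shown}
```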

UNITN: Training Deep Convolutional Neural Network for Twitter Sentiment Classification

http://alt.qcri.org/semeval2015/cdrom/pdf/SemEval079.pdf

  1. Very interesting and didactic paper on applying deep learning to NLP
  2. The key to success is the initialization of the NN
  3. Deep Learning models in NLP
    • model words as vectors
    • learn compositional rules to represent sentences
  4. ConvNet architecture
    • sentence matrix, word embeddings, phrase indicator features, convolutional feature map, pooled representation, softmax
  5. For the message classification task you need to add more features for each word
  6. Models for twitter sentiment analysis
    • SVM with various n-gram, char-gram, lexicon features
    • State of the art model (NRC) in Semeval 13 and 14
  7. Deep learning models have shown excellent results on many NLP sentence classification tasks but have so far failed to beat carefully engineered methods
  8. Three-step pre-training process
    • Pre-train word vectors using an unsupervised language model (word2vec) on 50M tweets
  9. Train the network on a large distantly supervised corpus of 10M tweets
  10. Fine-tune the network on the supervised dataset (about 10K tweets)
  11. Major novelty: initializing the network with pre-trained weights
  12. ConvNet params
    • wide convolution, max-pooling, filter width 5
    • word embeddings dimensionality 100
    • number of feature maps 300
  13. Importance of pre-training: three different experiments: random, unsupervised and distant initialization
  14. careful weight initialization is what matters
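The wide-convolution plus max-pooling core of such an architecture can be sketched with toy dimensions (pure Python, random weights; this is my own illustration — UNITN's real model adds phrase indicator features and a softmax on top, and of course trains the filters):

```python
import random

def conv_max_pool(sentence, filters, width=5):
    """sentence: list of word-embedding vectors (n words, dim d).
    filters: list of weight vectors, each of length width * d.
    Wide convolution: zero-pad so every filter position overlapping
    the sentence is used, then max-pool each feature map to a scalar."""
    d = len(sentence[0])
    pad = [[0.0] * d] * (width - 1)
    padded = pad + sentence + pad
    pooled = []
    for w in filters:
        feature_map = []
        for i in range(len(padded) - width + 1):
            # flatten the window of `width` word vectors and dot with the filter
            window = [x for vec in padded[i:i + width] for x in vec]
            feature_map.append(sum(a * b for a, b in zip(w, window)))
        pooled.append(max(feature_map))  # max over all positions
    return pooled  # one value per feature map

# toy usage: 3-word sentence, dim-4 embeddings, 6 feature maps
random.seed(0)
sent = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
filts = [[random.uniform(-1, 1) for _ in range(5 * 4)] for _ in range(6)]
rep = conv_max_pool(sent, filts)  # pooled representation, length 6
```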

SemEval-2015 Task 11: Sentiment Analysis of Figurative Language in Twitter

http://alt.qcri.org/semeval2015/cdrom/pdf/SemEval080.pdf

CLaC-SentiPipe: SemEval2015 Subtasks 10 B,E, and Task 11

http://alt.qcri.org/semeval2015/cdrom/pdf/SemEval081.pdf

SemEval-2015 Task 12: Aspect Based Sentiment Analysis

http://alt.qcri.org/semeval2015/cdrom/pdf/SemEval082.pdf

  1. restaurants and laptops
  2. intended to capture contradictory polarity sentences

NLANGP: Supervised Machine Learning System for Aspect Category Classification and Opinion Target Extraction

http://alt.qcri.org/semeval2015/cdrom/pdf/SemEval083.pdf

Bloody looooong (but very interesting) research day! #ContradictoryPolarity

Friday, June 5, 2015

SemEval-2015 Task 4: TimeLine: Cross-Document Event Ordering (or Newsreader project)

http://alt.qcri.org/semeval2015/cdrom/pdf/SemEval132.pdf

SPINOZA_VU: An NLP Pipeline for Cross Document TimeLines

http://alt.qcri.org/semeval2015/cdrom/pdf/SemEval133.pdf

SemEval-2015 Task 5: QA TempEval - Evaluating Temporal Information Understanding with Question Answering

http://alt.qcri.org/semeval2015/cdrom/pdf/SemEval134.pdf

HLT-FBK: a Complete Temporal Processing System for QA TempEval

http://alt.qcri.org/semeval2015/cdrom/pdf/SemEval135.pdf

SemEval-2015 Task 6: Clinical TempEval

http://alt.qcri.org/semeval2015/cdrom/pdf/SemEval136.pdf

BluLab: Temporal Information Extraction for the 2015 Clinical TempEval Challenge

http://alt.qcri.org/semeval2015/cdrom/pdf/SemEval137.pdf

SemEval 2015, Task 7: Diachronic Text Evaluation

http://alt.qcri.org/semeval2015/cdrom/pdf/SemEval147.pdf

UCD : Diachronic Text Classification with Character, Word, and Syntactic N-grams

http://alt.qcri.org/semeval2015/cdrom/pdf/SemEval148.pdf

SemEval-2015 Task 8: SpaceEval

http://alt.qcri.org/semeval2015/cdrom/pdf/SemEval149.pdf

SpRL-CWW: Spatial Relation Classification with Independent Multi-class Models

http://alt.qcri.org/semeval2015/cdrom/pdf/SemEval150.pdf

SemEval-2015 Task 17: Taxonomy Extraction Evaluation (TExEval)

http://alt.qcri.org/semeval2015/cdrom/pdf/SemEval151.pdf

INRIASAC: Simple Hypernym Extraction Methods

http://alt.qcri.org/semeval2015/cdrom/pdf/SemEval152.pdf

SemEval 2015 Task 18: Broad-Coverage Semantic Dependency Parsing

http://alt.qcri.org/semeval2015/cdrom/pdf/SemEval153.pdf

SemEval-2016 Task Announcements and closing session

STS

Interpretable STS

Community QA

Sentiment Analysis in Twitter

Aspect-based sentiment analysis (ABSA)

Detecting stance in tweets

Sentiment intensity of English and Arabic

Meaning representation parsing

Chinese semantic dependency parsing

RAS

Semantic Analysis Track

Complex Word Identification

Clinical TempEval

Taxonomy extraction

Semantic taxonomy enrichment

Closing session

Ideas

  1. Use Sultan to align @menosdias tweets
  2. While drinking coffee with the charming Houda Bouamor (from Carnegie Mellon Qatar) we had the idea of a multilingual summarizer (French-LIPN, Spanish-IIMAS, Arabic-Qatar) based on alignment and moderate generation (LORIA). She said she could get some funding for a one-year project from Qatar.
  3. Suddenly I think @menosdías might make for a SemEval task
  4. the more I think about it, the more I tell myself: let's do chunking and alignment with @menosdías using pure semantic similarity and the paper writes itself (plus entity extraction, at least from the tweets)
  5. thinking about AGESS (Automatic Generation of State of the Arts)… I think we should set a PhD student to work on time (and have them participate in the temporal SemEval tasks)
  6. Using coreference significantly reduces the number of unknown relations
  7. SpaceEval might be useful for the GolemGenFred project
  8. Orientation Links (OLink) describe non-topological relationships between spatial elements (the chair is in front of the couch)
  9. Data sources:
    • Degree Confluence Project (travelers web log) DCP
    • CLEF
  10. SE (spatial entities), SS (spatial signals), MI (motion signals)
  11. Three configurations: unannotated text, manually annotated spatial elements, manually annotated spatial elements with attributes
  12. word2vec… and word embeddings
  13. Can Selectional Preferences Help Automatic Semantic Role Labeling? Shumin Wu and Martha Palmer (interesting poster, semantic role labeling with LDA)