An information extraction method for assisting a citizen count of violent events in Mexico

Un método de extracción de información para cuantificar eventos violentos en México con organizaciones civiles

Goal

To support the citizen counters of the @menosdias project with an information extraction method. Since 2010, these counters have recorded more than 50,000 victims of violence in Mexico. Counters are volunteers who each read the Mexican online press for one week and register violent events on the @menosdias blog and Twitter account. The goal of the project is to extract violent events from online sources and propose violent event candidates to the counter. The main outputs would be a blog post, a tweet and a record in a violent events database.
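The three outputs named above (blog post, tweet, database record) could all be derived from one candidate structure. The following is only a minimal sketch: the class `EventCandidate`, its fields and the tweet layout are hypothetical and are not taken from the @menosdias project.

```python
from dataclasses import dataclass, asdict
from datetime import date


@dataclass
class EventCandidate:
    """Hypothetical record for one violent event proposed to a counter."""
    event_date: date
    place: str
    victims: int
    source_url: str
    validated: bool = False  # set to True once a human counter accepts it

    def to_tweet(self, limit: int = 280) -> str:
        """Render a short text for the Twitter account (layout is assumed)."""
        text = (f"{self.event_date.isoformat()} {self.place}: "
                f"{self.victims} victim(s). {self.source_url}")
        return text[:limit]

    def to_record(self) -> dict:
        """Return a plain dict suitable for storing in an events database."""
        record = asdict(self)
        record["event_date"] = self.event_date.isoformat()
        return record
```

Keeping the `validated` flag on the candidate itself would let the same record travel from automatic extraction to human validation without a second data structure.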

Methodology

  1. Parallel corpus construction. One corpus would be built from all the blog posts since 2015, the other from the tweets.
  2. Corpus alignment by means of semantic similarity between blog posts and tweets.
  3. Named entity annotation of places, person names and dates on the parallel corpus.
  4. POS annotation and syntactic parsing.
  5. Semantic parsing and violent event extraction.
  6. Validation of violent event candidates by the human counter.
  7. Creation of training and testing data sets.
  8. First evaluation on the testing data sets.
  9. Second evaluation on @menosdias blank weeks (weeks for which no human volunteer was found to count).
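Step 2 (corpus alignment) can be illustrated with a small sketch. It assumes a plain bag-of-words cosine similarity as a stand-in for the semantic similarity measure the project would actually use; the function names and the `threshold` value are hypothetical.

```python
import math
import re
from collections import Counter


def bow(text):
    """Lowercased bag-of-words vector of a text."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def align(posts, tweets, threshold=0.2):
    """Pair each tweet with its most similar blog post.

    Returns (post_index, tweet_index, score) triples; tweets whose best
    score falls below the (assumed) threshold are left unaligned.
    """
    post_vecs = [bow(p) for p in posts]
    pairs = []
    for j, tweet in enumerate(tweets):
        tweet_vec = bow(tweet)
        scores = [cosine(pv, tweet_vec) for pv in post_vecs]
        if not scores:
            continue
        best = max(range(len(scores)), key=scores.__getitem__)
        if scores[best] >= threshold:
            pairs.append((best, j, scores[best]))
    return pairs
```

A lexical measure like this only captures word overlap; a true semantic similarity measure would replace `cosine` while the pairing loop stays the same.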

Plan

First Iteration (EMNLP)

  1. Extract blog posts and tweets
    • Assigned to: Iván and Jorge
    • Due date: March 27
  2. Corpus alignment
  3. Calculate SOPA semantic similarity between blog posts in chronological order
  4. Align blog posts and tweets
    • Assigned to: Iván, Jorge, Davide
    • Due date: April 17
  5. Alignment evaluation
    • Assigned to: Iván and Jorge
  6. EMNLP paper writing
    • Deadline
      • (long papers): May 30
      • (short papers): June 15
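The alignment evaluation step in this iteration could be scored against a manually aligned sample. The sketch below assumes that alignments are represented as (post index, tweet index) pairs and that precision, recall and F1 are the chosen metrics; both are assumptions, not decisions stated in this plan.

```python
def evaluate_alignment(predicted, gold):
    """Precision, recall and F1 of predicted pairs against gold pairs."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: 3 predicted pairs, 4 gold pairs, 2 in common.
predicted_pairs = [(0, 0), (1, 1), (2, 3)]
gold_pairs = [(0, 0), (1, 1), (2, 2), (3, 3)]
print(evaluate_alignment(predicted_pairs, gold_pairs))
```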

Second Iteration

  1. Syntactic, semantic parsing and information extraction…
  2. Web application for @menosdias counters

Team

IIMAS - UNAM

http://golem.iimas.unam.mx/home.php?lang=en&sec=home

LIPN - Université Paris 13

https://lipn.univ-paris13.fr/en/

NAR

http://nuestraaparenterendicion.com/

Resources

References

On @menosdias

  1. Menos Días Aquí: Twitter account and blog

On violent event extraction

  1. The New War Correspondents: The Rise of Civic Media Curation in Urban Warfare, by Andres Monroy-Hernandez, Danah Boyd, Emre Kıcıman, Munmun De Choudhury, and Scott Counts, 23 February 2013.

Milestones