Differences

Below are the differences between two revisions of the page.

--- equipes:rcln:cluster_tal:fred [2019/04/03 12:14]
rosse [Boxer and statistical models]
+++ equipes:rcln:cluster_tal:fred [2020/09/23 14:36] (Current version)
@@ Line 364: / Line 364: @@
 </code>
 ==== Stanford Core NLP v.3.4.1 ====
-FRED works only with Core NLP is 3.4.1, so we should go to [https://stanfordnlp.github.io/CoreNLP/history.html Stanford Core NLP release history page] in order to downloado this specific version.
+FRED works only with Core NLP 3.4.1, so we go to the [[https://stanfordnlp.github.io/CoreNLP/history.html | Stanford Core NLP release history page]] to download this specific version.
+<code>
  $ cd /opt/FRED/externals/tgz
  $ wget http://nlp.stanford.edu/software/stanford-corenlp-full-2014-08-27.zip
  $ cd ..
  $ unzip tgz/stanford-corenlp-full-2014-08-27.zip
+</code>
-Now we follow the "[https://stanfordnlp.github.io/CoreNLP/cmdline.html Using Stanford CoreNLP from the command line]" documentation page. So we go to Core NLP root directory and run...
+Now we follow the "[[https://stanfordnlp.github.io/CoreNLP/cmdline.html | Using Stanford CoreNLP from the command line]]" documentation page. So we go to the Core NLP root directory and run:
+<code>
  $ java -cp "*" -Xmx2g edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,lemma,ner,parse,dcoref -file input.txt
@@ Line 415: / Line 416: @@
  Pipeline setup: 0,0 sec.
  Total time for StanfordCoreNLP pipeline: 1,6 sec.
+</code>
 According to the documentation, this command processes a file called ''input.txt'' and produces an ''input.txt.xml'' file with POS, named entity and lemma annotations. There is some configuration left to do (classpath, properties file), but we will wait until we know exactly how FRED uses Core NLP before configuring further.
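To inspect the annotations, the XML output can be read with Python's standard library. This is a minimal sketch, assuming the output nests ''token'' elements with ''word'', ''lemma'', ''POS'' and ''NER'' children under each ''sentence''; the sample document and the ''read_tokens'' helper are illustrative and not part of FRED or Core NLP.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample that mimics the assumed layout of input.txt.xml.
SAMPLE_XML = """
<root>
  <document>
    <sentences>
      <sentence id="1">
        <tokens>
          <token id="1">
            <word>Stanford</word>
            <lemma>Stanford</lemma>
            <POS>NNP</POS>
            <NER>ORGANIZATION</NER>
          </token>
          <token id="2">
            <word>rocks</word>
            <lemma>rock</lemma>
            <POS>VBZ</POS>
            <NER>O</NER>
          </token>
        </tokens>
      </sentence>
    </sentences>
  </document>
</root>
"""


def read_tokens(root):
    """Return (word, lemma, POS, NER) tuples for every token under a sentence."""
    rows = []
    for token in root.findall(".//sentence/tokens/token"):
        rows.append((
            token.findtext("word"),
            token.findtext("lemma"),
            token.findtext("POS"),
            token.findtext("NER"),
        ))
    return rows


if __name__ == "__main__":
    # For a real run, replace this with: ET.parse("input.txt.xml").getroot()
    for row in read_tokens(ET.fromstring(SAMPLE_XML)):
        print(row)
```

If the element names differ in the file actually produced, only the path passed to ''findall'' and the four ''findtext'' names need adjusting.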
 ==== Python interface to Stanford Core NLP tools v3.4.1 ====
-So we go back to the /opt/FRED/externals directory and clone [https://github.com/dasmith/stanford-corenlp-python.git Stanford Core NLP Python wrapper]
+So we go back to the /opt/FRED/externals directory and clone the [[https://github.com/dasmith/stanford-corenlp-python.git | Stanford Core NLP Python wrapper]]:
+<code>
  $ cd /opt/FRED/externals
  $ git clone https://github.com/dasmith/stanford-corenlp-python.git
+</code>
 We check the Python version and install pip and the wrapper's dependencies:
+<code>
  $ python --version
  Python 2.7.6
  $ sudo apt-get install python-pip
  $ sudo pip install pexpect unidecode
+</code>
-The we follow [https://github.com/dasmith/stanford-corenlp-python/blob/master/README.md the python wrapper documentation], which specifies that Stanford Core NLP must be a child directory of the python wrapper, so we move our Core NLP directory inside the wrapper's directory:
+Then we follow [[https://github.com/dasmith/stanford-corenlp-python/blob/master/README.md | the python wrapper documentation]], which specifies that Stanford Core NLP must be a child directory of the python wrapper, so we move our Core NLP directory inside the wrapper's directory:
+<code>
  $ pwd
  /opt/FRED/externals
@@ Line 436: / Line 443: @@
  $ mv stanford-corenlp-full-2014-08-27/ stanford-corenlp-python/
  $ ln -s stanford-corenlp-python/stanford-corenlp-full-2014-08-27/ stanford-corenlp
+</code>
 Then we launch the wrapper's server
+<code>
  $ python corenlp.py
  Loading Models: 5/5
  INFO:__main__:Serving on http://127.0.0.1:8080
+</code>
 There's a client.py program for testing the wrapper:
+<code>
  $ python client.py
  {u'sentences': [{u'parsetree': u'(ROOT (S (VP (NP (INTJ (UH Hello)) (NP (NN world)))) (. !)))',
@@ Line 460: / Line 471: @@
       from nltk.tree import Tree
  ImportError: No module named nltk.tree
+</code>
-So we must install [http://www.nltk.org/install.html NLTK] because it looks like a dependecy for the wrapper:
+So we must install [[http://www.nltk.org/install.html | NLTK]] because it looks like a dependency of the wrapper:
+<code>
  $ sudo pip install -U nltk
  $ python
  >>> import nltk
  $ sudo python -m nltk.downloader -d /usr/local/share/nltk_data all
+</code>
 We test again
+<code>
  $ python client.py
  Traceback (most recent call last):
    File "client.py", line 18, in <module>
       tree = Tree.parse(result['sentences'][0]['parsetree'])
+</code>
 We still have an error, but it doesn't look bad, so we're going to ignore it and move on.
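The traceback above is cut off before the error message, so the cause is not confirmed here. One possibility, assuming ''pip install -U nltk'' installed NLTK 3.x, is that ''Tree.parse'' is no longer available and ''Tree.fromstring'' has to be used instead. The sketch below is a hypothetical helper (''parse_tree'' is not part of the wrapper) that calls whichever of the two constructors the given class provides.

```python
def parse_tree(tree_cls, text):
    """Parse a bracketed tree string with whichever constructor tree_cls offers.

    Prefers `fromstring` (assumed to be the NLTK 3.x name) and falls back
    to `parse` (the name used by client.py).
    """
    parser = getattr(tree_cls, "fromstring", None) or getattr(tree_cls, "parse", None)
    if parser is None:
        raise AttributeError("tree class has neither fromstring nor parse")
    return parser(text)


# Possible use inside client.py (requires NLTK, so it is left commented out):
# from nltk.tree import Tree
# tree = parse_tree(Tree, result['sentences'][0]['parsetree'])
```

Passing the class as an argument keeps the helper importable even on a machine where NLTK is missing.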
 ==== Babelfly ====
-I can't find any reference to entity disambiguation with Babelfly in FRED code, so I wont proceed to the installation from the [http://babelfy.org/download Babelfly download page]. Maybe it's a TODO to replace the Tagme calls (which are still inside FRED code) for Babelfly calls.
+I can't find any reference to entity disambiguation with Babelfly in the FRED code, so I won't proceed with the installation from the [[http://babelfy.org/download | Babelfly download page]]. Replacing the Tagme calls (which are still in the FRED code) with Babelfly calls may still be a TODO.
+<code>
  $ find . -name "*.py" -exec grep -Hn agme {} \;
  ./fred-corenlp/server-fred-paris.py:139:        tagmeEntities = {}
@@ Line 494: / Line 511: @@
  $ find . -name "*.py" -exec grep -Hn abelfly {} \;
  [ ]
+</code>
 ===== Configuration =====
 First we go to the fred-corenlp directory
+<code>
  $ cd /opt/FRED/fred-corenlp
+</code>
-There, we will edit the ''config.py'' file to add ''candc'' path in line 5
+There, we edit the ''config.py'' file to set the //candc// path on line 5
- CANDC_BIN_PATH = '/opt/FRED/BoxerServer/candc'
+<code> CANDC_BIN_PATH = '/opt/FRED/BoxerServer/candc' </code>
-...and line 159 with the right ''nltk_data'' path
+...and line 159 with the right //nltk_data// path
- NLTK_PATH = '/usr/local/share/nltk_data'
+<code> NLTK_PATH = '/usr/local/share/nltk_data' </code>
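Before moving on, it can help to confirm that the configured paths actually exist. The following is a small sketch, not part of FRED: ''missing_paths'' is a hypothetical helper, and the two settings are copied by hand from the values above rather than imported from ''config.py''.

```python
import os


def missing_paths(settings):
    """Return the sorted names of settings whose path does not exist on disk."""
    return sorted(name for name, path in settings.items() if not os.path.exists(path))


if __name__ == "__main__":
    # Values copied from the configuration steps above.
    settings = {
        "CANDC_BIN_PATH": "/opt/FRED/BoxerServer/candc",
        "NLTK_PATH": "/usr/local/share/nltk_data",
    }
    for name in missing_paths(settings):
        print("Missing path for", name)
```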
 Then we go back to FRED root to edit Boxer's files
+<code>
   $ cd ..
   $ emacs -nw localboxerclient localboxerserver
+</code>
-In both files we set ''candc'' root:
+In both files we set //candc// root:
+<code>
  PREFIX=/opt/FRED/BoxerServer/candc
+</code>
-''localboxerserver'' should look like:
+//localboxerserver// should look like:
+<code>
  #!/bin/bash
  PREFIX=/opt/FRED/BoxerServer/candc
  $PREFIX/bin/soap_server --server localhost:9000 --models $PREFIX/models/boxer --candc-printer boxer --candc-int-betas "0 0 0 0 0"
+</code>
 ===== Testing =====
 We first go to FRED root
+<code>
  $ cd /opt/FRED
+</code>
 And launch the boxer server
- $ sh launchboxerserver
+<code> $ sh launchboxerserver </code>
 We get a permission error, so we add the execute permission to both binaries
+<code>
  $ sudo chmod a+x BoxerServer/candc/bin/soap_client
  $ sudo chmod a+x BoxerServer/candc/bin/soap_server
+</code>
 And we get more errors:
+<code>
  $ sh launchboxerserver
  /opt/FRED/BoxerServer/candc/bin/soap_server: 1: /opt/FRED/BoxerServer/candc/bin/soap_server: Syntax error: "(" unexpected
@@ Line 537: / Line 566: @@
  /opt/FRED/BoxerServer/candc/bin/soap_client: 1: /opt/FRED/BoxerServer/candc/bin/soap_client: Syntax error: word unexpected (expecting ")")
  ERROR: file /tmp/boxer.ccg does not exist
+</code>
-''TODO'': recompile soap clients and server paying attention to parenthesis...
+**TODO**: recompile the soap client and server, paying attention to the parentheses...
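A ''Syntax error: "(" unexpected'' message of this kind is typical of a shell trying to interpret a compiled binary as a script, which tends to happen when the system cannot execute the file directly (for instance a binary built for another platform or architecture). That is an assumption about the cause, not something verified on this machine. The sketch below, written for Python 3, reads the first bytes of a file to report its format before recompiling; ''describe_binary'' is a hypothetical helper.

```python
# Magic numbers found at the start of common executable formats.
ELF_MAGIC = b"\x7fELF"
MACHO_MAGICS = {
    b"\xfe\xed\xfa\xce", b"\xce\xfa\xed\xfe",  # Mach-O 32-bit (both byte orders)
    b"\xfe\xed\xfa\xcf", b"\xcf\xfa\xed\xfe",  # Mach-O 64-bit (both byte orders)
    b"\xca\xfe\xba\xbe",  # Mach-O universal binary (also used by Java class files)
}


def describe_binary(path):
    """Return a short label for the file format, based on its leading bytes."""
    with open(path, "rb") as handle:
        header = handle.read(5)
    if header[:4] == ELF_MAGIC:
        # Fifth byte (EI_CLASS): 1 means 32-bit, 2 means 64-bit.
        ei_class = header[4] if len(header) > 4 else 0
        return "ELF " + {1: "32-bit", 2: "64-bit"}.get(ei_class, "unknown class")
    if header[:4] in MACHO_MAGICS:
        return "Mach-O"
    if header[:2] == b"#!":
        return "script"
    return "unknown"


# Possible use on the FRED host (left commented out because the path is host-specific):
# print(describe_binary("/opt/FRED/BoxerServer/candc/bin/soap_server"))
```

A label that does not match the host (for example ''Mach-O'' on a Linux machine) would suggest the binaries need to be rebuilt for this system.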