Differences

Below are the differences between two revisions of the page.

--- equipes:rcln:cluster_tal:fred [2019/04/03 10:34]
rosse [C&C]
+++ equipes:rcln:cluster_tal:fred [2020/09/23 14:36] (Current version)
@@ Line 308: / Line 308: @@
 </code>
 ===== Boxer and statistical models =====
-Following [http://web.archive.org/web/20160313031620/http://svn.ask.it.usyd.edu.au/trac/candc/wiki/Installation the documentation (Step 6)], we just go to the ''candc'' directory and make...
+Following [[http://web.archive.org/web/20160313031620/http://svn.ask.it.usyd.edu.au/trac/candc/wiki/Installation | the documentation (Step 6)]], we just go to the //candc// directory and make...
+<code>
  $ cd /opt/FRED/BoxerServer/cand
  $ make bin/boxer
@@ Line 329: / Line 330: @@
  % Autoloader: iteration 2 resolved 21 predicates and loaded 28 files in 0,074 seconds.  Restarting ...
  % Autoloader: loaded 33 files in 3 iterations in 0,205 seconds
+</code>
 Finally, we just check that the statistical models are there:
+<code>
  $ ls models/
  boxer  chunk_quotes  muc  noquotes  pos           pos_questions  questions  super_noquotes   super_quotes
  chunk  config        ner  parser    pos_noquotes  pos_quotes     super      super_questions  verbstem.list
+</code>
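The same check can be scripted. The following is a minimal sketch in Python, assuming that the directory names seen in the listing above are the ones that matter; `missing_models` and the `EXPECTED` list are hypothetical helpers for illustration, not part of C&C.

```python
import os

# Names copied from the `ls models/` listing above. Whether every one of
# them is strictly required by C&C or Boxer is an assumption.
EXPECTED = ["boxer", "chunk", "config", "ner", "parser", "pos", "super"]


def missing_models(models_dir, expected=EXPECTED):
    """Return the expected entries that are absent from models_dir."""
    present = set(os.listdir(models_dir))
    return [name for name in expected if name not in present]
```

Calling `missing_models("models")` from the candc directory would return an empty list when every expected entry is present.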
-And do some testing from [http://web.archive.org/web/20150304100912/http://svn.ask.it.usyd.edu.au/trac/candc/wiki/Examples C&C examples page]
+And do some testing from [[http://web.archive.org/web/20150304100912/http://svn.ask.it.usyd.edu.au/trac/candc/wiki/Examples | C&C examples page]]
+<code>
  $ bin/candc --models models
  # this file was generated by the following command(s):
@@ Line 342: / Line 347: @@
  # this file was generated by the following command(s):
  #   bin/candc --models models
+</code>
 You have to type a sentence (like 'I would like to go'):
+<code>
  I would like to go
 parsed at B=0.075, K=20
@@ Line 352: / Line 359: @@
  (ncsubj go_4 I_0 _)
  <c> I|I|PRP|I-NP|O|NP would|would|MD|I-VP|O|(S[dcl]\NP)/(S[b]\NP) like|like|VB|I-VP|O|(S[b]\NP)/(S[to]\NP) to|to|TO|I-VP|O|(S[to]\NP)/(S[b]\NP) go|go|VB|I-VP|O|S[b]\NP
+</code>
+<code>
 stats 0.693147 25 25 comb 20 13 0 0
+</code>
-===Stanford Core NLP v.3.4.1===
+==== Stanford Core NLP v.3.4.1 ====
-FRED works only with Core NLP is 3.4.1, so we should go to [https://stanfordnlp.github.io/CoreNLP/history.html Stanford Core NLP release history page] in order to downloado this specific version.
+FRED works only with Core NLP 3.4.1, so we go to the [[https://stanfordnlp.github.io/CoreNLP/history.html | Stanford Core NLP release history page]] to download this specific version.
+<code>
  $ cd /opt/FRED/externals/tgz
  $ wget http://nlp.stanford.edu/software/stanford-corenlp-full-2014-08-27.zip
  $ cd ..
  $ unzip tgz/stanford-corenlp-full-2014-08-27.zip
+</code>
-Now we follow the "[https://stanfordnlp.github.io/CoreNLP/cmdline.html Using Stanford CoreNLP from the command line]" documentation page. So we go to Core NLP root directory and run...
+Now we follow the "[[https://stanfordnlp.github.io/CoreNLP/cmdline.html | Using Stanford CoreNLP from the command line]]" documentation page. So we go to Core NLP root directory and run...
+<code>
  $ java -cp "*" -Xmx2g edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,lemma,ner,parse,dcoref -file input.txt
@@ Line 407: / Line 416: @@
  Pipeline setup: 0,0 sec.
  Total time for StanfordCoreNLP pipeline: 1,6 sec.
+</code>
 According to the documentation, this command processes a file called ''input.txt'' and produces an ''input.txt.xml'' file with POS, named entity and lemma annotations. There is some configuration left to do (classpath, properties file), but we will wait until we know exactly how FRED uses Core NLP before configuring further.
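To inspect the annotations in such a file, one option is to read the XML with Python's standard library. The sketch below is illustrative only: the `SAMPLE` string is hand-written to imitate the token layout of the Core NLP XML output (real files carry more elements), and `read_tokens` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hand-written sample imitating the token layout of the XML output.
# This is an assumption for illustration, not a file produced by the run above.
SAMPLE = """<root><document><sentences><sentence id="1"><tokens>
<token id="1"><word>Paris</word><lemma>Paris</lemma><POS>NNP</POS><NER>LOCATION</NER></token>
<token id="2"><word>sleeps</word><lemma>sleep</lemma><POS>VBZ</POS><NER>O</NER></token>
</tokens></sentence></sentences></document></root>"""


def read_tokens(xml_text):
    """Return a (word, lemma, POS, NER) tuple for every token element."""
    root = ET.fromstring(xml_text)
    fields = ("word", "lemma", "POS", "NER")
    return [
        tuple(token.findtext(field, default="") for field in fields)
        for token in root.iter("token")
    ]


for row in read_tokens(SAMPLE):
    print(row)
```

For a real output file, `ET.parse("input.txt.xml").getroot()` would take the place of `ET.fromstring(xml_text)`.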
-===Python interface to Stanford Core NLP tools v3.4.1===
+==== Python interface to Stanford Core NLP tools v3.4.1 ====
-So we go back to the /opt/FRED/externals directory and clone [https://github.com/dasmith/stanford-corenlp-python.git Stanford Core NLP Python wrapper]
+So we go back to the /opt/FRED/externals directory and clone [[https://github.com/dasmith/stanford-corenlp-python.git | Stanford Core NLP Python wrapper]]
+<code>
  $ cd /opt/FRED/externals
  $ git clone https://github.com/dasmith/stanford-corenlp-python.git
+</code>
 We check python version and install pip and the wrapper dependencies:
+<code>
  $ python --version
  Python 2.7.6
  $ sudo apt-get install python-pip
  $ sudo pip install pexpect unidecode
+</code>
-The we follow [https://github.com/dasmith/stanford-corenlp-python/blob/master/README.md the python wrapper documentation], which specifies that Stanford Core NLP must be a child directory of the python wrapper, so we move our Core NLP directory inside the wrapper's directory:
+Then we follow [[https://github.com/dasmith/stanford-corenlp-python/blob/master/README.md | the python wrapper documentation]], which specifies that Stanford Core NLP must be a child directory of the python wrapper, so we move our Core NLP directory inside the wrapper's directory:
+<code>
  $ pwd
  /opt/FRED/externals
@@ Line 428: / Line 443: @@
  $ mv stanford-corenlp-full-2014-08-27/ stanford-corenlp-python/
  $ ln -s stanford-corenlp-python/stanford-corenlp-full-2014-08-27/ stanford-corenlp
+</code>
 Then we launch the wrapper's server
+<code>
  $ python corenlp.py
  Loading Models: 5/5
  INFO:__main__:Serving on http://127.0.0.1:8080
+</code>
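Before running any client, it can help to confirm that something is actually listening on the address printed above. The following is a minimal sketch, written for Python 3 (the page itself runs Python 2.7.6); `is_listening` is a hypothetical helper, and it only tests that the TCP port accepts connections, not that the wrapper answers correctly.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timeout, unreachable host, etc.
        return False


# Address taken from the "Serving on http://127.0.0.1:8080" log line above.
print("wrapper reachable:", is_listening("127.0.0.1", 8080))
```

The same helper could be pointed at `localhost:9000`, the address passed to `soap_server` further down the page.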
 There's a client.py program for testing the wrapper:
+<code>
  $ python client.py
  {u'sentences': [{u'parsetree': u'(ROOT (S (VP (NP (INTJ (UH Hello)) (NP (NN world)))) (. !)))',
@@ Line 452: / Line 471: @@
       from nltk.tree import Tree
  ImportError: No module named nltk.tree
+</code>
-So we must install [http://www.nltk.org/install.html NLTK] because it looks like a dependecy for the wrapper:
+So we must install [[http://www.nltk.org/install.html | NLTK]] because it looks like a dependency of the wrapper:
+<code>
  $ sudo pip install -U nltk
  $ python
  >>> import nltk
  $ sudo python -m nltk.downloader -d /usr/local/share/nltk_data all
+</code>
 We test again
+<code>
  $ python client.py
  Traceback (most recent call last):
    File "client.py", line 18, in <module>
       tree = Tree.parse(result['sentences'][0]['parsetree'])
+</code>
 We still have an error, but it doesn't look bad, so we're going to ignore it and move on.
-===Babelfly===
+==== Babelfly ====
-I can't find any reference to entity disambiguation with Babelfly in FRED code, so I wont proceed to the installation from the [http://babelfy.org/download Babelfly download page]. Maybe it's a TODO to replace the Tagme calls (which are still inside FRED code) for Babelfly calls.
+I can't find any reference to entity disambiguation with Babelfly in the FRED code, so I won't proceed with the installation from the [[http://babelfy.org/download | Babelfly download page]]. Maybe replacing the Tagme calls (which are still inside the FRED code) with Babelfly calls is a TODO.
+<code>
  $ find . -name "*.py" -exec grep -Hn agme {} \;
  ./fred-corenlp/server-fred-paris.py:139:        tagmeEntities = {}
@@ Line 486: / Line 511: @@
  $ find . -name "*.py" -exec grep -Hn abelfly {} \;
  [ ]
+</code>
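Note that the download URL spells the name ''babelfy'', which the pattern ''abelfly'' used above would not match. The sketch below is one way to repeat the search for several spellings at once, ignoring case. It is written for Python 3, and `grep_sources` is a hypothetical helper, not part of FRED.

```python
import os


def grep_sources(root, needles, suffix=".py"):
    """Yield (path, line_number, line) for lines containing any needle, ignoring case."""
    lowered = [needle.lower() for needle in needles]
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffix):
                continue
            path = os.path.join(dirpath, name)
            # errors="replace" keeps the scan going on files with odd encodings.
            with open(path, errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    text = line.lower()
                    if any(needle in text for needle in lowered):
                        yield path, number, line.rstrip("\n")
```

For example, `list(grep_sources(".", ["tagme", "babelfy", "babelfly"]))` run from the FRED root would cover both spellings in a single pass.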
-==Configuration==
+===== Configuration =====
 First we will go to fred-corenlp directory
+<code>
  $ cd /opt/FRED/fred-corenlp
+</code>
-There, we will edit the ''config.py'' file to add ''candc'' path in line 5
+There, we will edit the ''config.py'' file to add //candc// path in line 5
- CANDC_BIN_PATH = '/opt/FRED/BoxerServer/candc'
+<code> CANDC_BIN_PATH = '/opt/FRED/BoxerServer/candc' </code>
-...and line 159 with the right ''nltk_data'' path
+...and line 159 with the right //nltk_data// path
- NLTK_PATH = '/usr/local/share/nltk_data'
+<code> NLTK_PATH = '/usr/local/share/nltk_data' </code>
 Then we go back to FRED root to edit Boxer's files
+<code>
   $ cd ..
   $ emacs -nw localboxerclient localboxerserver
+</code>
-In both files we set ''candc'' root:
+In both files we set //candc// root:
+<code>
  PREFIX=/opt/FRED/BoxerServer/candc
+</code>
-''localboxerserver'' should look like:
+//localboxerserver// should look like:
+<code>
  #!/bin/bash
  PREFIX=/opt/FRED/BoxerServer/candc
  $PREFIX/bin/soap_server --server localhost:9000 --models $PREFIX/models/boxer --candc-printer boxer --candc-int-betas "0 0 0 0 0"
+</code>
-==Testing==
+===== Testing =====
 We first go to FRED root
+<code>
  $ cd /opt/FRED
+</code>
 And launch the boxer server
- $ sh launchboxerserver
+<code> $ sh launchboxerserver </code>
 We get a permission error, so we add the execute permission to both SOAP binaries
+<code>
  $ sudo chmod a+x BoxerServer/candc/bin/soap_client
  $ sudo chmod a+x BoxerServer/candc/bin/soap_server
+</code>
 And we get more errors:
+<code>
  $ sh launchboxerserver
  /opt/FRED/BoxerServer/candc/bin/soap_server: 1: /opt/FRED/BoxerServer/candc/bin/soap_server: Syntax error: "(" unexpected
@@ Line 529: / Line 566: @@
  /opt/FRED/BoxerServer/candc/bin/soap_client: 1: /opt/FRED/BoxerServer/candc/bin/soap_client: Syntax error: word unexpected (expecting ")")
  ERROR: file /tmp/boxer.ccg does not exist
+</code>
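A message such as ''Syntax error: "(" unexpected'' is what ''sh'' typically prints when it ends up interpreting a compiled binary as a shell script, for instance when the binary was built for another architecture. Whether that is the cause here is an assumption, not something the log confirms. The sketch below, written for Python 3, reads a file's first bytes to tell an ELF binary from a script. `describe_executable` is a hypothetical helper and the machine table is deliberately partial.

```python
import struct

# Partial table of ELF e_machine codes, enough for common desktop and server CPUs.
ELF_MACHINES = {3: "x86", 40: "ARM", 62: "x86-64", 183: "AArch64"}


def describe_executable(path):
    """Classify a file as an ELF binary, a script with a shebang, or unknown."""
    with open(path, "rb") as handle:
        header = handle.read(20)
    if len(header) >= 20 and header[:4] == b"\x7fELF":
        bits = {1: "32-bit", 2: "64-bit"}.get(header[4], "unknown class")
        # Byte 5 gives the byte order: 1 is little-endian, 2 is big-endian.
        order = "<" if header[5] == 1 else ">"
        (machine,) = struct.unpack(order + "H", header[18:20])
        name = ELF_MACHINES.get(machine, "machine {}".format(machine))
        return "ELF {} {}".format(bits, name)
    if header[:2] == b"#!":
        return "script"
    return "unknown"
```

Comparing `describe_executable("BoxerServer/candc/bin/soap_server")` with the output of `uname -m` on the host would show whether the binary matches the machine before recompiling.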
-''TODO'': recompile soap clients and server paying attention to parenthesis...
+**TODO**: recompile the SOAP client and server, paying attention to the parentheses errors...