| equipes:rcln:cluster_tal:fred [2019/04/03 12:14] rosse [Boxer and statistical models] | equipes:rcln:cluster_tal:fred [2020/09/23 14:36] (current version) |
| </code> | </code> |
| ==== Stanford Core NLP v.3.4.1 ==== | ==== Stanford Core NLP v.3.4.1 ==== |
| FRED works only with Core NLP 3.4.1, so we go to the [https://stanfordnlp.github.io/CoreNLP/history.html Stanford Core NLP release history page] to download this specific version. | FRED works only with Core NLP 3.4.1, so we go to the [[https://stanfordnlp.github.io/CoreNLP/history.html | Stanford Core NLP release history page]] to download this specific version. |
| | <code> |
| $ cd /opt/FRED/externals/tgz | $ cd /opt/FRED/externals/tgz |
| $ wget http://nlp.stanford.edu/software/stanford-corenlp-full-2014-08-27.zip | $ wget http://nlp.stanford.edu/software/stanford-corenlp-full-2014-08-27.zip |
| $ cd .. | $ cd .. |
| $ unzip tgz/stanford-corenlp-full-2014-08-27.zip | $ unzip tgz/stanford-corenlp-full-2014-08-27.zip |
| | </code> |
| |
| Now we follow the "[https://stanfordnlp.github.io/CoreNLP/cmdline.html Using Stanford CoreNLP from the command line]" documentation page. So we go to the Core NLP root directory and run: | Now we follow the "[[https://stanfordnlp.github.io/CoreNLP/cmdline.html | Using Stanford CoreNLP from the command line]]" documentation page. So we go to the Core NLP root directory and run: |
| | <code> |
| $ java -cp "*" -Xmx2g edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,lemma,ner,parse,dcoref -file input.txt | $ java -cp "*" -Xmx2g edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,lemma,ner,parse,dcoref -file input.txt |
| |
| Pipeline setup: 0,0 sec. | Pipeline setup: 0,0 sec. |
| Total time for StanfordCoreNLP pipeline: 1,6 sec. | Total time for StanfordCoreNLP pipeline: 1,6 sec. |
| | </code> |
| |
| According to the documentation, this command processes a file called ''input.txt'' and produces an ''input.txt.xml'' file with POS, named entity and lemma annotations. There's some configuration to do (classpath, properties file) but we will wait until we know exactly how FRED uses Core NLP before configuring further. | According to the documentation, this command processes a file called ''input.txt'' and produces an ''input.txt.xml'' file with POS, named entity and lemma annotations. There's some configuration to do (classpath, properties file) but we will wait until we know exactly how FRED uses Core NLP before configuring further. |
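As a rough illustration of what can be done with the annotated output, the sketch below reads word, lemma, POS and NER values from an XML string. The sample is hand-written and only imitates the ''root > document > sentences > sentence > tokens > token'' layout that the XML output is assumed to use here; the helper name ''read_tokens'' is hypothetical and is not part of FRED or Core NLP.

```python
import xml.etree.ElementTree as ET

# Hand-written sample that imitates the assumed layout of the XML output.
# It is NOT copied from a real input.txt.xml file.
SAMPLE_XML = """<root><document><sentences>
<sentence id="1"><tokens>
<token id="1"><word>Paris</word><lemma>Paris</lemma><POS>NNP</POS><NER>LOCATION</NER></token>
<token id="2"><word>shines</word><lemma>shine</lemma><POS>VBZ</POS><NER>O</NER></token>
</tokens></sentence>
</sentences></document></root>"""


def read_tokens(root):
    """Return one (word, lemma, POS, NER) tuple per <token> element."""
    fields = ("word", "lemma", "POS", "NER")
    rows = []
    for token in root.iter("token"):
        # findtext returns the default when a child element is missing.
        rows.append(tuple(token.findtext(tag, default="") for tag in fields))
    return rows


if __name__ == "__main__":
    for row in read_tokens(ET.fromstring(SAMPLE_XML)):
        print("\t".join(row))
```

For a file on disk, ''ET.parse("input.txt.xml").getroot()'' would give the root element to pass to the same helper, assuming the file follows the layout above.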
| |
| ==== Python interface to Stanford Core NLP tools v3.4.1 ==== | ==== Python interface to Stanford Core NLP tools v3.4.1 ==== |
| So we go back to the /opt/FRED/externals directory and clone the [https://github.com/dasmith/stanford-corenlp-python.git Stanford Core NLP Python wrapper]: | So we go back to the /opt/FRED/externals directory and clone the [[https://github.com/dasmith/stanford-corenlp-python.git | Stanford Core NLP Python wrapper]]: |
| | <code> |
| $ cd /opt/FRED/externals | $ cd /opt/FRED/externals |
| $ git clone https://github.com/dasmith/stanford-corenlp-python.git | $ git clone https://github.com/dasmith/stanford-corenlp-python.git |
| | </code> |
| |
| We check the Python version and install pip and the wrapper dependencies: | We check the Python version and install pip and the wrapper dependencies: |
| | <code> |
| $ python --version | $ python --version |
| Python 2.7.6 | Python 2.7.6 |
| $ sudo apt-get install python-pip | $ sudo apt-get install python-pip |
| $ sudo pip install pexpect unidecode | $ sudo pip install pexpect unidecode |
| | </code> |
| |
| Then we follow [https://github.com/dasmith/stanford-corenlp-python/blob/master/README.md the python wrapper documentation], which specifies that Stanford Core NLP must be a child directory of the python wrapper, so we move our Core NLP directory inside the wrapper's directory: | Then we follow [[https://github.com/dasmith/stanford-corenlp-python/blob/master/README.md | the python wrapper documentation]], which specifies that Stanford Core NLP must be a child directory of the python wrapper, so we move our Core NLP directory inside the wrapper's directory: |
| | <code> |
| $ pwd | $ pwd |
| /opt/FRED/externals | /opt/FRED/externals |
| $ mv stanford-corenlp-full-2014-08-27/ stanford-corenlp-python/ | $ mv stanford-corenlp-full-2014-08-27/ stanford-corenlp-python/ |
| $ ln -s stanford-corenlp-python/stanford-corenlp-full-2014-08-27/ stanford-corenlp | $ ln -s stanford-corenlp-python/stanford-corenlp-full-2014-08-27/ stanford-corenlp |
| | </code> |
| |
| Then we launch the wrapper's server | Then we launch the wrapper's server |
| | <code> |
| $ python corenlp.py | $ python corenlp.py |
| |
| Loading Models: 5/5 | Loading Models: 5/5 |
| INFO:__main__:Serving on http://127.0.0.1:8080 | INFO:__main__:Serving on http://127.0.0.1:8080 |
| | </code> |
| |
| There's a client.py program for testing the wrapper: | There's a client.py program for testing the wrapper: |
| | <code> |
| $ python client.py | $ python client.py |
| {u'sentences': [{u'parsetree': u'(ROOT (S (VP (NP (INTJ (UH Hello)) (NP (NN world)))) (. !)))', | {u'sentences': [{u'parsetree': u'(ROOT (S (VP (NP (INTJ (UH Hello)) (NP (NN world)))) (. !)))', |
| from nltk.tree import Tree | from nltk.tree import Tree |
| ImportError: No module named nltk.tree | ImportError: No module named nltk.tree |
| | </code> |
| |
| So we must install [http://www.nltk.org/install.html NLTK] because it looks like a dependency of the wrapper: | So we must install [[http://www.nltk.org/install.html | NLTK]] because it looks like a dependency of the wrapper: |
| | <code> |
| $ sudo pip install -U nltk | $ sudo pip install -U nltk |
| $ python | $ python |
| >>> import nltk | >>> import nltk |
| $ sudo python -m nltk.downloader -d /usr/local/share/nltk_data all | $ sudo python -m nltk.downloader -d /usr/local/share/nltk_data all |
| | </code> |
| |
| We test again | We test again |
| | <code> |
| $ python client.py | $ python client.py |
| Traceback (most recent call last): | Traceback (most recent call last): |
| File "client.py", line 18, in <module> | File "client.py", line 18, in <module> |
| tree = Tree.parse(result['sentences'][0]['parsetree']) | tree = Tree.parse(result['sentences'][0]['parsetree']) |
| | </code> |
| |
| We still have an error, but it doesn't look bad, so we're going to ignore it and move on. | We still have an error, but it doesn't look bad, so we're going to ignore it and move on. |
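The traceback above is cut off, so the actual cause is not known. One possibility, stated here only as an assumption, is an NLTK version mismatch: NLTK 3 replaced ''Tree.parse'' with ''Tree.fromstring''. Since the server already returns the parse as a bracketed string, it can be inspected without NLTK at all. The sketch below is a minimal standard-library reader for such strings; ''parse_brackets'' and ''leaves'' are hypothetical helper names, not part of the wrapper.

```python
import re


def _parse(tokens, pos):
    """Parse the node that starts at tokens[pos] == "(".

    Returns ((label, children), next_position).
    """
    label = tokens[pos + 1]
    pos += 2
    children = []
    while tokens[pos] != ")":
        if tokens[pos] == "(":
            child, pos = _parse(tokens, pos)
        else:
            child, pos = tokens[pos], pos + 1
        children.append(child)
    return (label, children), pos + 1  # skip the closing ")"


def parse_brackets(text):
    """Turn a bracketed parse string into nested (label, children) tuples."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    tree, _ = _parse(tokens, 0)
    return tree


def leaves(tree):
    """Collect the words (leaf strings) of a parsed tree, left to right."""
    _label, children = tree
    words = []
    for child in children:
        if isinstance(child, tuple):
            words.extend(leaves(child))
        else:
            words.append(child)
    return words


if __name__ == "__main__":
    # Parse string taken from the client output shown earlier on this page.
    sample = "(ROOT (S (VP (NP (INTJ (UH Hello)) (NP (NN world)))) (. !)))"
    print(leaves(parse_brackets(sample)))
```

This only confirms that the server's answer is well formed; it does not replace the NLTK tree object that ''client.py'' tries to build.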
| |
| ==== Babelfy ==== | ==== Babelfy ==== |
| I can't find any reference to entity disambiguation with Babelfy in FRED code, so I won't proceed to the installation from the [http://babelfy.org/download Babelfy download page]. Maybe it's a TODO to replace the Tagme calls (which are still inside FRED code) with Babelfy calls. | I can't find any reference to entity disambiguation with Babelfy in FRED code, so I won't proceed to the installation from the [[http://babelfy.org/download | Babelfy download page]]. Maybe it's a TODO to replace the Tagme calls (which are still inside FRED code) with Babelfy calls. |
| | <code> |
| $ find . -name "*.py" -exec grep -Hn agme {} \; | $ find . -name "*.py" -exec grep -Hn agme {} \; |
| ./fred-corenlp/server-fred-paris.py:139: tagmeEntities = {} | ./fred-corenlp/server-fred-paris.py:139: tagmeEntities = {} |
| $ find . -name "*.py" -exec grep -Hn abelfly {} \; | $ find . -name "*.py" -exec grep -Hn abelfly {} \; |
| [ ] | [ ] |
| | </code> |
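Note that the pattern ''abelfly'' used above cannot match the spelling "Babelfy" (as in babelfy.org), so the empty result is not conclusive. The sketch below is one possible way to repeat the search case-insensitively for several spellings at once. The function name ''find_mentions'' and the list of needles are illustrative assumptions.

```python
import os


def find_mentions(root, needles):
    """Return (path, line_number, needle) for every match in *.py files.

    The comparison is case-insensitive; needles must be given in lowercase.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(".py"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                for lineno, raw in enumerate(handle, 1):
                    # Decode leniently so odd bytes do not abort the scan.
                    line = raw.decode("utf-8", "replace").lower()
                    for needle in needles:
                        if needle in line:
                            hits.append((path, lineno, needle))
    return hits


if __name__ == "__main__":
    # Scans the current directory; adjust the root to the FRED checkout.
    for path, lineno, needle in find_mentions(".", ["tagme", "babelfy", "babelfly"]):
        print("%s:%d: %s" % (path, lineno, needle))
```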
| ===== Configuration ===== | ===== Configuration ===== |
| First we go to the fred-corenlp directory | First we go to the fred-corenlp directory |
| | <code> |
| $ cd /opt/FRED/fred-corenlp | $ cd /opt/FRED/fred-corenlp |
| | </code> |
| |
| There, we will edit the ''config.py'' file to set the ''candc'' path on line 5 | There, we will edit the ''config.py'' file to set the //candc// path on line 5 |
| CANDC_BIN_PATH = '/opt/FRED/BoxerServer/candc' | <code> CANDC_BIN_PATH = '/opt/FRED/BoxerServer/candc' </code> |
| |
| ...and line 159 with the right ''nltk_data'' path | ...and line 159 with the right //nltk_data// path |
| |
| NLTK_PATH = '/usr/local/share/nltk_data' | <code> NLTK_PATH = '/usr/local/share/nltk_data' </code> |
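Before moving on, it can help to confirm that the configured locations actually exist on disk. The sketch below is a small stand-alone check under that assumption; ''missing_paths'' is a hypothetical helper, and the dictionary simply repeats the two values set in ''config.py'' above rather than importing FRED's configuration.

```python
import os


def missing_paths(settings):
    """Return the names of settings whose path does not exist, sorted."""
    return sorted(name for name, path in settings.items()
                  if not os.path.exists(path))


if __name__ == "__main__":
    # Values copied from the config.py edits described above.
    settings = {
        "CANDC_BIN_PATH": "/opt/FRED/BoxerServer/candc",
        "NLTK_PATH": "/usr/local/share/nltk_data",
    }
    for name in missing_paths(settings):
        print("missing: %s -> %s" % (name, settings[name]))
```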
| |
| Then we go back to FRED root to edit Boxer's files | Then we go back to FRED root to edit Boxer's files |
| | <code> |
| $ cd .. | $ cd .. |
| $ emacs -nw localboxerclient localboxerserver | $ emacs -nw localboxerclient localboxerserver |
| | </code> |
| |
| In both files we set the ''candc'' root: | In both files we set the //candc// root: |
| | <code> |
| PREFIX=/opt/FRED/BoxerServer/candc | PREFIX=/opt/FRED/BoxerServer/candc |
| | </code> |
| |
| ''localboxerserver'' should look like: | //localboxerserver// should look like: |
| | <code> |
| #!/bin/bash | #!/bin/bash |
| PREFIX=/opt/FRED/BoxerServer/candc | PREFIX=/opt/FRED/BoxerServer/candc |
| $PREFIX/bin/soap_server --server localhost:9000 --models $PREFIX/models/boxer --candc-printer boxer --candc-int-betas "0 0 0 0 0" | $PREFIX/bin/soap_server --server localhost:9000 --models $PREFIX/models/boxer --candc-printer boxer --candc-int-betas "0 0 0 0 0" |
| | </code> |
| ===== Testing ===== | ===== Testing ===== |
| We first go to FRED root | We first go to FRED root |
| | <code> |
| $ cd /opt/FRED | $ cd /opt/FRED |
| | </code> |
| |
| And launch the boxer server | And launch the boxer server |
| $ sh launchboxerserver | <code> $ sh launchboxerserver </code> |
| |
| We get a permission error, so we add the execute permission to both binaries | We get a permission error, so we add the execute permission to both binaries |
| | <code> |
| $ sudo chmod a+x BoxerServer/candc/bin/soap_client | $ sudo chmod a+x BoxerServer/candc/bin/soap_client |
| $ sudo chmod a+x BoxerServer/candc/bin/soap_server | $ sudo chmod a+x BoxerServer/candc/bin/soap_server |
| | </code> |
| |
| And we get more errors: | And we get more errors: |
| | <code> |
| $ sh launchboxerserver | $ sh launchboxerserver |
| /opt/FRED/BoxerServer/candc/bin/soap_server: 1: /opt/FRED/BoxerServer/candc/bin/soap_server: Syntax error: "(" unexpected | /opt/FRED/BoxerServer/candc/bin/soap_server: 1: /opt/FRED/BoxerServer/candc/bin/soap_server: Syntax error: "(" unexpected |
| /opt/FRED/BoxerServer/candc/bin/soap_client: 1: /opt/FRED/BoxerServer/candc/bin/soap_client: Syntax error: word unexpected (expecting ")") | /opt/FRED/BoxerServer/candc/bin/soap_client: 1: /opt/FRED/BoxerServer/candc/bin/soap_client: Syntax error: word unexpected (expecting ")") |
| ERROR: file /tmp/boxer.ccg does not exist | ERROR: file /tmp/boxer.ccg does not exist |
| | </code> |
| |
| ''TODO'': recompile the SOAP client and server, paying attention to parentheses... | **TODO**: recompile the SOAP client and server, paying attention to parentheses... |
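A possible reading of the ''Syntax error: "(" unexpected'' messages, offered as a hypothesis and not verified on this machine: when the system cannot execute a file as a native binary (for example because it was built for another architecture or operating system), ''sh'' may fall back to reading it as a shell script and then fails on the binary content. The sketch below inspects the first bytes of a file to report what kind of executable it appears to be; ''describe_executable'' is a hypothetical helper.

```python
# Magic numbers for thin Mach-O binaries (32/64-bit, both byte orders).
MACHO_MAGICS = {
    b"\xfe\xed\xfa\xce", b"\xfe\xed\xfa\xcf",
    b"\xce\xfa\xed\xfe", b"\xcf\xfa\xed\xfe",
}


def describe_executable(path):
    """Guess the file type from its first five bytes."""
    with open(path, "rb") as handle:
        head = handle.read(5)
    if head[:4] == b"\x7fELF":
        # Byte 4 of the ELF header is the class: 1 = 32-bit, 2 = 64-bit.
        bits = {b"\x01": "32-bit", b"\x02": "64-bit"}.get(head[4:5], "unknown class")
        return "ELF " + bits
    if head[:4] in MACHO_MAGICS:
        return "Mach-O"
    if head[:2] == b"#!":
        return "script"
    return "unknown"


if __name__ == "__main__":
    # Paths taken from the chmod commands above.
    for name in ("soap_server", "soap_client"):
        path = "/opt/FRED/BoxerServer/candc/bin/" + name
        try:
            print(path, "->", describe_executable(path))
        except IOError as error:
            print(path, "-> cannot read:", error)
```

If the result does not match the host (for instance a 32-bit ELF on a 64-bit system without 32-bit libraries, or a Mach-O file on Linux), rebuilding the binaries for the local platform would be the natural next step.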
| |