Neoveille
Detection, analysis, and tracking of neologisms in corpora
Objectives
Platform for detecting, analyzing, and tracking neologisms in corpora (LIPN)
Study of loanwords in corpora (LDI, CLILLAC-ARP, Ieda, EMPNEO)
Study of semantic neology (ERTIM, LIPN, LDI)
Consortium
Paris 13 (LIPN, LDI)
Paris 7 (CLILLAC-ARP)
INALCO (ERTIM)
University of São Paulo (Ieda Alves)
EMPNEO group
Hiring a research engineer
Job description: research engineer (ingénieur d'études) for the Neoveille project
ToDo
Hire an engineer to develop the platform (DONE)
Choose an architecture suited to the needs and the multilingual nature of the application (DONE)
POS Tagging
The TAL server does not display Chinese or Russian characters (DONE)
Greek POS tagging web service (DONE)
We explore
Tokenization problem for TreeTagger (DONE)
TreeTagger installation (Katia: DONE)
Test the TAL server installation, especially the POS tagging part (Emmanuel)
Indexing
IMS CWB web interface and TreeTagger (Emmanuel, Katia, Jorge; due date: November 13)
Install IMS CWB from scratch on Katia's computer (Katia and Jorge)
Once a localhost connection works, index a Neoveille corpus and test it (Katia and Jorge)
Fix or reinstall CQPweb on the TAL server (Katia, Jorge, Emmanuel)
Infrastructure
Redmine migration (Jorge)
GitHub for Neoveille (Jorge)
Document the TAL cluster (Emmanuel)
Project web site (Katia)
Python migration (Katia)
Iteration 1
Which architecture should we use for Neoveille's functional interface? (Emmanuel, Jorge, Katia)
Next meeting: Friday, November 17, at 14:00
Schedule
Iteration 0: seven languages with POS tagging on IMS CWB
Scheduled date: November 5
RSS processing
POS Tagging
Milestone: produce the same output for the seven languages on the TAL server every month from the RSS input
POS Tagging in the seven languages
Greek
Chinese
Russian
Portuguese
Polish
Czech
French
POS Tagging with the RSS input
Indexing of the POS Tagging RSS output for IMS CWB
Dependency Analysis
Neoveille Web interface
Is D3 suitable for our project?
Project web site
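The RSS processing and indexing steps of Iteration 0 could be sketched as follows. This is only a minimal illustration, not the project's actual pipeline: it assumes plain RSS 2.0 feeds without namespaces and assumes the tagger output is available as (word, POS, lemma) triples. The function names rss_items and to_vrt are hypothetical. The target is the verticalized text layout that IMS CWB indexes (one token per line, tab-separated attributes, structural tags on their own lines). Escaping XML special characters in tokens is a conservative choice here, not a confirmed project requirement.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def rss_items(rss_xml):
    """Yield (title, description) for each <item> of an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        description = item.findtext("description", default="")
        yield title, description


def to_vrt(text_id, lang, sentences):
    """Render tagged sentences as verticalized text for CWB indexing.

    sentences: list of sentences, each a list of (word, pos, lemma) tuples,
    for example as produced by a POS tagger (assumed input shape).
    """
    lines = [f"<text id={quoteattr(text_id)} lang={quoteattr(lang)}>"]
    for sentence in sentences:
        lines.append("<s>")
        for word, pos, lemma in sentence:
            # One token per line; attributes separated by tabs.
            lines.append("\t".join(escape(v) for v in (word, pos, lemma)))
        lines.append("</s>")
    lines.append("</text>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    feed = (
        "<rss version='2.0'><channel>"
        "<item><title>Exemple</title><description>Les mots</description></item>"
        "</channel></rss>"
    )
    for title, description in rss_items(feed):
        print(title, "|", description)
    # Hand-written tagged sentence standing in for real tagger output.
    tagged = [[("Les", "DET:ART", "le"), ("mots", "NOM", "mot")]]
    print(to_vrt("doc1", "fr", tagged), end="")
```

Keeping the same three columns (word, POS, lemma) for all seven languages would be one way to meet the "same output" milestone, since the indexing step would then not need language-specific handling.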
Iteration 1: seven languages with POS tagging and dependency analysis on D3 or IMS CWB
Iteration 2: neologism detection
Questions
Can we freely distribute data that we have collected from an RSS feed?
Links
Sketch Engine
No-sketch (open source version of Sketch