A project labeled by the competitiveness cluster Systematic and funded by the Fonds Unique Interministériel.
A project funded by the Fonds Unique Interministériel (FUI) and labeled by Cap Digital, the French competitiveness cluster for digital knowledge.
A project funded by the 7th Framework Programme of the European Union "Digital Libraries".
A project funded by the French Agence Nationale de la Recherche (ANR).
New words and new uses are created constantly. How can an unknown word or a new proper name be detected and classified in a text or in a stream of words? How can it be assigned a phonetic category, syntactic properties, or a place in a semantic network? To answer these questions, the EDyLex project aims to test, on the content of Agence France-Presse (AFP), every opportunity for dynamic lexicon enrichment offered by Natural Language Processing tools. With a daily production of 5,000 wire stories in six languages (English, French, Spanish, German, Portuguese and Arabic), AFP is an ideal testing ground for multimodal and multilingual linguistic analysis solutions capable of dynamically enriching their own language models and lexicons.
The EDyLex consortium is led by the INRIA Alpage Paris 7 team, which specializes in the development of written-text analyzers and related resources. It includes two major laboratories, LIF and LIMSI, both with computational linguists specializing in the processing of written and spoken language, and three companies: Vocapia Research (industrial research in speech processing), Syllabs (language engineering for ICT) and Agence France-Presse, the user partner.
The main purpose of the processes and tools developed within the framework of EDyLex is to improve the semi-automatic annotation of documents and transcription of video soundtracks.