JRC has released large multilingual datasets and tools for working with them

The highly multilingual Europe Media
Monitor (EMM) applications are now available as free Apps for mobile devices
running iOS and Android.

You can download the EMM Apps via the links on the EMM-NewsBrief entry page,
http://emm.newsbrief.eu/. Apps are currently available for mobile devices
running Android and for the iPad. A version for the iPhone will be released
soon.

We would be delighted to receive your feedback on the newly released EMM
Apps at the email address emm AT jrc DOT it.

We look forward to hearing from you.

EUROPE MEDIA MONITOR (EMM)

EMM gathers a daily average of 175,000 news items in over 70 languages and
analyses them automatically using a wide range of JRC-developed
computational linguistics tools. These include event extraction; automatic
entity recognition, classification and disambiguation; name variant mapping;
co-reference resolution; quotation recognition; opinion mining;
multi-document summarisation; document clustering and classification;
machine translation; information aggregation, including across languages;
and more.

BACKGROUND AND READING

EMM <http://emm.newsbrief.eu/overview.html> is a freely available,
advert-free family of news monitoring and analysis applications developed by
the OPTIMA Team <http://ipsc.jrc.ec.europa.eu/?id=178> at the Joint
Research Centre <http://ec.europa.eu/dgs/jrc/> (JRC), which is the European
Commission’s in-house science service.

Scientific publications on the Europe Media Monitor (EMM) and its
individual text mining tools are available at
http://langtech.jrc.ec.europa.eu/JRC_Publications.html, including papers
giving an introduction and a generic overview of EMM
<http://langtech.jrc.ec.europa.eu/Documents/09_SIGIR-WS_Steinberger+frontmatter.pdf>.

FREELY AVAILABLE LINGUISTIC RESOURCES

The JRC has also released a large volume of freely available multilingual
linguistic resources <http://ipsc.jrc.ec.europa.eu/index.php?id=61> that
can be used to develop or test a variety of multilingual and cross-lingual
Natural Language Processing tools. These include parallel corpora in up to
26 languages; readily trained automatic document categorisation software
<http://ipsc.jrc.ec.europa.eu/index.php?id=60> in 22 languages;
dictionaries of names and their variant spellings
<http://ipsc.jrc.ec.europa.eu/?id=42>, including across languages and
scripts; and more. See http://ipsc.jrc.ec.europa.eu/index.php?id=61 for
details and to download the data and the tools.

About the author: Lidia Pivovarova

Saint Petersburg State University (SPbGU): senior lecturer; University of Helsinki: PhD student http://philarts.spbu.ru/structure/sub-faculties/itah_phil/teachers/pivovarova
This entry was posted in the category Resources/Software.
