JOB: C++ programmer for ORFO

The company Informatic, which released ORFO, the best Russian spell checker, 20 years ago (a little more than that, in fact), is looking for a C++ programmer to make ORFO the best in the world.

Experience with linguistic tasks and/or a degree in linguistics is highly desirable.

The work involves (a) getting to grips with the existing code and maintaining it (and even extending it somewhat) for a while, (b) gradually rewriting everything from scratch, and (c) writing a new grammar checker.

The tasks to be solved include interaction with operating systems (including macOS) and applications (MS Office and others).

Questions and résumés to Mikhail Volovich: mv[at]


30th European Summer School in Logic, Language and Information (ESSLLI 2018)

Sofia University «St. Kl. Ohridski»
August 6-17, 2018

The 30th edition of ESSLLI (European Summer School in Logic, Language and
Information) will take place from 6 August to 17 August 2018 at Sofia
University «St. Kl. Ohridski», Sofia, Bulgaria. The European Summer School
in Logic, Language and Information is an event organized every year in a
different European country under the auspices of the Association for Logic,
Language and Information (FoLLI).

Sofia University «St. Kl. Ohridski» and the Institute of Information and
Communication Technologies, Bulgarian Academy of Sciences (IICT-BAS), will
jointly host ESSLLI 2018.

ESSLLI 2018 will be held under the patronage of Mrs. Yordanka Fandakova,
Mayor of Sofia Capital Municipality.

We are pleased to announce that the program schedule is now available for
the two weeks of the school.
In the ‘Program’ section you can also find information about the
satellite Formal Grammar conference as well as the Student Session.

The ESSLLI 2018 Organization Team


Special Issue on Semantic Deep Learning at the Semantic Web Journal

For more details please visit:

Semantic Web technologies and deep learning share the goal of creating intelligent artifacts that emulate human capacities such as reasoning, validating, and predicting. Both fields have considerably impacted data and knowledge analysis as well as their associated abstract representations. Deep learning refers to deep neural network algorithms that learn data representations by means of transformations with multiple processing layers. These architectures have frequently been applied in NLP to learn features from raw data for tasks such as part-of-speech tagging, morphological tagging, language modeling, and so forth. Semantic Web technologies and knowledge representation, on the other hand, boost the re-use and sharing of knowledge in a structured and machine-readable fashion. Semantic resources such as WikiData, Yago, BabelNet or DBpedia, as well as knowledge base construction and completion methods, have been successfully applied to improve systems addressing semantically intensive tasks (e.g., Question Answering).

Topics include, but are not limited to:

Structured knowledge in deep learning:
— learning and applying knowledge graph embeddings
— applications of knowledge-rich embeddings
— neural networks and logic rules
— learning semantic similarity and encoding distances as knowledge graph
— ontology-based text classification
— multilingual resources for neural representations of linguistics
— semantic role labeling

Deep reasoning and inferences:
— commonsense reasoning and vector space models
— reasoning with deep learning methods

Learning knowledge representations with deep learning:
— word embeddings for ontology matching and alignment
— deep learning and semantic web technologies for specialized domains
— deep learning ontologies
— deep learning models for learning knowledge representations from text
— deep learning ontological annotations

Joint tasks:
— mining multilingual natural language for SPARQL queries
— information retrieval and extraction with knowledge graphs and deep learning models
— knowledge-based deep word sense disambiguation and entity linking
— investigation of compatibilities and incompatibilities between deep learning and Semantic Web approaches
— neural networks for learning Linked Data
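One recurring topic above, knowledge graph embeddings, can be illustrated with a minimal sketch. Everything below (entities, dimensionality, training loop) is invented for illustration; it implements only the core TransE idea that a true triple (h, r, t) should satisfy h + r ≈ t, omitting the margin loss and negative sampling of the full objective:

```python
import math
import random

random.seed(0)
DIM = 8

def vec():
    return [random.uniform(-0.5, 0.5) for _ in range(DIM)]

# Toy knowledge graph with invented entities; real systems train on
# resources like DBpedia or Wikidata with millions of triples.
entities = {e: vec() for e in ["Sofia", "Bulgaria", "Paris", "France"]}
relations = {"capital_of": vec()}

def score(h, r, t):
    # TransE: a true triple (h, r, t) should satisfy h + r ≈ t,
    # so a lower distance means a more plausible triple.
    return math.sqrt(sum((entities[h][i] + relations[r][i] - entities[t][i]) ** 2
                         for i in range(DIM)))

def sgd_step(h, r, t, lr=0.1):
    # Nudge h + r toward t by gradient descent on the squared distance.
    for i in range(DIM):
        g = entities[h][i] + relations[r][i] - entities[t][i]
        entities[h][i] -= lr * g
        relations[r][i] -= lr * g
        entities[t][i] += lr * g

for _ in range(200):
    sgd_step("Sofia", "capital_of", "Bulgaria")
    sgd_step("Paris", "capital_of", "France")

# After training, a true tail entity scores lower (better) than a corrupted one.
print(score("Sofia", "capital_of", "Bulgaria"))
print(score("Sofia", "capital_of", "France"))
```

Link prediction over a real knowledge base works the same way: rank all candidate tails by this score and return the closest ones.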

Submission deadline: 28 February 2018. Papers submitted before the deadline will be reviewed upon receipt.

Submission Instructions:

Guest Editors:
The guest editors can be reached at
Luis Espinosa Anke, Cardiff University, UK
Thierry Declerck, DFKI GmbH, Germany
Dagmar Gromann, Technical University Dresden, Germany


HCOMP 2018 — Crowdsourcing & Human Computation

The 2018 AAAI Conference on Crowdsourcing and Human Computation (HCOMP) will be held July 5-8, 2018 in Zurich, Switzerland. Follow us on Twitter!


Full papers are due on February 23, 2018 (abstracts are due a week earlier, on February 19). See «Call for Full Papers» below for details.

Important Dates


FEBRUARY 19, 2018: Abstract submission

FEBRUARY 23, 2018: Full papers due

MARCH 23–24, 2018: PC meeting in US and UK

MARCH 29, 2018: Notification of acceptance

APRIL 6, 2018: Workshop proposals due

MAY 1, 2018: Camera ready papers due

JULY 5–8, 2018: Conference



HCOMP is the premier venue for disseminating the latest research findings on crowdsourcing and human computation. While artificial intelligence (AI) and human-computer interaction (HCI) represent traditional mainstays of the conference, HCOMP believes strongly in inviting, fostering, and promoting broad, interdisciplinary research. The field is unique in the diversity of disciplines it draws upon and contributes to, ranging from human-centered qualitative studies and HCI design, to computer science and artificial intelligence, economics and the social sciences, all the way to digital humanities, policy, and ethics. We promote the exchange of advances in human computation and crowdsourcing not only among researchers but also among engineers and practitioners, to encourage dialogue across disciplines and communities of practice.

HCOMP 2018 builds on a successful history of past meetings: five HCOMP conferences (2013–2017) and four earlier workshops, held at the AAAI Conference on Artificial Intelligence (2011–2012), and the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (2009–2010). Proceedings from past HCOMP conferences are available online in the HCOMP Conference Digital Archive.



Third Workshop on Chatbots and Conversational Agent Technologies

Call for papers

Third Workshop on Chatbots and Conversational Agent Technologies (WOCHAT 2018)

Special Session at IWSDS2018, Singapore, May 14-16, 2018


Although chat-oriented dialogue systems have been around for many years, they have recently been gaining a lot of popularity in both research and commercial arenas. From the commercial standpoint, chat-oriented dialogue seems to provide an excellent means to engage users for entertainment purposes, as well as to give a more human-like appearance to established vertical goal-oriented dialogue systems.

This workshop invites original research contributions on all aspects of chat-oriented dialogue, including closely related areas such as knowledge representation and reasoning, language generation, and natural language understanding, among others. Both long and short paper submissions are invited in areas including (but not restricted to):

* Chat-oriented dialogue systems

* Data collections and resources

* Information extraction

* Natural language understanding

* General domain knowledge representation

* Common sense and reasoning

* Natural language generation

* Emotion detection and generation

* Sense of humour detection and generation

* Chat-oriented dialogue evaluation

* User studies and system evaluation

* Multimodal human-computer interaction

Paper Submissions:

Prospective authors are invited to follow the IWSDS 2018 paper preparation guidelines. Paper submissions must be made in electronic format through the IWSDS 2018 conference submission page, where you must select «WOCHAT» under the available submission categories.

Important Dates:

Workshop paper submission deadline: January 14, 2018

Workshop paper notification deadline: February 18, 2018

Shared task report submission deadline: January 21, 2018

Shared task report notification deadline: February 18, 2018

Workshop Organizers:

Ryuichiro Higashinaka – Nippon Telegraph and Telephone Corporation (Japan)

Ron Artstein – University of Southern California (USA)

Rafael E. Banchs – Institute for Infocomm Research (Singapore)

Wolfgang Minker – Ulm University (Germany)

Verena Rieser – Heriot-Watt University (UK)

Shared Task Organizers:

Bayan Abu Shawar – Arab Open University (Jordan)

Luis Fernando D’Haro – Agency for Science, Technology and Research (Singapore)

Zhou Yu – University of California, Davis (USA)

Steering Committee:

David Traum – University of Southern California (USA)

Joseph Mariani – LIMSI-CNRS (France)

Alexander Rudnicky – Carnegie Mellon University (USA)

Haizhou Li – National University of Singapore (Singapore)


Shared Tasks on translation, MT evaluation, and automated post-editing.


October 31 — November 1, 2018, in conjunction with EMNLP 2018 in
Brussels, Belgium

As part of WMT, as in previous years, we will be organising a collection
of shared tasks related to machine translation.  We hope that both
beginners and established research groups will participate. This year we
have so far confirmed the following tasks:

— Translation tasks
  — News
  — Biomedical
  — Multimodal
— Evaluation tasks
  — Metrics
  — Quality estimation
— Other tasks
  — Automatic post-editing

Further information, including task rationale, timetables and data, will
be posted on the WMT18 website in due course. Tasks will be launched in
January/February, with test weeks following.

Intending participants are encouraged to register with the mailing list
for further announcements.

For all tasks, participants will also be invited to submit a short
paper describing their system.


European Masters Program in Language and Communication Technologies (LCT)

invites applications from both European and non-European students wishing to start in fall 2018.

Key facts:
+ duration 2 years (120 ECTS credits)
+ in-depth instruction in computational linguistics methods and technologies
+ study one year each at two different partner universities in Europe
+ double degree
+ possibility to spend part of the study at one of the two non-European partners
+ courses and academic and administrative support in English

Deadline: February 15, 2018
(Late applications from European candidates will be accepted on a rolling basis until May 15, 2018.)

Program fees: 4250 Euro for students from the EU; 8500 Euro for non-EU students (per academic year). Please check the program website for tips about funding possibilities.

The LCT program is offered by the following consortium of Universities:

European partners:
1. Saarland University in Saarbruecken, Germany (coordinator)
2. University of Trento, Trento, Italy
3. University of Malta, Malta
4. University of Lorraine, Nancy, France
5. Charles University, Prague, Czech Republic
6. Rijksuniversiteit Groningen, The Netherlands
7. The University of the Basque Country / Euskal Herriko Unibertsitatea, San Sebastian, Spain

Non-European partners:
8. Shanghai Jiao Tong University, China
9. The University of Melbourne, Australia

The LCT Program in Brief

The European Masters Program in Language and Communication Technologies (LCT) is an international distributed Master of Science course. It is designed to meet the demands of industry and research in the rapidly growing area of language technology. It offers education and training opportunities for the next generations of leaders in research and innovation. It provides students with profound knowledge of and insight into the various disciplines that contribute to the methods of language and communication technologies (a.k.a. computational linguistics, a.k.a. natural language processing) and strengthens their ability to work according to scientific methods. Moreover, the students acquire practice-oriented knowledge by choosing appropriate combinations of modules in Language Technology, Computational and Theoretical Linguistics, and Computer Science.

The program involves studying one year each at two different European partner universities. Optionally, a stay of a few months at one of the non-European partners is possible. After completing all study requirements, students obtain two Master of Science/Arts degrees: one from each of the European universities where they studied.

The course consists of compulsory core modules, as well as elective advanced modules in Language Technology and Computer Science, possibly complemented by an internship project, and completed by a Master's thesis.

The LCT Masters program has been successfully implemented since 2006, being funded by the Erasmus Mundus Programme from 2007 to 2011 and from 2013 to 2017. In 2012 the LCT Program operated as an Erasmus Mundus Brand Name. Intake 2018: the program commits to respecting the Erasmus Mundus requirements and to maintaining the high quality of the implementation during the years of funding.

Application requirements

Application is made online.

All applicants must satisfy the following prerequisites:

1. Degree: a Bachelor's degree or equivalent in the area of (Computational) Linguistics, Language Technology, Cognitive Sciences, Computer Science, Mathematics, Artificial Intelligence, or other relevant disciplines. The degree must be completed before the course starts.

2. Language proficiency: Proof of proficiency in English is required for applicants whose native language is other than English.

Further information about the program and application instructions:

Contact email address for further inquiries:
lct-info at

Program coordination:
Dr. Ing. Ivana Kruijff-Korbayova
Department of Language Science and Technology
Saarland University
Saarbruecken, Germany

Рубрика: Без рубрики | Добавить комментарий

Deep Learning for NLP, advancements and trends in 2017

Over the past few years, Deep Learning (DL) architectures and algorithms have made impressive advances in fields such as image recognition and speech processing.

Their application to Natural Language Processing (NLP) was less impressive at first, but has now proven to make significant contributions, yielding state-of-the-art results for some common NLP tasks. Named entity recognition (NER), part-of-speech (POS) tagging and sentiment analysis are some of the problems where neural network models have outperformed traditional approaches. The progress in machine translation is perhaps the most remarkable of all.

In this article I will go through some advancements for NLP in 2017 that rely on DL techniques. I do not pretend to be exhaustive: it would simply be impossible given the vast amount of scientific papers, frameworks and tools available. I just want to share with you some of the works that I liked the most this year. I think 2017 has been a great year for our field. The use of DL in NLP keeps widening, yielding amazing results in some cases, and all signs point to the fact that this trend will not stop.
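As a toy illustration of the kind of neural model surveyed in such reviews (not taken from the article itself; the data, dimensionality and training setup are invented), here is a minimal "neural bag of words" sentiment classifier: a sentence is represented as the average of its word embeddings, and a perceptron rule is trained on top of the (fixed, random) embeddings:

```python
import random

random.seed(1)
DIM = 16

# Tiny invented training set: 1 = positive, 0 = negative.
data = [
    ("great movie loved it", 1),
    ("wonderful acting great fun", 1),
    ("terrible plot hated it", 0),
    ("boring and terrible", 0),
]

vocab = {w for text, _ in data for w in text.split()}
emb = {w: [random.uniform(-1, 1) for _ in range(DIM)] for w in vocab}
weights = [0.0] * DIM

def features(text):
    # Neural bag of words: average the embeddings of the sentence's words.
    vecs = [emb[w] for w in text.split() if w in emb]
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(DIM)]

def predict(text):
    x = features(text)
    return 1 if sum(w * xi for w, xi in zip(weights, x)) > 0 else 0

# Perceptron updates on top of the fixed embeddings.
for _ in range(100):
    for text, label in data:
        err = label - predict(text)
        if err:
            for i, xi in enumerate(features(text)):
                weights[i] += err * xi

print([predict(t) for t, _ in data])  # predictions on the training data
```

Real models replace the averaging with convolutional or recurrent layers and learn the embeddings jointly, but the pipeline (embed, compose, classify) is the same.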



Test sets released for the word sense disambiguation track

For the first time, the Dialogue 2018 conference will host a track on word sense induction and word sense disambiguation for Russian. Participants will be able to evaluate how well modern word sense embedding models and other word sense disambiguation methods work for Russian. The track is organized with the support of ACL SIGSLAV and ABBYY.

We are pleased to announce that the test sets are now available. This means that participants can submit their solutions via the CodaLab platform, which shows the evaluation results immediately. The track accepts submissions until January 15.

A detailed description of the task, the datasets, and the baseline methods can be found at: Participant instructions and the datasets are available on GitHub:
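A minimal sketch of the kind of disambiguation the track evaluates, with invented vectors (a real system would use trained word sense embeddings and a proper context encoder): pick the sense whose embedding is closest to the context representation.

```python
import math

def cos(a, b):
    # Cosine similarity between two dense vectors.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

# Invented toy sense inventory for the ambiguous Russian word "лук"
# (onion vs. bow): one vector per sense, as word sense embeddings provide.
senses = {
    "лук#vegetable": [0.9, 0.1, 0.0],
    "лук#weapon":    [0.0, 0.2, 0.9],
}
# Context words also have (invented) vectors.
context_vecs = {"суп": [0.8, 0.3, 0.1], "стрела": [0.1, 0.1, 0.8]}

def disambiguate(context_word):
    # Choose the sense most similar to the context vector.
    ctx = context_vecs[context_word]
    return max(senses, key=lambda s: cos(senses[s], ctx))

print(disambiguate("суп"))     # soup context favors the vegetable sense
print(disambiguate("стрела"))  # arrow context favors the weapon sense
```

The track's evaluation then compares such sense assignments against a gold-standard sense-annotated test set.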

Important dates:

— Training set release: November 1, 2017.
— Test set release: December 15, 2017.
— Model submission deadline: January 15, 2018.
— Announcement of track results: February 1, 2018.

Questions about the track can be sent to


The 2nd Workshop on Subword and Character LEvel Models in NLP (SCLeM)

To be held at NAACL 2018 in New Orleans, Louisiana in June.
Deadline for paper submission: March 2, 2018
Notification of acceptance: April 2, 2018
Camera-ready submission due: April 16, 2018
Workshop: June 5 or 6, 2018
Jacob Eisenstein, Georgia Tech
Graham Neubig, CMU
Barbara Plank, University of Groningen
Brian Roark, Google
Traditional NLP starts with a hand-engineered layer of representation, the level of tokens or words.  A tokenization component first breaks up the text into units using manually designed rules. Tokens are then processed by components such as word segmentation, morphological analysis and multiword recognition.  The heterogeneity of these components makes it hard to create integrated models of both structure within tokens (e.g., morphology) and structure across multiple tokens (e.g., multi-word expressions). This approach can perform poorly (i) for morphologically rich languages, (ii) for noisy text, (iii) for languages in which the recognition of words is difficult and (iv) for adaptation to new domains; and (v) it can impede the optimization of preprocessing in end-to-end learning.
Following the success of the first edition of SCLeM at EMNLP 2017, with over 100 participants, this workshop provides a forum for discussing recent advances as well as future directions in subword- and character-level natural language processing and representation learning that address these problems. Topics of interest include:
— tokenization-free models
— character-level machine translation
— character-ngram information retrieval
— transfer learning for character-level models
— models of within-token and cross-token structure
— NL generation (e.g., of words not seen in training)
— out-of-vocabulary words
— morphology & segmentation
— relationship between morphology & character-level models
— stemming and lemmatization
— inflection generation
— orthographic productivity
— form-meaning representations
— true end-to-end learning
— spelling correction
— efficient and scalable character-level models
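Several of these topics rest on the same basic idea: represent a word by its character n-grams, so that out-of-vocabulary forms still receive useful representations. A minimal, self-contained sketch (the words are arbitrary examples; real subword models such as fastText map these n-grams into learned embedding vectors rather than comparing them as sets):

```python
def char_ngrams(word, n=3):
    # fastText-style: pad with boundary markers, then take character n-grams.
    padded = "<" + word + ">"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}

def jaccard(a, b):
    # Set overlap as a crude stand-in for embedding similarity.
    return len(a & b) / len(a | b)

# An out-of-vocabulary inflection still shares most subword units with a
# known form, so a subword model can assign it a sensible representation.
known = char_ngrams("tokenization")
oov = char_ngrams("tokenizations")
unrelated = char_ngrams("translation")

print(jaccard(known, oov))        # high: the inflected variant
print(jaccard(known, unrelated))  # low: a different word
```

This is why character-level models degrade gracefully on noisy text and morphologically rich languages: unseen surface forms decompose into mostly seen subword units.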

