Salaried 4-year PhD Position in Computational Linguistics/NLP at Stockholm University

The Department of Linguistics at Stockholm University is looking for a
new PhD candidate in the area of computational linguistics/natural
language processing. PhD candidates are regular employees of Stockholm
University, with a starting salary of 25,300 SEK (2,650 EUR; 3,200 USD)
per month and the same benefits and social security as other University
employees. The position is fully funded for 4 years. Extension up to one
year is possible if the candidate performs teaching or other duties at
the department, and further extension is granted in case of parental or
sick leave.

The choice of thesis topic is not restricted to a particular project,
but should be aligned with the research profile of the department.
Possible topics include multilingual NLP methods, machine translation,
or computational methods for other areas of research at the department
(language acquisition, linguistic typology, phonetics, sign language).

Potential applicants are encouraged to contact Robert Östling
(robert[å]) to discuss possible thesis projects or other issues
related to the position.

More information and application form:

*** Deadline: October 16th, 2017 ***

Category: Jobs/Internships

CfP: Natural Language Processing in Artificial Intelligence — NLPinAI 2018

Special Session on
Natural Language Processing in Artificial Intelligence — NLPinAI 2018
16-18 January, 2018 — Funchal, Madeira, Portugal
Within the 10th International Conference on Agents and Artificial Intelligence — ICAART 2018
Computational and technological developments that incorporate natural language are proliferating. Adequate coverage encounters difficult problems related to partiality, underspecification, and context-dependency, which are signature features of information in nature and natural languages. Furthermore, agents (humans or computational systems) are information conveyors, interpreters, or participate as components of informational content. Generally, language processing depends on agents’ knowledge, reasoning, perspectives, and interactions.
The session covers theoretical work, advanced applications, approaches, and techniques for computational models of information and its presentation by language (artificial, human, or natural in other ways). The goal is to promote intelligent natural language processing and related models of thought, mental states, reasoning, and other cognitive processes.
We invite contributions relevant to the following topics, without being limited to them:
— Type theories for applications to language and information processing
— Computational grammar
— Computational syntax
— Computational semantics of natural languages
— Computational syntax-semantics interface
— Interfaces between morphology, lexicon, syntax, semantics, speech, text, pragmatics
— Parsing
— Multilingual processing
— Large-scale grammars of natural languages
— Models of computation and algorithms for natural language processing
— Computational models of partiality, underspecification, and context-dependency
— Models of situations, contexts, and agents, for applications to language processing
— Information about space and time in language models and processing
— Models of computation and algorithms for linguistics
— Data science in language processing
— Machine learning of language
— Interdisciplinary methods
— Integration of formal, computational, model theoretic, graphical, diagrammatic, statistical, and other related methods
— Logic for information extraction or expression in written and spoken language
— Language processing based on biological fundamentals of information and languages
— Computational neuroscience of language
Paper Submission: November 7, 2017
Authors Notification: November 21, 2017
Camera Ready and Registration: November 29, 2017

Category: Conferences

Two Research Scientist job openings at IBM Research Dublin

Two research scientist positions on Knowledge Management and NLP are currently open in IBM Research Dublin. Details are given below.

Category: Jobs/Internships

AINL program

The AINL program has been published:

AINL is seventeen talks selected after agonizing deliberation, sixteen fascinating demos and posters, two tutorials from leading European experts, an industry session, a hackathon, jam pies, and the joy of talking with like-minded people.

Until the end of the week, registration is one and a half times cheaper. Don't miss your chance!


Category: Conferences

Second Call for Papers: Special Issue of the journal Computational Linguistics on Language in Social Media


Special Issue of the journal Computational Linguistics on:

Language in Social Media: Exploiting discourse and other contextual information

*** Deadline 15th October 2017 (11:59 pm PST) ***

For more details see:


**Guest editors**

Farah Benamara, IRIT, Toulouse University

Diana Inkpen, University of Ottawa

Maite Taboada, Simon Fraser University


socialmedia.coli AT

**Call for papers**

Social media content (SMC) is changing the way people interact with each other and share information, personal messages, and opinions about situations, objects and past experiences. This content (ranging from blogs, fora, and reviews to various social networking sites) has specific characteristics that are often referred to as the five V’s: volume, variety, velocity, veracity, and value. Most of it consists of short online conversational posts or comments, often accompanied by non-linguistic contextual information, including metadata such as the social network of each user and their interactions with other users. Exploiting the context of a word or a sentence increases the amount of information we can get from it and enables novel applications. Such rich contextual information, however, makes natural language processing (NLP) of SMC a challenging research task. Indeed, simply applying traditional text mining tools is clearly sub-optimal, as such methods take into account neither the interactive dimension nor the particular nature of this data, which shares properties of both spoken and written language.

Most research on NLP for social media focuses primarily on content-based processing of the linguistic information, using lexical semantics (e.g., discovering new word senses or multiword expressions) or semantic analysis (opinion extraction, irony detection, event and topic detection, geo-location detection) (Londhe et al., 2016; Aiello et al., 2013; Inkpen et al., 2015; Ghosh et al., 2015). Other research explores the interactions between content and extra-linguistic or extra-textual features like time, place, author profiles, demographic information, conversation thread and network structure, showing that combining linguistic data with network and/or user context improves performance over a baseline that uses only textual information (West et al., 2014; Karoui et al., 2015; Volkova et al., 2014; Ren et al., 2016).

We expect that papers in this special issue will contribute to a deeper understanding of these interactions from a new perspective of discourse interpretation. We believe that we are entering a new age of mining social media data, one that extracts information not just from individual words, phrases and tags, but also uses information from discourse and the wider context. Most of the “big data” revolution in social media analysis has examined words in isolation, a “bag-of-words” approach. We believe it is possible to investigate big data, and social media data in general, by exploiting contextual information.

We encourage submission of papers that address deep issues in linguistics, computational linguistics and social science. In particular, our focus is on the exploitation of contextual information within the text (discourse, argumentation chains) and extra-linguistic information (social network, demographic information, geo-location) to improve NLP applications and help build pragmatics-based NLP systems. The special issue also aims to bring together researchers who propose new solutions for processing SMC in various use cases, including sentiment analysis, detection of offensive content, and intention detection. These solutions need to be reliable enough to prove their effectiveness against shallow bag-of-words approaches or content-based approaches alone.

Category: Journals

Hack the Plagiarist!

A hackathon on plagiarism detection in Russian texts

Hackathon details

On September 22-23, the AINL conference in St. Petersburg will host a hackathon on detecting paraphrased borrowings in texts. The direct practical application of solutions to this task is finding improper borrowings, that is, plagiarism, and building a map of the borrowed text fragments.

What needs to be done?

The task for hackathon participants is as follows. A number of text pairs are given. In each pair, one text is “suspicious” and the other is a “source” (one “suspicious” text may have several sources). The “suspicious” texts are known to contain passages borrowed from the sources, and these passages have been paraphrased to make the work of plagiarism detection systems harder. For example, the sentence (translated here from the Russian)

“The book titled "Cognitive Cooking with Chef Watson" will be released as early as tomorrow, April 14.”

becomes

“It came out in mid-April 2014 under the title "Cognitive Cooking with Chef Watson".”

Similarly, the sentence

“In 2005 the company expands beyond the Russian Federation for the first time, opening a representative office in Ukraine.”

becomes

“As early as 2005 the company began entering the foreign market, opening an administrative office in Ukraine.”

Such borrowings cannot be detected by a simple search for matching strings, so more effective techniques are required. We have no doubt that the hackathon participants will be able to find them.

Important: the task of finding candidate source texts is not part of this hackathon! The sources for each text are already known; you only need to find the exact boundaries of the borrowed fragments.
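To illustrate why exact string matching is not enough, here is a minimal Python sketch using the second example pair in its original Russian. It is not the organizers' method or a hackathon baseline: the word-set Jaccard overlap, the function names `tokens`, `jaccard` and `best_match`, and the second source sentence (a made-up filler) are all assumptions chosen for illustration.

```python
import re


def tokens(text):
    """Return the set of lowercase word tokens; punctuation is dropped."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a, b):
    """Word-set Jaccard overlap between two sentences (0.0 if either is empty)."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def best_match(suspicious, source_sentences):
    """Return (index, score) of the source sentence with the highest overlap."""
    scores = [jaccard(suspicious, s) for s in source_sentences]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Source sentences: the first is taken from the post, the second is a
# hypothetical filler added only so that there is something to choose from.
source = [
    "В 2005 году компания впервые выходит за пределы Российской Федерации, "
    "открывая представительство на Украине",
    "Компания была основана в Москве.",
]

# Paraphrased ("suspicious") sentence from the post.
suspicious = (
    "Уже в 2005 году компания начала выходить на иностранный рынок, "
    "открыв офис управления на Украине."
)

# Exact substring search finds nothing in the source text.
exact_hit = suspicious in " ".join(source)

# Word overlap still singles out the paraphrased source sentence.
idx, score = best_match(suspicious, source)

print("exact match:", exact_hit)
print("best source sentence:", idx, "overlap:", round(score, 2))
```

The substring test fails, while the overlap score still points to the first source sentence. A real solution would need something stronger (lemmatization, for instance, since «выходит» and «выходить» count as different tokens here) and would have to return fragment boundaries, not just the best-matching sentence.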

Category: Conferences, Resources/Software

Second Call for Chapters: Techno-Social Systems for Modern Economical and Governmental Infrastructures


Alexander Troussov and Sergey Maruev (The Russian Presidential Academy of National Economy and Public Administration)




Nowadays, most digital content is generated within public and enterprise techno-social systems like Facebook, Twitter, blogs, wiki systems, and other web-based collaboration and hosting tools, office suites, and project management tools. Enterprises use software tools for social collaboration and team collaboration, such as Microsoft SharePoint, IBM Lotus Notes and others. These applications have transformed the collaboration environment from a mere document collection into a highly interconnected social space, where documents are actively exchanged, filtered, organized, discussed and edited collaboratively. In techno-social systems, infrastructures are composed of many layers (such as Internet communication protocols, markup languages, metadata models, and knowledge representation languages, which have developed over more than two decades) and interoperate within a social and organizational context that drives their everyday use and development. Unlike the log files of techno-social systems, proprietary databases such as customs records frequently hold data about collaboration that happens between actors outside the systems. By extension, we can apply the term techno-social systems to both types of data. Such generalisation simplifies knowledge transfer between different domains and types of applications.


In these techno-social systems “everything is deeply intertwingled”, to use the term coined by the information technology pioneer Ted Nelson: people are connected to other people and to “non-human agents” such as documents, datasets, analytic tools and concepts. These networks become increasingly multidimensional, providing rich context for understanding the role of particular nodes that represent both people and abstract concepts.


Techno-social systems bear most of the general characteristics of Big Data; for instance, in these systems, it is frequently easier to predict agents’ actions than to explain them. Mining of techno-social systems constitutes a new distinctive branch of Business Information Systems.

Category: Conferences

IWCLUL 2018 — Fourth International Workshop on Computational Linguistics for Uralic Languages

Call for papers
The purpose of the conference series International Workshop on Computational Linguistics for Uralic Languages is to bring together researchers working on computational approaches to these languages. We accept long and short papers as well as tutorial proposals dealing with the following languages: Finnish, Hungarian, Estonian, Võro, the Sámi languages, Komi (Zyrian, Permyak), Mordvin (Erzya, Moksha), Mari (Hill, Meadow), Udmurt, Nenets (Tundra, Forest), Enets, Nganasan, Selkup, Mansi, Khanty, Veps, Karelian (Olonets), Karelian, Ingrian (Izhorian), Votic, Livonian, Ludic, and other related languages.
All Uralic languages exhibit rich morphological structure, which makes processing them challenging for state-of-the-art computational linguistic approaches; the majority also suffer from a lack of resources, and many are endangered.
Research papers should present original, substantial and unpublished research, and may describe work-in-progress systems, frameworks, standards and evaluation schemes. Demos and tutorials will present systems and standards towards the goal of interoperability and unification of different projects, applications and research groups. Appropriate topics include (but are not limited to):
Parsers, analysers and processing pipelines of Uralic languages
Lexical databases, electronic dictionaries
Finished end-user applications aimed at Uralic languages, such as spelling or grammar checkers, machine translation or speech processing
Evaluation methods and gold standards, tagged corpora, treebanks
Reports on language-independent or unsupervised methods as applied to Uralic languages
Surveys and review articles on subjects related to computational linguistics for one or more Uralic languages
Any work that aims at combining efforts and reducing duplication of work
How to elicit activity from the language community, agitation campaigns, games with a purpose
To maximise the possibility of reproducibility, replication and reuse, we particularly encourage submissions which present free/open-source language resources and make use of free/open-source software.
One of the aims of this gathering is to avoid unnecessary duplicated work in the field of Uralistics by establishing connections and interoperability standards between researchers and research groups working at different sites. We have also identified a serious lack of gold standards and evaluation metrics for all Uralic languages, including those with national support; any work towards better resources in these fields will be greatly appreciated. In this year’s edition, we continue our tradition of particularly encouraging researchers of minority Uralic languages in Russia to participate.

Category: Uncategorized


BigDat 2018

Timișoara, Romania

January 22-26, 2018

Organized by:
West University of Timișoara
Rovira i Virgili University



BigDat 2018 will be a research training event with a global scope aiming at updating participants about the most recent advances in the critical and fast developing area of big data, which covers a large spectrum of current exciting research and industrial innovation with an extraordinary potential for a huge impact on scientific discoveries, medicine, engineering, business models, and society itself. Renowned academics and industry pioneers will lecture and share their views with the audience.

Most big data subareas will be covered, namely foundations, infrastructure, management, search and mining, security and privacy, and applications (to biological and health sciences, to business, finance and transportation, to online social networks, etc.). Major challenges of analytics, management and storage of big data will be identified through 2 keynote lectures, 25 courses of five hours and fifteen minutes each, and 1 round table, which will tackle the most active and promising topics. The organizers are convinced that outstanding speakers will attract the brightest and most motivated students. Interaction will be a main component of the event.

An open session will give participants the opportunity to present their own work in progress in 5 minutes. Also, there will be two special sessions with industrial and recruitment profiles.


Master's students, PhD students, postdocs, and industry practitioners will be typical profiles of participants. However, there are no formal prerequisites for attendance in terms of academic degrees. Since there will be a variety of levels, specific background knowledge may be assumed for some of the courses. Overall, BigDat 2018 is addressed to students, researchers and practitioners who want to keep themselves updated about recent developments and future trends. All will surely find it fruitful to listen to and hold discussions with major researchers, industry leaders and innovators.

Category: Courses/Education/Postdocs

Job Opening: PhD studentship position on natural language processing in A Coruña, Spain

We invite applications for a full-time PhD student position at the
Universidade da Coruña, in A Coruña (Spain), in the context of the 5-year
European Research Council Starting Grant project «Fast Natural Language
Parsing for Large-Scale NLP», led by
Prof. Carlos Gómez-Rodríguez.

The research will be focused on developing new models, algorithms and
techniques for efficient parsing of natural language text. This vacancy
offers attractive working conditions for up to 4.5 years in this frontier
research project. The intended starting date (negotiable) is September 2017.

Detailed information about the position, salary, requirements, research
environment and how to apply can be found at the following link:

Potential candidates are welcome to ask any questions they may have about
the project or the job offer by email to cgomezr[å] .

Category: Jobs/Internships, Courses/Education/Postdocs