Postdoctoral Fellow in Machine Learning and Computational Linguistics

We are seeking a skilled postdoctoral fellow whose expertise intersects
machine learning and computational linguistics. The candidate is expected
to make novel contributions to these disciplines in the context of
healthcare. The domain of the research is largely open-ended. This may
include textual processing of the medical record, speech recognition with
atypical or pathological voices, and human-computer dialogue using modern
recurrent neural networks, especially with situated robots.

Work can commence as soon as August 2017. The initial contract is for one
year, although an extension is possible.

The successful applicant will have:
1) A doctoral degree in a relevant field of computer science,
electrical engineering, biomedical engineering, or a relevant discipline;
2) Evidence of impact in research through a strong publication record
in relevant venues;
3) Evidence of strong collaborative skills, including possible
supervision of junior researchers, students, or equivalent industrial
experience;
4) Excellent interpersonal, written, and oral communication skills;
5) A strong technical background in machine learning, natural
language processing, and speech recognition. Experience in
human-computer interaction is an asset. Experience with clinical
populations is preferred.

This work will be conducted at the Toronto Rehabilitation Institute and at
the University of Toronto.

Category: Jobs/Internships, Courses/Education/Postdocs


The University of Helsinki – among the best in the world!

Founded in 1640, the University of Helsinki is one of the best multidisciplinary research universities in the world. The high-quality research carried out by the university creates new knowledge for educating diverse specialists in various fields, and for utilisation in social decision-making and the business sector.

The Faculty of Arts is a significant international community fostering research, education and cultural interaction. The Faculty has 7,000 undergraduate and postgraduate students and employs 500 experts.

Heldig is a newly founded Digital Humanities Centre at the University of Helsinki. Its objective is to foster the use of computational methods in the humanities and social sciences and to study phenomena related to digitalization. Heldig also provides teaching in digital humanities. One of its research strands is the historical and linguistic study of public discourse, conceptual change and knowledge production.

The Faculty of Arts invites applications for


for a fixed-term period from 1 September 2017 onwards (or as agreed) for a maximum of three years. The length of the period (1-3 years) depends on the candidate’s research plan.

We are looking for candidates with expertise in computational science, linguistics and history/cultural heritage. The three successful candidates will be members of a research community that already includes, for example, the Academy of Finland project “Computational History and the Transformation of Public Discourse” (2015-2019). The data used by the research community includes various historical full-text collections and large metadata collections, mainly in English, Finnish and Swedish. The group is particularly interested in studying conceptual change, intertextuality based on text reuse, and statistical analysis of knowledge production.


PAN shared task on Gender Identification in Russian texts

PAN shared task on Gender Identification in Russian texts (RusProfiling)

held in conjunction with the FIRE 2017 Forum for Information Retrieval Evaluation

8th — 10th December 2017, Bangalore

Author profiling consists of predicting an author’s demographics from their writing, with gender identification being the most popular task. Slavic languages, however, have been less investigated from an author profiling standpoint and have never been featured at PAN.

This year we introduce a PAN shared task on Cross-genre Gender Identification in Russian texts. The training dataset will consist of tweets; the test dataset will include tweets and Facebook posts, as well as reviews, texts describing images, or letters to a friend.

We cordially invite all researchers and practitioners from all fields to participate in this year’s PAN @ FIRE shared task.

Important Dates

  • 30th June, 2017 Release of training corpus (training period starts)
  • 1st September, 2017 Release of test corpus
  • 20th September, 2017 Submission of runs
  • 27th September, 2017 Results notification
  • 15th October, 2017 Working notes due


Task Coordinators

Tatiana Litvinova, RusProfiling Lab, Russia

Pavel Seredin, RusProfiling Lab, Russia

Olga Litvinova, RusProfiling Lab, Russia

Paolo Rosso, PRHLT research centre, Universitat Politècnica de València, Spain

Francisco Rangel, PRHLT research centre, Universitat Politècnica de València and Autoritas Consulting, Spain


E-mail: centrrusya[at]
Track Web page:



Hotel “Cherno More”, Varna, Bulgaria
4-6 September 2017 (during RANLP 2017)

Further to the previous successful and highly competitive Student Research
Workshops associated with the conference 'Recent Advances in Natural Language
Processing' (RANLP, in 2009, 2011, 2013, and 2015), we are pleased to
announce the fifth edition of the workshop which will be held during the main
RANLP 2017 conference days on 4-6 September 2017. For the first time the
conference and the workshop will take place in the Black Sea city of Varna,
Bulgaria.

The International Conference RANLP 2017 would like to invite students at all
levels (Bachelor's, Master's, and PhD students) to present their ongoing or
completed work at the Student Research Workshop. We invite two types of
student submissions:
— Full Papers – unpublished original research of the student.
— Short Papers – either a work in progress or a research proposal.

The aim of this workshop is to facilitate the exchange of knowledge between
young researchers by providing an excellent opportunity to present and
discuss their work in progress or completed projects to an international
research audience and receive feedback from senior researchers. The research
being presented can come from any topic area within Natural Language
Processing (NLP) and computational linguistics, including but not limited to
the following topic areas:

phonetics and phonology, morphology, lexicon, syntax, semantics, discourse,
pragmatics, dialogue, mathematical foundations, formal grammars and
languages, finite-state technology, statistical models for natural language
processing, machine learning, word embeddings, deep learning for NLP,
similarity, evaluation, sublanguages and controlled languages, lexicography,
language resources and corpora, terminology, corpus annotation, ontologies,
complexity, text segmentation, POS tagging, parsing, semantic role labelling,
word-sense disambiguation, computational treatment of multiword expressions,
textual entailment, anaphora and coreference resolution, temporal processing,
natural language generation, speech recognition, text-to-speech synthesis,
knowledge acquisition, text categorisation, machine translation, including
statistical machine translation and neural machine translation, translation
technology including translation memory systems, information retrieval,
information extraction, event extraction, question answering, text
summarisation, term extraction, text and web mining, opinion mining and
sentiment analysis, multimodal systems, natural language processing for
educational applications, automated writing assistance, text simplification,
NLP for biomedical texts, author profiling and related applications,
application-orientated papers related to NLP, chatbots, fact checking,
computer-aided language learning, stance detection, computational cognitive
modelling, dialect processing, language and vision, multilingual NLP, NLP for
language disorders, NLP for social media, NLP for the semantic web, patents
search, theoretical NLP, theoretical papers related to NLP.

Papers at the borderline between two sciences (such as Translation Studies,
Psycholinguistics, etc.) that make a contribution to NLP will also be
accepted for review. All accepted papers will be presented at the Student
Workshop sessions during the main conference days: 4-6 September 2017. The
articles will be published in dedicated Student Session proceedings and
uploaded to the ACL Anthology.


Submission deadline: 04 July 2017
Acceptance notification: 10 August 2017
Camera-ready deadline: 20 August 2017
Workshop: 04-06 September 2017

Category: Conferences


This year's RuSSIR program is very timely: almost half of it is about neural networks. Indeed, over the last few years the world of information retrieval (and of applied IT in general, including language technologies) has changed a great deal, and approaches based on deep networks have come to play a huge role. So much is happening there, and everything changes so quickly, that getting into the topic from scratch and figuring it out on your own can be hard. So the school may prove very useful.

The lecturers include well-known researchers. In short, I recommend it: skip it this year and science will move ahead and you won't catch up!


11th Russian Summer School in Information Retrieval (RuSSIR 2017)
August 21-25, 2017, Yekaterinburg, Russia

***Application deadline: June 25, 2017***

The 11th Russian Summer School in Information Retrieval (RuSSIR 2017)
will be held on August 21-25, 2017 in Yekaterinburg, Russia. The school
is co-organized by the Ural Federal University and
the Russian Information Retrieval Evaluation Seminar (ROMIP). RuSSIR 2017
will have a special focus on neural networks and their applications to
information retrieval.

The missions of the RuSSIR school series are to enable students to learn
about modern problems and methods in information retrieval and related
disciplines; to stimulate scientific research and collaboration in the
field; to create an environment for successful networking between
scientists, students and industry professionals.

RuSSIR 2017 will offer the following keynotes and courses:

* Ruslan Salakhutdinov (Carnegie Mellon University, USA) — «Foundations
of Deep Learning»
* Jaap Kamps (University of Amsterdam, the Netherlands) — «Foundations
of Information Retrieval and its Future»

* Ying-Hsang Liu (Charles Sturt University, Australia) — «Design and
Implementation of User Experiments in Information Retrieval»
* Giorgio Maria Di Nunzio (University of Padua, Italy) — «An Interactive
<<View>> of Probabilistic Models for Text Retrieval, Classification,
* Efstratios Gavves (University of Amsterdam, The Netherlands) — «Deep
Learning for Language and Vision»
* Stefan Rueger (The Open University, United Kingdom) — «Visual
retrieval and mining»
* Mikhail Burtsev and Valentin Malykh (MIPT, Russia) — «Conversational
Intelligence: Deep Learning Approach»
* Tom Kenter, Alexey Borisov, Christophe Van Gysel, Mostafa Dehghani,
Maarten de Rijke and Bhaskar Mitra (University of Amsterdam, the
Netherlands; Yandex, Russia; Microsoft Bing, United Kingdom) — «Neural
Networks for Information Retrieval»

AINL is published in Springer

AINL: Artificial Intelligence and Natural Language
St-Petersburg, Russia, 20-23 September 2017

The 6th conference on Artificial Intelligence and Natural Language invites everybody interested in intelligent technologies, from both academic institutes and innovative companies. The conference aims to bring together experts in the areas of text mining, speech technologies, dialogue systems, information retrieval, machine learning, artificial intelligence and robotics, and to create a platform for sharing experience, extending contacts and finding possible collaborations.

The AINL series has been organized since 2012 and has developed a set of distinctive features:
- a strong practical focus: industrial talks and product demonstrations are an essential part of the conference program;
- an interactive component: the conference program includes a number of workshops and panel discussions, as well as a poster session and other interactive formats;
- an encouraging attitude towards students and researchers at an early stage of their careers.
Altogether, this makes AINL a nice get-together opportunity.

• Natural Language Processing
• Information Retrieval
• Context Analysis, Text Mining
• Artificial Intelligence, Deep Learning, Machine Learning for NLP
• Social Media and Social Network Analysis
• Linked Data and Semantic Web
• Big Data and Data Mining
• Plagiarism Detection, Author Profiling and Authorship Detection
• Machine Translation, Crosslingual and Multilingual applications
• Speech Generation and Recognition, Spoken language processing
• Human-Computer Interfaces, Dialogue systems
• Robotics, Cyber-Physical Systems


15 June 2017 — Paper submission deadline
15 July 2017 — Notification of Acceptance
1 August 2017 — Final paper submission date
1 September 2017 — Submission deadline for industry talks, demo and posters
8 September 2017 — Notification for industry, demo and posters
20-23 September 2017 — Conference



We accept Full Papers (up to 12 pages) and Short Papers (up to 6 pages). Full Papers should describe original, complete, previously unpublished research; these papers will be presented during the conference as oral presentations. Short Papers describe research in progress or negative results. All submitted papers will be peer reviewed by at least 3 members of the program committee. This year we will use a double-blind review scheme.

At least one author of each accepted full or short paper must register for the conference and present the paper. Accepted and presented full papers and a selection of short papers will be published in the AINL proceedings in the Springer series Communications in Computer and Information Science.

Submit your paper via easychair:


We invite demos and posters without publication. In this section it is possible to present previously published work or work in progress. Demos and posters will be reviewed by the organizing committee. Note that if you need a visa to come to Russia, you must contact us much earlier than the poster submission deadline. To participate, submit a one-page description of your talk/demonstration via Google form:


We organize a number of industrial talks and interactive demonstrations. To participate, submit a one-page description of your talk/demonstration via Google form:
The submissions will be evaluated by the organizing committee.


Category: Conferences

IQLA-GIAT Summer School in Quantitative Analysis of Textual Data

Another incarnation of the Italian summer school on text analysis. Judging by the lineup of lecturers, it will be excellent. It should be especially interesting for those interested in authorship attribution, plagiarism detection, and the like.


University of Padua, 4-8 September 2017

3rd edition



In your next research project, are you planning to take into account a large number of novels, newspaper articles, transcriptions of open-ended interviews, or comments posted on social media?

Are there simply too many texts for any scholar to read in a lifetime?

Why not ask a computer to do this for you?

A software package cannot “close read” a text. However, by means of mathematical and statistical tools, it might be smart enough to “distant read” a text, i.e. to collect data, retrieve relevant information, summarize features, find patterns, etc. Instead of close-reading a limited number of texts, why not work with thousands of texts, loading them into the memory of a computer and asking a software package to produce analyses and results?


Digital methods have been utilised by a variety of disciplines and the growing availability of large corpora and large databases (the BIG DATA era) calls for new methods to deal with new problems, open the door to new questions and develop new knowledge.

“Quantitative analysis of textual data”, “Digital Methods” and “Distant Reading” are general terms that refer to a wide range of methods sharing a common aim: retrieving and summarising information from texts by means of computer-aided tools. Today, computer-aided text analysis is an umbrella term referring to a number of qualitative, quantitative and mixed-methods approaches. It is an object of research in many sectors of linguistics, computer sciences, mathematics and statistics and it is used as a research tool within a number of disciplines such as psychology, philosophy, sociology, sociolinguistics, education, history, political studies, literary studies, communication and media studies. The recent evolution of information technologies (IT) and quantitative methods has led to a number of distinct but interrelated sectors (e.g. computational linguistics, information retrieval, natural language processing, text mining, text analytics, sentiment analysis, opinion mining, topic extraction, etc.) with interesting industrial applications (e.g. electronic dictionaries, artificial intelligence, computer-aided translation, plagiarism detection, web reputation).

Recent studies have repeatedly stressed the need for developing, adopting and sharing interdisciplinary approaches and the IQLA-GIAT Summer School is the ideal environment for developing innovative analytical tools by pooling together the research methods of different disciplines.

The IQLA-GIAT Summer School is characterized by three main elements:

  1. a general part devoted to quantitative linguistics;
  2. a special issue addressing a relevant methodological problem (2017: topic detection and authorship attribution in Elena Ferrante’s case-study; 2015: measuring style and computational stylistics; 2013: measures and methods in authorship attribution);
  3. several lab-sessions dedicated to computer-aided analysis of textual data.

This year’s IQLA-GIAT Summer School will also include a Workshop on Elena Ferrante’s case-study.


Teaching activities will raise questions that can be answered by implementing quantitative methods and other procedures that may be used to identify and compare the characteristics of texts within a text analysis framework. The aim is to discuss with students, young researchers and scholars of different disciplines the strengths, weaknesses, opportunities and threats of quantitative methods for text analysis. The participants selected for the IQLA-GIAT Summer School will have the opportunity to use different tools within the same environment and the same tool in different environments.

The IQLA-GIAT Summer School aims at:

  1. sharing information on software, corpora, relevant literature and research results;
  2. promoting a dialogue among different disciplines on current research issues;
  3. developing innovative analytical tools and integrated research methods;
  4. introducing students and young researchers to new strands of research and applications;
  5. sharing state-of-the-art knowledge in:
    • Quantitative linguistics;
    • Digital methods for text analysis (topic detection, text classification);
    • Authorship Attribution methods and dedicated software packages;
    • Content mapping and data visualization;
    • Computer-aided analysis of textual data.


Category: Courses/Education/Postdocs

Three PhD positions at the University of Groningen

Five days remain before applications close for PhD positions at the University of Groningen, Campus Fryslân (detailed information is available via the link).
We offer three four-year scholarships to complete a PhD in Culture, Language & Technology. Applications are invited from prospective PhD students whose interests fit well with the research specialisms and expertise of our academic staff. The PhD students will be enrolled in our Graduate School Campus Fryslân (GSCF), and may collaborate with senior researchers at the Fryske Akademy and the Department of Frisian Language and Culture. Additionally, they can benefit from affiliations with University of Groningen research institutes such as the Center for Language and Cognition or the Research School of Behavioral and Cognitive Neurosciences, as appropriate to the PhD project.

Candidates whose research interests relate to linguistic/cultural situations in the North of the Netherlands, particularly Fryslân, are encouraged to apply. Topics of interest follow.

● Language & society: Investigations of language contact and/or variation in a multilingual society, consequences of multilingualism for language change and language learning, comparisons of language attitudes specifically relating to Frisian accent;
● Language & speech technologies: Studies of language and speech technologies that support or investigate a diversity of natural, multilingual interactions between people and the devices that surround them, with a keen eye on applications with real-world relevance, for example intoxication or pathology detection/recognition in multilingual discourse or multilingual prosody perception in the hearing-impaired;
● The future of multilingualism & minority languages: Research into consequences of globalization and migration for citizenship and expressions of linguistic and cultural identity in multilingual contexts, preferably also involving the North of the Netherlands.


The candidate should have the following qualifications:
● a Master’s degree in a relevant field (Linguistics, Anthropology, Cultural Studies, or similar) awarded by September 1st 2017 at the latest; applied research experience in the private sector is an asset
● an excellent record of undergraduate and graduate study
● strong motivation to complete a PhD dissertation in four years
● excellent command of spoken and written English; additionally, Dutch, Frisian and/or German skills (or the ability to learn these languages quickly) are an asset.


Category: Courses/Education/Postdocs

The First Workshop on Subword and Character LEvel Models in NLP (SCLeM)

To be held at EMNLP 2017 in Copenhagen on September 7, 2017



Submission deadline: June 2, 2017
Notification: June 30, 2017
Camera ready: July 14, 2017
Workshop: September 7, 2017


Kyunghyun Cho, NYU
Karen Livescu, TTIC
Tomas Mikolov, Facebook
Noah Smith, Univ of Washington


Neural weighted finite-state machines, Ryan Cotterell, JHU


Traditional NLP starts with a hand-engineered layer of representation,
the level of tokens or words.  A tokenization component first breaks
up the text into units using manually designed rules. Tokens are then
processed by components such as word segmentation, morphological
analysis and multiword recognition.  The heterogeneity of these
components makes it hard to create integrated models of both structure
within tokens (e.g., morphology) and structure across multiple tokens
(e.g., multi-word expressions). This approach can perform poorly (i)
for morphologically rich languages, (ii) for noisy text, (iii) for
languages in which the recognition of words is difficult and (iv) for
adaptation to new domains; and (v) it can impede the optimization of
preprocessing in end-to-end learning.

The workshop provides a forum for discussing recent advances as well
as future directions on sub-word and character-level natural language
processing and representation learning that address these problems.
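
As a rough illustration of the contrast drawn above, the following Python sketch compares a word-level view produced by a hand-written tokenization rule with a character n-gram view of the same strings. It is not taken from the workshop materials: the function names, the `<`/`>` boundary markers and the choice of trigrams are all assumptions made for this example. The point it tries to show is only that an inflected form unseen at the word level still shares most of its character n-grams with a known form.

```python
import re


def rule_based_tokens(text: str) -> list[str]:
    """Word-level view: split with a hand-written rule (runs of word characters)."""
    return re.findall(r"\w+", text.lower())


def char_ngrams(text: str, n: int = 3) -> list[str]:
    """Character-level view: every overlapping window of n characters."""
    padded = f"<{text.lower()}>"  # boundary markers are an illustrative convention
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def ngram_overlap(a: str, b: str, n: int = 3) -> float:
    """Jaccard overlap between the character n-gram sets of two strings."""
    set_a, set_b = set(char_ngrams(a, n)), set(char_ngrams(b, n))
    return len(set_a & set_b) / len(set_a | set_b)


seen = "tokenization"     # hypothetical word present in a vocabulary
unseen = "tokenizations"  # hypothetical inflected form missing from it

print(rule_based_tokens("Tokenization-free models for noisy text"))
print(seen == unseen)                 # word-level match fails
print(ngram_overlap(seen, unseen))    # character-level overlap stays high
```

A word-level vocabulary treats the two forms as unrelated symbols, while the n-gram sets differ only at the suffix; real subword models learn such representations rather than comparing raw sets.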


09:00-09:10  Welcome
09:10-09:50  Invited talk 1
09:50-10:30  Invited talk 2
10:30-11:00  Coffee break
11:00-11:40  Invited tutorial talk
11:40-12:10  Best paper presentations
12:10-14:00  Poster session & Lunch break
14:00-14:40  Invited talk 3
14:40-15:40  Poster session & Coffee break
15:40-16:20  Invited talk 4
16:20-17:30  Panel discussion
17:30-17:45  Closing remarks


— tokenization-free models
— character-level machine translation
— character-ngram information retrieval
— transfer learning for character-level models
— models of within-token and cross-token structure
— NL generation (of words not seen in training etc)
— out of vocabulary words
— morphology & segmentation
— relationship b/w morphology & character-level models
— stemming and lemmatization
— inflection generation
— orthographic productivity
— form-meaning representations
— true end-to-end learning
— spelling correction
— efficient and scalable character-level models

Category: Conferences

Summer Neurolinguistics School 2017: Brain Stimulation in Language Research and Therapy/Russia

Host Institution: National Research University Higher School of Economics
Coordinating Institution: National Research University Higher School of Economics

Dates: 22-Jun-2017 — 24-Jun-2017
Location: Moscow, Russia

Focus: The topic of the fourth annual Summer Neurolinguistics School is Brain Stimulation in Language Research and Therapy. The school will be devoted to brain stimulation methods and their applications in neurolinguistic research and speech-language therapy.
Minimum Education Level: No Minimum

The Neurolinguistics Laboratory at the National Research University Higher School of Economics (HSE) invites you to join us in Moscow for our fourth annual Summer Neurolinguistics School. This year, the topic is Brain Stimulation in Language Research and Therapy. The event will take place in Moscow, Russia, June 22–24, 2017. The purpose of the school is to serve both as an educational event for students entering the field and as an academic environment where researchers and clinicians can exchange ideas and discuss the latest achievements in the field. This year the school will be devoted to various brain stimulation methods and their applications in neurolinguistic research and in language therapy. The school program includes lectures by such world-renowned researchers as Nina Dronkers (University of California), Roelien Bastiaanse (University of Groningen), and Dirk-Bart den Ouden (University of South Carolina).

What to expect?

Category: Courses/Education/Postdocs