Deep Learning for NLP, advancements and trends in 2017

https://tryolabs.com/blog/2017/12/12/deep-learning-for-nlp-advancements-and-trends-in-2017/

Over the past few years, Deep Learning (DL) architectures and algorithms have made impressive advances in fields such as image recognition and speech processing.

Their application to Natural Language Processing (NLP) was less impressive at first, but has since made significant contributions, yielding state-of-the-art results for some common NLP tasks. Named entity recognition (NER), part-of-speech (POS) tagging and sentiment analysis are some of the problems where neural network models have outperformed traditional approaches. The progress in machine translation is perhaps the most remarkable of all.

In this article I will go through some advancements for NLP in 2017 that rely on DL techniques. I do not pretend to be exhaustive: it would simply be impossible given the vast amount of scientific papers, frameworks and tools available. I just want to share with you some of the works that I liked the most this year. I think 2017 has been a great year for our field. The use of DL in NLP keeps widening, yielding amazing results in some cases, and all signs point to the fact that this trend will not stop.

Category: Reviews/Editorial

Test sets are now available for the word sense disambiguation shared task

The Dialogue 2018 conference will host, for the first time, a shared task on word sense induction and disambiguation for Russian. Participants will be able to evaluate the quality of current word sense embedding models for Russian, as well as other word sense disambiguation methods. The shared task is supported by ACL SIGSLAV and ABBYY.

We are pleased to announce that the test sets are now available. Participants can upload their solutions through the CodaLab platform, which shows the evaluation score immediately. Submissions are open until January 15.

A detailed description of the task, the datasets and the baseline methods is available at http://russe.nlpub.org/2018/wsi. The participant instructions and the datasets are available on GitHub: https://nlpub.github.io/russe-wsi-kit/

Important dates:

Training set release: November 1, 2017
Test set release: December 15, 2017
Model submission deadline: January 15, 2018
Announcement of results: February 1, 2018

Questions about the shared task can be sent to rusemantics@googlegroups.com.

Category: Conferences

The 2nd Workshop on Subword and Character LEvel Models in NLP (SCLeM)

CALL FOR PAPERS
The 2nd Workshop on Subword and Character LEvel Models in NLP (SCLeM)
To be held at NAACL 2018 in New Orleans, Louisiana in June.
DATES
Deadline for paper submission: March 2, 2018
Notification of acceptance: April 2, 2018
Camera-ready submission due: April 16, 2018
Workshop: June 5 or 6, 2018
INVITED SPEAKERS
Jacob Eisenstein, Georgia Tech
Graham Neubig, CMU
Barbara Plank, University of Groningen
Brian Roark, Google
OVERVIEW
Traditional NLP starts with a hand-engineered layer of representation, the level of tokens or words. A tokenization component first breaks up the text into units using manually designed rules. Tokens are then processed by components such as word segmentation, morphological analysis and multiword recognition. The heterogeneity of these components makes it hard to create integrated models of both structure within tokens (e.g., morphology) and structure across multiple tokens (e.g., multi-word expressions). This approach can perform poorly (i) for morphologically rich languages, (ii) for noisy text, (iii) for languages in which the recognition of words is difficult and (iv) for adaptation to new domains; and (v) it can impede the optimization of preprocessing in end-to-end learning.
Following the success of the first edition of SCLeM at EMNLP 2017, with over 100 participants, this workshop provides a forum for discussing recent advances as well as future directions on sub-word and character-level natural language processing and representation learning that address these problems.
TOPICS OF INTEREST
— tokenization-free models
— character-level machine translation
— character-ngram information retrieval
— transfer learning for character-level models
— models of within-token and cross-token structure
— NL generation (of words not seen in training etc)
— out of vocabulary words
— morphology & segmentation
— relationship b/w morphology & character-level models
— stemming and lemmatization
— inflection generation
— orthographic productivity
— form-meaning representations
— true end-to-end learning
— spelling correction
— efficient and scalable character-level models

Category: Conferences

Call for papers: SlaviCorp 2018

to be held in Prague, Czech Republic on 24–26 September 2018

Conference website: http://slavicorp.ff.cuni.cz/

Conference topics
The conference aims to cover a wide range of topics related to research on Slavic languages:
— corpus research on any Slavic language, including also contrastive topics using parallel or comparable corpora;
— development of Slavic language resources: corpora, lexica, annotation, tools;
— applications of Slavic language resources for language technologies.

The official language of the conference is English.

Programme
The main conference will consist of plenary lectures and regular presentations of full papers. Full papers will be allowed 20 minutes for presentation and 10 minutes for discussion. Details about the plenaries will be published soon.

The conference will be accompanied by the workshop on language variability and multi-dimensional analysis in the form of a panel of experts involved in variability analyses, including a lecture delivered by Prof. Douglas Biber. The workshop will be free and open to all conference participants.

Venue
The conference will be held at the Faculty of Law, Charles University, Prague. The venue is in the city centre, overlooking the Vltava River, within easy reach by public transport.

Submissions
We invite submissions of extended abstracts that reflect any of the conference topics. The extended abstracts should be between 500 and 800 words long (excluding references) and they are to be submitted on-line through EasyChair: https://easychair.org/conferences/?conf=slavicorp2018

The deadline for submission of the abstracts is 18 February 2018. The abstracts will be anonymously peer-reviewed by the Programme Committee; notification of acceptance will be sent out by the end of April 2018.

Contact
The conference is organized by the Institute of the Czech National Corpus, Faculty of Arts, Charles University, Prague.

All relevant information about the conference is available on http://slavicorp.ff.cuni.cz/ where all updates will also be posted.

Category: Conferences

NAACL workshop list

NAACL 2018 has a great lineup of workshops covering a wide range of topics. All workshops will be held June 5 & 6 unless otherwise noted. Details on deadlines and submissions can be found at the individual workshop websites.

  • BEA18
    The 13th Workshop on Innovative Use of NLP for Building Educational Applications, The BEA Workshop is a leading contributor to societal need through NLP innovation for educational technology.
  • CLPsych18
    The Fifth Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality, This workshop focuses on language technology applications in mental and neurological health.
  • Cognitum18
    Fourth Workshop on Cognitive Knowledge Acquisition and Applications, Acquire knowledge from text, reason in novel situations, and offer explanations in a Banking Question Answering and Reasoning shared task.
  • CRAC18
    Computational models of Reference, Anaphora and Coreference (CRAC), This workshop will be a forum for all types of computational work on the interpretation and annotation of anaphora, coreference and reference.
  • Ethics-NLP18
    Second ACL Workshop on Ethics in Natural Language Processing, The second workshop on ethics in NLP: questions, issues, applications.
  • Fig-Lang18
    Workshop on Figurative Language Processing, includes a shared task and invited talks on generation of figurative language.
  • Gen-Deep18
    Generalization in the Age of Deep Learning, Inability to generalize means simple examples cause ML systems to fail spectacularly. We propose a venue to build and test generalization.
  • PEOPLES18
    Second workshop on computational modeling of PEople’s Opinions, PersonaLity and Emotions in Social media: Emotions, opinions, personality, demographics in social media: How to equip NLP models to capture their interaction? With what implications?
  • SCLeM18
    Subword and Character LEvel Models in NLP (SCLeM), Character/subword level models are end2end & address noise, sparsity and linguistic diversity.
  • SemBEaR18
    Computational Semantics Beyond Events and Roles, Intricate semantic phenomena such as negation, modality, factuality, attribution, irony and sarcasm.
  • SemEval-2018
    The 12th International Workshop on Semantic Evaluation, SemEval 2018 invites participation in a broad range of semantic analysis tasks.
  • SpLU18
    Spatial Language Understanding, SpLU covers spatial language meaning representation, learning, semantic extraction, mapping to qualitative reasoning models, multi-modality.
  • *SEM 2018
    *SEM brings together researchers interested in the semantics of natural languages and its computational modeling. The conference embraces symbolic and probabilistic approaches, and everything in between; theoretical contributions as well as practical applications are welcome in the form of long and short papers.
  • Story-NLP18
    Workshop on Storytelling, Work on teaching computers to understand & generate human-like stories! One step in creating intelligence that’s complementary to humans.
  • Style-Var18
    2nd Workshop on Stylistic Variation, Enabling a discussion of shared issues across the many instantiations of stylistic difference across research fields.
  • TextGraphs
    Workshop on Graph-Based Methods for Natural Language Processing, The TextGraphs is a workshop series promoting the synergies between the field of Graph Theory and Natural Language Processing.
  • WiNLP18
    Widening NLP (WiNLP) 2018, We are here to increase awareness of the work URGs do, support URGs in pursuing their research & motivate long-term resources within ACL (Note: will be held June 1).

Category: Conferences

Job opening: Applied linguist (computational linguist/analyst), Moscow

A consulting company invites professional linguists (final-year students, graduates and postgraduates in theoretical and applied linguistics or computational linguistics) to take part in a project to build and improve an automatic text processing system.
Our company is building an engine for processing and analyzing natural language in the form of informational texts (financial markets, economics, politics).
We expect that you:

  • have deep linguistic knowledge, can think logically and express your thoughts clearly;
  • are at ease with computers and, possibly, can program (Python);
  • have excellent English (and, perhaps, a good number of other languages);
  • enjoy solving linguistic puzzles and finding original solutions;
  • have creative potential in automatic text processing and want to realize it;
  • are tired of theory and want practice.

You will have the opportunity to work with us in a convenient office in the centre of Moscow, at a comfortable workplace, and to receive a bonus for your knowledge, achievements and creative potential of 40,000 rubles and up.

Contact details:
+7 925 1513723
dev.heuristics@gmail.com
Vyacheslav

Evristika LLC (ООО «Эвристика»)

Category: Jobs/Internships

Facebook AI Research Residency Program

The Facebook AI Research (FAIR) Residency Program is a one-year research training program with Facebook’s AI Research group, designed to give you hands-on experience of machine learning research. The program will pair you with a senior researcher or engineer in FAIR, who will act as your mentor. Together, you will pick a research problem of mutual interest and then devise new deep learning techniques to solve it. We also encourage collaborations beyond the assigned mentor. The research will be communicated to the academic community by submitting papers to top academic venues (NIPS, ICML, ICLR, CVPR, ICCV, ACL, EMNLP etc.), as well as open-source code releases. Visit the FAIR research page for examples of research performed in FAIR.

The AI research residency experience is designed to prepare you for graduate programs in machine learning, or to kickstart a research career in the field. This is a full-time program that cannot be undertaken in conjunction with university study or a full-time job.

We encourage applications from people who have a strong technical background and are passionate about AI research. Prior experience in machine learning is certainly a strength but we seek people from a diverse range of backgrounds, including areas ostensibly unrelated to machine learning such as (but not limited to) math, physics, finance, economics, linguistics, computational social science, and bioinformatics.

Accepted research residents will be based in Facebook Menlo Park and New York locations. If a candidate requires a visa to work in the US, we will explore what options are available once they are accepted onto the program.

Resident Responsibilities

  • Learn how to perform research in deep learning and AI.
  • Understand prior work and existing literature.
  • Work with research mentor(s) to identify problem(s) of interest and develop novel AI techniques.
  • Translate ideas into practical code (in frameworks such as PyTorch or Caffe2).
  • Write up research results in the form of an academic paper and submit to a top conference in the relevant area.

Eligibility Requirements

  • Bachelor's degree in a STEM field such as Mathematics, Statistics, Physics, Electrical Engineering, Computer Science, or equivalent practical experience.
  • Completed coursework in: Linear Algebra, Probability, Calculus, or equivalent.
  • Coding experience in a general-purpose programming language, such as Python or C/C++.
  • Familiarity with a deep learning platform such as PyTorch, Caffe, Theano, or TensorFlow.
  • Ability to communicate complex research in a clear, precise, and actionable manner.

Preferred Qualifications

  • Research experience in machine learning or AI (as established for instance via publications and/or code releases).
  • Significant contributions to open-source projects, demonstrating strong math, engineering, statistics, or machine learning skills.
  • A strong track record of scholastic excellence.

Application Requirements

To apply to the 2018 Facebook AI Residency Program, you will need to complete the application and submit the following items:

  1. CV/Resume (including links to GitHub, professional webpages, publications, or blogposts as applicable)
  2. Personal statement
  3. Academic grade transcript

 

Important Dates

  • Deadline for applications: January 26, 2018
  • Notification of interview: February 16, 2018
  • Notification of admission: March 5, 2018
  • Deadline for offer acceptance: April 20, 2018
  • Residency Program start: August 2018
  • Residency Program end: August 2019

Category: Jobs/Internships

SemEval-2018 Shared Tasks

This is the third call to participate in the shared tasks of SemEval-2018. You can access the detailed task descriptions by following the links provided on the SemEval-2018 webpage: http://alt.qcri.org/semeval2018/index.php?id=tasks

TASKS:

Affect and Creative Language in Tweets
  • Task 1: Affect in Tweets
  • Task 2: Multilingual Emoji Prediction
  • Task 3: Irony Detection in English Tweets

Coreference
  • Task 4: Character Identification on Multiparty Dialogues
  • Task 5: Counting Events and Participants within Highly Ambiguous Data covering a very long tail

Information Extraction
  • Task 6: Parsing Time Normalizations
  • Task 7: Semantic Relation Extraction and Classification in Scientific Papers
  • Task 8: Semantic Extraction from CybersecUrity REports using Natural Language Processing (SecureNLP)

Lexical Semantics
  • Task 9: Hypernym Discovery
  • Task 10: Capturing Discriminative Attributes

Reading Comprehension and Reasoning
  • Task 11: Machine Comprehension using Commonsense Knowledge
  • Task 12: Argument Reasoning Comprehension Task

Trial data, training data, and evaluation scripts are now ready and available on the CodaLab website for each task. You can download them through the direct links, when provided, or after registering to participate in the task ("Participate" tab). The task evaluations will take place in January 2018. For more details please consult the SemEval-2018 website and the individual task pages.

IMPORTANT DATES
Mon 08 Jan 2018: Evaluation start
Mon 29 Jan 2018: Evaluation end
Mon 05 Feb 2018: Results posted
Mon 26 Feb 2018: System description paper submissions due
Mon 05 Mar 2018: Task description paper submissions due
Mon 19 Mar 2018: Paper reviews due (for both systems and tasks)
Mon 02 Apr 2018: Author notifications
Mon 16 Apr 2018: Camera ready submissions due

Category: Conferences, Resources/Software

Available PhD and Post-doc positions in Text and Data Mining at the Knowledge Media Institute, United Kingdom

The Knowledge Media Institute (KMi) is a leading research centre associated with the Open University. Research in KMi focuses on web & data science, natural language processing, information retrieval, machine learning and their applications to solve real-world problems. We are currently offering fully-funded PhD studentships as well as post-doc positions in the area of Text and Data Mining of Scientific Literature. Applications are invited from UK, EU and international students. KMi is located in Milton Keynes, 30 minutes by train from London.

 

Category: Jobs/Internships, Courses/Education/Postdocs
