The First Workshop on Subword and Character LEvel Models in NLP (SCLeM)

To be held at EMNLP 2017 in Copenhagen on September 7, 2017

Website: https://sites.google.com/view/sclem2017

DATES

Submission deadline: June 2, 2017
Notification: June 30, 2017
Camera ready: July 14, 2017
Workshop: September 7, 2017

INVITED SPEAKERS

Kyunghyun Cho, NYU
Karen Livescu, TTIC
Tomas Mikolov, Facebook
Noah Smith, Univ of Washington

INVITED TUTORIAL TALK

Neural weighted finite-state machines, Ryan Cotterell, JHU

OVERVIEW

Traditional NLP starts with a hand-engineered layer of representation,
the level of tokens or words.  A tokenization component first breaks
up the text into units using manually designed rules. Tokens are then
processed by components such as word segmentation, morphological
analysis and multiword recognition.  The heterogeneity of these
components makes it hard to create integrated models of both structure
within tokens (e.g., morphology) and structure across multiple tokens
(e.g., multi-word expressions). This approach can perform poorly (i)
for morphologically rich languages, (ii) for noisy text, (iii) for
languages in which the recognition of words is difficult and (iv) for
adaptation to new domains; and (v) it can impede the optimization of
preprocessing in end-to-end learning.

The workshop provides a forum for discussing recent advances as well
as future directions on sub-word and character-level natural language
processing and representation learning that address these problems.
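As a toy illustration of one of the problems above (not part of the CFP), the sketch below shows why character-level features help with out-of-vocabulary words: a word absent from a word-level vocabulary still shares character n-grams with related known words, so their representations overlap. The function names are hypothetical, chosen only for this example.

```python
# Minimal sketch: character n-gram overlap between a known word and an
# out-of-vocabulary (OOV) word. A word-level model assigns the OOV word
# no representation at all; a character-level one still finds overlap.

def char_ngrams(word, n=3):
    """Return the set of character n-grams of a word, with boundary markers."""
    padded = "<" + word + ">"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}

def jaccard(a, b):
    """Jaccard overlap between two n-gram sets."""
    return len(a & b) / len(a | b)

# "unhappiness" may be missing from a word-level vocabulary, yet its
# character trigrams overlap heavily with those of "happiness".
known = char_ngrams("happiness")
oov = char_ngrams("unhappiness")
print(jaccard(known, oov))  # roughly 0.67
```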

TENTATIVE PROGRAM

09:00-09:10  Welcome
09:10-09:50  Invited talk 1
09:50-10:30  Invited talk 2
10:30-11:00  Coffee break
11:00-11:40  Invited tutorial talk
11:40-12:10  Best paper presentations
12:10-14:00  Poster session & Lunch break
14:00-14:40  Invited talk 3
14:40-15:40  Poster session & Coffee break
15:40-16:20  Invited talk 4
16:20-17:30  Panel discussion
17:30-17:45  Closing remarks

TOPICS OF INTEREST

— tokenization-free models
— character-level machine translation
— character n-gram information retrieval
— transfer learning for character-level models
— models of within-token and cross-token structure
— natural language generation (e.g., of words not seen in training)
— out-of-vocabulary words
— morphology & segmentation
— relationship between morphology and character-level models
— stemming and lemmatization
— inflection generation
— orthographic productivity
— form-meaning representations
— true end-to-end learning
— spelling correction
— efficient and scalable character-level models

SUBMISSIONS OF LONG AND SHORT PAPERS AND EXTENDED ABSTRACTS

Please submit your paper using START: https://www.softconf.com/emnlp2017/sclem/

Submissions must be in PDF format, anonymized for review, written in
English and follow the EMNLP 2017 formatting requirements (available
at http://emnlp2017.net/).

We strongly advise you to use the LaTeX template files provided by EMNLP 2017.

Long paper submissions consist of up to eight pages of content. Short
paper submissions consist of up to four pages of content. There is no
limit on the number of pages for references. There is no extra space
for appendices. Accepted papers will be given one additional page for
content.

Authors can also submit extended abstracts of up to eight pages of
content.  Add «EXTENDED ABSTRACT» to the title of an extended
abstract submission. Extended abstracts will be presented as talks or
posters if selected by the program committee, but not included in the
proceedings.  Thus, your work retains its unpublished status, and
later submission to another venue (e.g., a journal) is not precluded.

ORGANIZING COMMITTEE

Manaal Faruqui, Google
Hinrich Schuetze, LMU Munich
Isabel Trancoso, INESC-ID/IST
Yadollah Yaghoobzadeh, LMU Munich

About the author: Lidia Pivovarova

St. Petersburg State University — senior lecturer,
University of Helsinki — PhD student

http://philarts.spbu.ru/structure/sub-faculties/itah_phil/teachers/pivovarova
