Full Title: Cognitive Aspects of the Lexicon
Short Title: CogALex
Date: 23-Aug-2014 — 23-Aug-2014
Location: Dublin, Ireland
Contact Person: Michael Zock
Web Site: http://pageperso.lif.univ-mrs.fr/~michael.zock/cogalex-webpage/index.html
Linguistic Field(s): Cognitive Science; Computational Linguistics; Lexicography; Psycholinguistics; Semantics; Text/Corpus Linguistics
Call Deadline: 25-May-2014
The aim of the workshop is to bring together researchers involved in the construction and application of electronic dictionaries to discuss modifications of existing resources in line with the users’ needs, thereby fully exploiting the advantages of the digital form. Given the breadth of the questions, we welcome reports on work from many perspectives, including but not limited to: computational lexicography, psycholinguistics, cognitive psychology, language learning and ergonomics.
The way we look at dictionaries (their creation and use) has changed dramatically over the past 30 years. Once considered an appendix to grammar, they have now moved to centre stage. Indeed, there is hardly any task in NLP which can be conducted without them. Also, rather than being static entities (database view), dictionaries are now viewed as dynamic networks, i.e. graphs, whose nodes and links (connection strengths) may change over time. Interestingly, properties concerning topology, clustering and evolution known from other disciplines (society, economy, human brain) also apply to dictionaries: everything is linked, hence accessible, and everything is evolving. Given these similarities, one may wonder what we can learn from these disciplines.
In this 4th edition of the CogALex workshop we therefore also invite scientists working in these fields, with the goal of broadening the picture, i.e. to gain a better understanding of the mental lexicon and to integrate these findings into our dictionaries in order to support navigation. Given recent advances in the neurosciences, it appears timely to seek inspiration from neuroscientists studying the human brain. There is also a lot to be learned from other fields studying graphs and networks, even if their object of study is something other than language, for example biology, economy or society.
Topics of Interest:
This workshop is about possible enhancements of lexical resources and electronic dictionaries. To lay the groundwork for the next generation of such resources we invite researchers involved in the building of such tools. The idea is to discuss modifications of existing resources by taking the users’ needs and knowledge states into account, and to capitalize on the advantages of the digital medium.
Call for Papers:
For this workshop we solicit papers including but not limited to the following topics, each of which can be considered from various points of view: linguistics, neuro- or psycholinguistics (tip-of-the-tongue problem, associations), network-related sciences (sociology, economy, biology), mathematics (vector-based approaches, graph theory, small-world problem), etc.
— Analysis of the conceptual input of a dictionary user
— The meaning of words
— Structure of the lexicon
— Methods for crafting dictionaries or indexes
— Dictionary access (navigation and search strategies, interface issues,…)
— Dictionary applications
We invite participation in a shared task devoted to the problem of lexical access in language production, with the aim of providing a quantitative comparison between different systems.
The quality of a dictionary depends not only on coverage, but also on the accessibility of the information. That is, a crucial point is dictionary access. Access strategies vary with the task and the knowledge available at the very moment of consultation. Unlike readers who look for meanings, writers start from them, searching for the corresponding words. While paper dictionaries are static, permitting only limited strategies for accessing information, their electronic counterparts promise dynamic, proactive search via multiple criteria and via diverse access routes. Navigation takes place in a huge conceptual lexical space, and the results are displayable in a multitude of forms (trees, lists, graphs, or sorted alphabetically, by topic, by frequency).
To bring some structure into this multitude of possibilities, the shared task will concentrate on a crucial subtask, namely multiword association. Suppose we were looking for a word expressing the following ideas: ‘superior dark coffee made of beans from Arabia’, but could not remember the intended word ‘mocha’. Since people always remember something concerning the elusive word, it would be nice to have a system that accepts this kind of input and then proposes a number of candidates for the target word. Given the above example, we might enter dark, coffee, beans, and Arabia, and the system would be expected to come up with a list of associated words such as mocha, espresso, or cappuccino.
The participants will receive lists of five given words such as ‘circus’, ‘funny’, ‘nose’, ‘fool’, and ‘fun’ and are expected to compute the word which is most closely associated with all of them. In this case, the word ‘clown’ would be the expected answer.
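To make the task concrete, here is a minimal sketch of one possible baseline, not a prescribed method. It assumes a table of word association strengths is available. The table TOY_ASSOCIATIONS, its values, and the function names rank_candidates and best_association are invented for illustration. Each candidate is scored by summing its association strength over the five cue words, and the cue words themselves are excluded as answers (an assumption of this sketch).

```python
from collections import defaultdict

# Invented association strengths (cue -> {associated word: strength}).
# These numbers are illustrative only and do not come from any real resource.
TOY_ASSOCIATIONS = {
    "circus": {"clown": 0.4, "tent": 0.3, "elephant": 0.2},
    "funny": {"joke": 0.4, "clown": 0.3, "laugh": 0.3},
    "nose": {"face": 0.5, "smell": 0.3, "clown": 0.1},
    "fool": {"idiot": 0.4, "clown": 0.3, "joke": 0.2},
    "fun": {"game": 0.4, "joke": 0.2, "clown": 0.2},
}


def rank_candidates(cues, associations):
    """Rank candidate targets by summed association strength over all cues."""
    scores = defaultdict(float)
    for cue in cues:
        # Cues missing from the table simply contribute nothing.
        for candidate, strength in associations.get(cue, {}).items():
            if candidate not in cues:  # assumption: a cue is never its own answer
                scores[candidate] += strength
    # Highest total first; ties are broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def best_association(cues, associations):
    """Return the top-ranked candidate, or None if no candidate was found."""
    ranked = rank_candidates(cues, associations)
    return ranked[0][0] if ranked else None


if __name__ == "__main__":
    cues = ["circus", "funny", "nose", "fool", "fun"]
    print(best_association(cues, TOY_ASSOCIATIONS))
```

Summation rewards words linked to many of the cues. Other scoring schemes (for example products of strengths, or vector-based similarity) are equally conceivable, and participants are free to choose their own.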
We will provide a training set of 2000 sets of five input words (multiword stimuli), together with the expected target words. The participants will have several weeks to train their systems on this data. After the training phase, we will release a test set containing another 2000 sets of five input words, but without providing the expected target words.
Participants will have five days to run their systems on the test data, thereby predicting the target words. For each system, we will compare the predictions to the expected target words and compute its accuracy. The participants will be invited to submit a paper describing their approach and the results.
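As a rough illustration of the evaluation, the following sketch computes accuracy as the fraction of stimuli for which the predicted word equals the expected target word. The exact matching criterion is not specified here, so the case-insensitive exact match used below is an assumption, as is the function name accuracy.

```python
def accuracy(predicted, expected):
    """Fraction of items where the predicted word matches the expected target.

    Assumption: matching is exact after trimming whitespace and lowercasing.
    """
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        return 0.0
    hits = sum(
        p.strip().lower() == e.strip().lower()
        for p, e in zip(predicted, expected)
    )
    return hits / len(expected)


if __name__ == "__main__":
    # Invented predictions for three stimuli; two of them match the targets.
    system_output = ["clown", "latte", "Mocha"]
    gold_targets = ["clown", "mocha", "mocha"]
    print(f"accuracy = {accuracy(system_output, gold_targets):.3f}")
```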
Schedule for Shared Task:
Deadline for Paper Submission: May 25, 2014
Reviewers’ feedback: June 15, 2014
Camera-Ready Version: July 7, 2014