shared task in information extraction

Text Analysis Conference
Knowledge Base Population 2014

Evaluation: February-November, 2014
Workshop: November 17-18, 2014
Conducted by:
U.S. National Institute of Standards and Technology (NIST)

With support from:
U.S. Department of Defense


The Text Analysis Conference (TAC) is a series of evaluations and
workshops organized to promote research in Natural Language Processing
and related applications, by providing a large test collection, common
evaluation procedures, and a forum for organizations to share their
results.

The goal of TAC Knowledge Base Population (KBP) is to develop and
evaluate technologies for building and populating knowledge bases
(KBs) from unstructured text. KBP systems must ultimately build a KB
from scratch, but must also be able to populate an existing reference
KB that has incomplete or unknown provenance.

You are invited to participate in TAC KBP 2014. Organizations may
choose to participate in any or all of the TAC KBP 2014 tracks. NIST
provides test data for each KBP task, and participants run their NLP
systems on the data and return their results to NIST for evaluation.
TAC KBP culminates in a November workshop at NIST in Gaithersburg,
Maryland, USA.

All results submitted to NIST are archived on the TAC web site, and
all evaluations of submitted results are included in the workshop
proceedings. Dissemination of TAC work and results other than in the
workshop proceedings is welcomed, but the conditions of participation
specifically preclude any advertising claims based on TAC results.


1)  Cold Start KBP
The Cold Start track builds a knowledge base from scratch.

2)  Entity Linking
The entity linking task is to discover and link names in a document
collection to entities in a reference KB, or to new named entities
discovered in the document collection.

3)  Slot Filling
The slot filling task is to search a document collection to fill in
values for predefined slots (attributes) for a given entity in a
reference KB.

4)  Slot Filler Validation
The Slot Filler Validation track focuses on the refinement of output
from slot filling systems by either combining information from
multiple slot filling systems, or applying more intensive linguistic
processing to validate individual candidate slot fillers.

5)  Sentiment
The goal of the Sentiment track is to assess the quality of detectors
for scoped and attributed sentiment.

6)  Event
The goal of the Event track is to extract information about events
such that the information would be suitable as input to a knowledge
base.


1) Event track for identifying events from a predefined ontology and
extracting their arguments from text
2) English entity DISCOVERY and linking task
3) Cross-lingual Spanish and Chinese entity linking over discussion forums
4) Multi-document provenance and inference for slot filling and Cold Start KBP
5) Cold Start task variant providing evaluation queries in advance
(similar to slot filling)


Organizations wishing to participate in any of the TAC KBP 2014 tracks
are invited to register online by June 15, 2014. Participants are
advised to register and submit all required agreement forms as soon as
possible in order to receive timely access to evaluation resources,
including any sample and training data. Registration for a track does
not commit you to participating in it, but it helps the organizers
plan. Late registration will be permitted only if resources
allow. Any questions about conference participation may be sent to the
TAC project manager:

Track registration:


The TAC 2014 workshop will be held November 17-18, 2014, in
Gaithersburg, Maryland, USA. The workshop is a forum both for
presentation of results (including failure analyses and system
comparisons), and for more lengthy system presentations describing
techniques used, experiments run on the data, and other issues of
interest to NLP researchers. KBP track participants who wish to give a
presentation during the workshop will submit a short abstract in
September describing the experiments they performed. As there is a
limited amount of time for oral presentations, the abstracts will be
used to determine which participants are asked to speak and which will
present in a poster session.


March               Initial track guidelines posted
April               Distribution of document collections
June 15             Deadline for registration for track participation
July-September      Track evaluation windows (varies by track)
September 30        Deadline for short system descriptions
September 30        Deadline for workshop presentation proposals
By October          Release of individual evaluated results to participants
                    (varies by track)
mid October         Notification of acceptance of presentation proposals
November 1          Deadline for system reports (workshop notebook version)
November 17-18      TAC 2014 workshop in Gaithersburg, Maryland, USA
February 15, 2015   Deadline for system reports (final proceedings version)


Claire Cardie (Cornell University)
Hoa Trang Dang (U.S. National Institute of Standards and Technology)
Jason Duncan (U.S. Department of Defense)
Joe Ellis (Linguistic Data Consortium)
Marjorie Freedman (BBN Technologies)
Kira Griffitt (Linguistic Data Consortium)
Ralph Grishman (New York University)
Yasaman Haghpanah (U.S. National Institute of Standards and Technology)
Heng Ji (Rensselaer Polytechnic Institute)
James Mayfield (Johns Hopkins University)
Boyan Onyshkevych (U.S. Department of Defense)
Stephanie Strassel (Linguistic Data Consortium)
Mihai Surdeanu (University of Arizona)

