"#input", "Format of the key"=> "#key", "Format of the output" => "#out"); $navLinks = array("Home" => $rootPath, "Events" => $rootPath . "/events/index.php", "Anaphora Resolution Evaluation" => "/events/ARE/index.php", "Data for task 2" => ""); generateTopDocument("Data for ARE: Task 2"); generateMenu($sideLinks, $navLinks, 0); ?>
Data for task 2

The training data for task 2 can be downloaded from here.

For each text there are two files. The first ends in -input.xml and is the input text for your program; in the testing stage of ARE the input files will be in this format. The second ends in -key.xml and contains the gold standard, which can be used to measure the accuracy of your programs.
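
The naming convention above makes it easy to match each input file with its gold standard. The following is a minimal sketch; the function name pair_files and the example file names are illustrative assumptions and are not part of the ARE distribution.

```python
INPUT_SUFFIX = "-input.xml"
KEY_SUFFIX = "-key.xml"


def pair_files(filenames):
    """Return (input, key) name pairs for every text that has both files."""
    names = set(filenames)
    pairs = []
    for name in sorted(names):
        if name.endswith(INPUT_SUFFIX):
            # Swap the suffix to get the expected gold-standard file name.
            key_name = name[: -len(INPUT_SUFFIX)] + KEY_SUFFIX
            if key_name in names:
                pairs.append((name, key_name))
    return pairs


# Hypothetical file names, used only for illustration.
print(pair_files(["doc1-input.xml", "doc1-key.xml", "doc2-input.xml"]))
```

An input file without a matching key file, as in the testing stage, is simply left out of the result.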

Format of the input

The text in which the coreferential chains need to be identified is contained in the <text> tag. All the entities which can be part of a coreferential chain are already marked in the input files using the <entity> tag, and each <entity> tag carries a unique ID in its id attribute. Entities can be nested inside one another. This is a snippet from an input text:

        <p>
          At
          <entity id="18">
            an emergency summit in
            <entity id="19">Toronto</entity>
          </entity>
          ,
          <entity id="20">
            the leaders of
            <entity id="21">both nations</entity>
          </entity>
          agreed to push for
          <entity id="22">
            direct talks with
            <entity id="23">the rebels</entity>
          </entity>
          , even_though
          <entity id="24">they</entity>
          ruled out
          <entity id="229">
            <entity id="26">the guerrillas '</entity>
            non-negotiable demand $--
            <entity id="27">
              freedom for
              <entity id="28">
                <entity id="29">their</entity>
                jailed comrades
              </entity>
            </entity>
          </entity>
          .
        </p>
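
One possible way to read the marked entities is sketched below with Python's standard xml.etree.ElementTree module. It assumes the input file is well-formed XML; the function name entity_texts and the shortened sample string are illustrative, and the real files may contain elements that are not shown in the snippet above.

```python
import xml.etree.ElementTree as ET


def entity_texts(xml_source):
    """Map each entity id to its whitespace-normalised text."""
    root = ET.fromstring(xml_source)
    entities = {}
    for entity in root.iter("entity"):
        # itertext() also walks nested entities, so the text of an outer
        # entity includes the words of the entities embedded in it.
        words = " ".join(entity.itertext()).split()
        entities[entity.get("id")] = " ".join(words)
    return entities


# Shortened sample based on the snippet above.
sample = """<text><p>At
  <entity id="18">an emergency summit in
    <entity id="19">Toronto</entity>
  </entity> ,</p></text>"""

print(entity_texts(sample))
```

Because entities are nested, the outer entity 18 is read as "an emergency summit in Toronto", which matches the way it appears in the key file.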
      
Format of the key

The key file contains <chain> elements, each of which lists all the entities that belong to one chain. An example of a chain is:

      <chain id="c1">
        <node id="18" value=" an emergency summit in Toronto"/>
        <node id="34" value=" the two-hour closed meeting , held on the 46th day of the hostage crisis"/>
        <node id="67" value=" the summit"/>
        <node id="108" value=" the summit ' s"/>
        <node id="173" value=" the summit"/>
      </chain>
    

The members of a chain are identified using the <node> tag. Its id attribute gives the id of the entity which is part of the chain, and its optional value attribute contains the text of that entity. The value attribute is there only to improve the legibility of the XML and is not used in the evaluation process.
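
As a rough illustration, the chains can be loaded into a dictionary that keeps only the entity ids, since the value attribute plays no role in the evaluation. The wrapping <chains> element in the sample and the function name read_chains are assumptions made for this sketch; the actual root element of the key files is not shown here.

```python
import xml.etree.ElementTree as ET


def read_chains(xml_source):
    """Return {chain id: [entity ids]} from key-format XML."""
    root = ET.fromstring(xml_source)
    chains = {}
    for chain in root.iter("chain"):
        # Only the id attribute is read; value is informational and optional.
        chains[chain.get("id")] = [node.get("id") for node in chain.iter("node")]
    return chains


# The <chains> wrapper is hypothetical; the nodes follow the example above.
key = """<chains>
  <chain id="c1">
    <node id="18" value=" an emergency summit in Toronto"/>
    <node id="67" value=" the summit"/>
    <node id="173"/>
  </chain>
</chains>"""

print(read_chains(key))
```

The last node has no value attribute, which shows that the attribute can be left out without affecting the result.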

Format of the output

The format of the output is the same as the format of the key file.
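
A minimal sketch of how a program might write its chains in this format is given below. The root element name chains and the function name chains_to_xml are assumptions made for illustration; the optional value attribute is omitted because it is not used in the evaluation.

```python
import xml.etree.ElementTree as ET


def chains_to_xml(chains, root_tag="chains"):
    """Serialise {chain id: [entity ids]} as <chain>/<node> elements."""
    root = ET.Element(root_tag)  # the root tag name is an assumption
    for chain_id, entity_ids in chains.items():
        chain = ET.SubElement(root, "chain", id=chain_id)
        for entity_id in entity_ids:
            # One <node> per entity; only the id attribute is written.
            ET.SubElement(chain, "node", id=entity_id)
    return ET.tostring(root, encoding="unicode")


print(chains_to_xml({"c1": ["18", "67"]}))
```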