UCA

The UC-A is an experimental corpus used to prepare the initial versions of the grammar for sentence-based UNLization and NLization, using IAN and EUGENE, respectively. It comprises two subcorpora: UC-A1 and UC-A2.

The corpus

  • In one single file (400 sentences):
    • UC-A in English, to be (manually) translated to your target language in order to be used as the input for the UNLization process (with IAN)
    • UC-A in UNL, to be used, "as is" (i.e., without any change), as the input for the NLization process (with EUGENE)
  • According to the general distribution:

Goals

  1. To provide the dictionary and grammars necessary to UNLize your translated version of UC-A (with IAN)
  2. To provide the dictionary and grammars necessary to NLize, to your target language, the UNL version of UC-A (with EUGENE)

Methodology

  1. Prepare the dictionary and grammars to deal with UC-A1 (follow the instructions available at UC-A1)
  2. Prepare the dictionary and grammars to deal with UC-A2 (follow the instructions available at UC-A2)
  3. Merge the corresponding resources and make the necessary changes

Assessment

The actual outputs must be evaluated against the expected outputs using the F-Measure, which can be calculated automatically at UNLWEB>UNLARIUM>GRAMMAR>[LOCALE]>F-MEASURE.

  • UNLization
    • Actual output: the output provided by IAN, in your language, with the resources that you have provided, for the translated version of UC-A
    • Expected output: UC-A in UNL
  • NLization
    • Actual output: the output provided by EUGENE, in your language, with the resources that you have provided, for the input file UC-A in UNL
    • Expected output: the human-translated version of UC-A used as the input for the UNLization
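
For a rough local check before submitting resources, the comparison above can be sketched as follows. This is only an illustration, not the UNLWEB tool's actual method: it assumes the F-Measure is the harmonic mean of precision and recall, that the two files are aligned sentence by sentence, that an empty actual output means "nothing returned", and that an output counts as correct only when it matches the expected output exactly. The function name `f_measure` is hypothetical.

```python
def f_measure(actual, expected):
    """Sentence-level F-measure under an exact-match assumption.

    actual   -- list of outputs produced by IAN or EUGENE, one per sentence
                (an empty string means no output was returned)
    expected -- list of expected outputs, aligned with `actual`
    """
    if len(actual) != len(expected):
        raise ValueError("actual and expected must have one entry per sentence")

    # Sentences for which the tool returned something at all.
    returned = sum(1 for a in actual if a.strip())
    # Returned sentences that match the expected output exactly.
    correct = sum(
        1 for a, e in zip(actual, expected)
        if a.strip() and a.strip() == e.strip()
    )

    precision = correct / returned if returned else 0.0
    recall = correct / len(expected) if expected else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy data: 4 sentences, 3 returned, 2 of them correct.
    actual = ["a", "b", "", "x"]
    expected = ["a", "b", "c", "d"]
    print(round(f_measure(actual, expected), 4))
```

With the toy data, precision is 2/3 and recall is 2/4, so the score is 4/7. The score reported at UNLWEB may differ if it uses a finer-grained notion of correctness than exact sentence match.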

Samples and Examples

See the samples and examples in UC-A1 and UC-A2

Software