UGO


Revision as of 20:17, 12 November 2013

The UGO (Universal Generation cOrpus) aims at defining the standards for fully-automatic sentence-driven NLization.

Goal

The project UGO has two main goals:

  1. To provide a translation memory from UNL to natural language, in order to be used for inducing UNL>NL grammars; and
  2. To provide the standards for fully-automatic sentence-driven NLization, to be used as the parameter for evaluating the precision of UNL>NL grammars.

Repository

UGO is a repository of UNL graphs depicting basic structures of UNL. It is divided into six subprojects:

  • UGO-A1 contains 250 simple NPs (noun phrases)
  • UGO-A2 contains 250 simple VPs (verb phrases)
  • UGO-B1 contains 250 complex NPs
  • UGO-B2 contains 250 complex VPs
  • UGO-C1 contains 250 full sentences
  • UGO-C2 contains 250 full sentences


Repository   # of entries[1]
UGO-A1       250
UGO-A2       250
UGO-B1       250
UGO-B2       250
UGO-C1       250
UGO-C2       250

Requisites

UGO is open to all languages[2] that comply with the following requisites:

  • UGO-A1 does not have any pre-requisite;
  • UGO-A1 and CORNELIA-A1 are requisites for UGO-A2;
  • UGO-A2 and CORNELIA-A2 are requisites for UGO-B1;
  • UGO-B1 and CORNELIA-B1 are requisites for UGO-B2;
  • UGO-B2 and CORNELIA-B2 are requisites for UGO-C1;
  • UGO-C1 and CORNELIA-C1 are requisites for UGO-C2.
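
The requisite chain above can be written down as a small lookup table. The following is only an illustrative sketch: the names `PREREQUISITES`, `can_start` and `available` are hypothetical and not part of any UNL tool, and the sketch assumes that "requisite" means the listed subprojects must be completed for the language before the next one is started.

```python
# Requisite chain copied from the list above.
# Each UGO subproject maps to the subprojects that must come first.
PREREQUISITES = {
    "UGO-A1": [],
    "UGO-A2": ["UGO-A1", "CORNELIA-A1"],
    "UGO-B1": ["UGO-A2", "CORNELIA-A2"],
    "UGO-B2": ["UGO-B1", "CORNELIA-B1"],
    "UGO-C1": ["UGO-B2", "CORNELIA-B2"],
    "UGO-C2": ["UGO-C1", "CORNELIA-C1"],
}


def can_start(subproject, completed):
    """Return True if every requisite of `subproject` is in `completed`."""
    return all(req in completed for req in PREREQUISITES[subproject])


def available(completed):
    """List the UGO subprojects that are not yet done but may be started."""
    return [
        name
        for name in PREREQUISITES
        if name not in completed and can_start(name, completed)
    ]


if __name__ == "__main__":
    # Hypothetical language team that has finished both A1 subprojects.
    print(available({"UGO-A1", "CORNELIA-A1"}))
```

For a language with no completed subprojects, only UGO-A1 is available, which matches the first item of the list.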


Instructions

In UGO, users are expected to map UNL graphs into natural language sentences. This process must take into consideration the following:

  • NLization is the generation, in the target language, of the information conveyed by the UNL graph. It defines the expected output of UNL in natural language and will be used to measure the precision of UNL>NL grammars. The NLization must comply with the principles below:
      • The NLization must convey all and only the information available in the UNL graph, i.e., it must neither add nor suppress any information;
      • The NLization must be a grammatical sentence of the target language, i.e., it should be syntactically and semantically well-formed;
      • The NLization must belong to the standard variety of the target language, i.e., it should not contain slang, jargon, archaisms, SMS language or other non-standard structures;
      • The NLization must contain punctuation marks only if absolutely necessary or explicitly stated in the UNL graph.
  • A single graph may lead to different NLizations, which must be provided on separate lines. These may differ in the order of constituents, where the target language allows it.
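
The page does not say how precision is computed from these NLizations. As one possible reading, the sketch below treats each line of an entry as an acceptable reference and counts a grammar's output as correct when it matches any of them exactly, after whitespace normalization. The exact-match criterion, the function names and the sample sentences are all assumptions made for illustration.

```python
def normalize(text):
    """Collapse runs of whitespace so spacing differences are ignored."""
    return " ".join(text.split())


def parse_references(block):
    """Read one acceptable NLization per non-empty line."""
    return [normalize(line) for line in block.splitlines() if line.strip()]


def matches_reference(generated, references):
    """True if the generated string equals any reference NLization."""
    return normalize(generated) in references


def precision(outputs, gold):
    """Share of generated outputs that match a reference.

    outputs: graph id -> string produced by a UNL>NL grammar
    gold:    graph id -> list of reference NLizations
    """
    if not outputs:
        return 0.0
    correct = sum(
        1
        for graph_id, text in outputs.items()
        if matches_reference(text, gold.get(graph_id, []))
    )
    return correct / len(outputs)


if __name__ == "__main__":
    # Hypothetical entry with two constituent orders, one per line.
    refs = parse_references("the book was read by the boy\nthe boy read the book\n")
    print(matches_reference("the boy read the book", refs))
    print(precision({"g1": "the boy read the book"}, {"g1": refs}))
```

Exact matching is deliberately strict: because an NLization must convey all and only the information in the graph, a partial match would not be a safe signal of correctness under this reading.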

Notes

  1. The numbers are approximate.
  2. Except English, which was the source for all data.