English grammar

From UNL Wiki

Revision as of 14:13, 27 July 2012

The English grammars presented here target the Corpus500 and are provided as a didactic sample that may help users build their own grammars. They are used to represent English sentences in UNL (UNLization) and to generate English sentences from UNL graphs (NLization). They follow the syntax defined in the UNL Grammar Specs and have been used with IAN and EUGENE.


Requisites

The grammars presented here depend heavily on the structure of the dictionary presented at English dictionary. To fully understand how the grammars deal with the dictionary entry structure, you need to be acquainted with the formalism described in the UNL Dictionary Specs and with the Tagset. You should also understand the tokenization process performed by the machine.

Features

The grammars operate on a set of features that come from three different sources:

  • Dictionary features are the features ascribed to entries in the dictionary. They appear as simple attributes (LEX,GEN,NUM), as simple values (N,MCL,SNG), or as attribute-value pairs (LEX=N,GEN=MCL,NUM=SNG).
  • System-defined features are assigned automatically by EUGENE and IAN during processing. They are the following:
    • SHEAD = beginning of the sentence
    • CHEAD = beginning of a scope
    • STAIL = end of the sentence
    • CTAIL = end of a scope
    • TEMP = temporary entry (assigned to strings that are not present in the dictionary)
  • Grammar features are features created inside the grammar in any of its intermediate states between the input and the output.

All of these features are described in the Tagset.
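
The three shapes a feature can take, and the role of the system-defined features, can be illustrated with a toy model. The Python sketch below is not how IAN or EUGENE are implemented; the function names `parse_features` and `annotate`, the sample dictionary, and the choice to model SHEAD and STAIL as empty boundary nodes are all assumptions made for illustration only.

```python
from typing import Dict, List, Optional, Tuple

FeatureSet = Dict[str, Optional[str]]


def parse_features(spec: str) -> FeatureSet:
    """Parse a comma-separated feature list such as 'LEX=N,GEN=MCL,SNG'.

    Attribute-value pairs map attribute -> value. Bare features (a simple
    attribute or a simple value) map to None.
    """
    features: FeatureSet = {}
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        name, sep, value = item.partition("=")
        features[name.strip()] = value.strip() if sep else None
    return features


def annotate(tokens: List[str], dictionary: Dict[str, str]) -> List[Tuple[str, FeatureSet]]:
    """Attach features to tokens (hypothetical helper, not the real engine).

    Known strings receive their dictionary features; unknown strings receive
    TEMP. SHEAD and STAIL are modelled here as empty boundary nodes, which is
    an assumption of this sketch.
    """
    nodes: List[Tuple[str, FeatureSet]] = [("", {"SHEAD": None})]
    for token in tokens:
        spec = dictionary.get(token)
        if spec is None:
            nodes.append((token, {"TEMP": None}))  # string not in the dictionary
        else:
            nodes.append((token, parse_features(spec)))
    nodes.append(("", {"STAIL": None}))
    return nodes


if __name__ == "__main__":
    sample_dictionary = {"book": "LEX=N,NUM=SNG"}  # made-up entry
    for string, feats in annotate(["book", "xyzzy"], sample_dictionary):
        print(repr(string), feats)
```

In this model a grammar rule would then test or add features on each node; features added at that stage correspond to the grammar features described above.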

EN-UNL (Analysis) Grammar

EN-UNL (Analysis) Transformation Grammar

EN-UNL (Analysis) Disambiguation Grammar

UNL-EN (Generation) Grammar

UNL-EN (Generation) Transformation Grammar

UNL-EN (Generation) Disambiguation Grammar

Software