Rule

From UNL Wiki
== Basic symbols ==

{| border="1" cellpadding="2" align=center
|+Basic symbols used in UNL grammar rules
!Symbol
!Definition
!Example
|-
|align=center|<nowiki>^</nowiki>
|not
|^a = not a
|-
|align=center|{ | }
|or
|<nowiki>{a|b}</nowiki> = a or b
|-
|align=center|%
|index for nodes, attributes and values
|%x (see [[#Indexes|below]])
|-
|align=center|#
|index for sub-NLWs
|#01 (see [[#Indexes|below]])
|-
|align=center|=
|attribute-value assignment
|POS=NOU
|-
|align=center|!
|rule trigger
|!PLR
|-
|align=center|&
|merge operator
|%x&%y
|-
|align=center|?
|dictionary lookup operator
|?[a]
|-
|align=center|" "
|string
|"went"
|-
|align=center|[ ]
|natural language entry (headword)
|[go]
|-
|align=center|[[ ]]
|UW
|[[to go(icl>to move)]]
|-
|align=center|( )
|node
|(a)
|-
|align=center|//
|regular expression
|/a{2,3}/ = aa,aaa
|}

;The differences between "", [] and [[]]
:Double quotes are always used to represent strings: "a" will match only the string "a"
:Simple square brackets are always used to represent natural language entries (headwords) in the dictionary: [a] will match the node associated to the entry [a] retrieved from the dictionary, no matter its current realization, which may be affected by other rules (the original [a] may have been replaced, for instance, by "b", but will still be indexed to the entry [a])
:Double square brackets are always used to represent UWs: <nowiki>[[a]]</nowiki> will match the node associated to the UW <nowiki>[[a]]</nowiki>

;Predefined values (assigned by default)
:SCOPE - Scope
:SHEAD - Sentence head (the beginning of a sentence)
:STAIL - Sentence tail (the end of a sentence)
:CHEAD - Scope head (the beginning of a scope)
:CTAIL - Scope tail (the end of a scope)
:TEMP - Temporary entry (entry not found in the dictionary)
:DIGIT - Any sequence of digits (i.e.: 0,1,2,3,4,5,6,7,8,9)

Latest revision as of 20:53, 16 December 2013

Grammars are sets of rules used to go from UNL into natural language, or from natural language into UNL. In the UNL framework, there can be two different types of rules:

  • T-rules, or transformation rules, are used to perform changes to nodes or relations
  • D-rules, or disambiguation rules, are used to control changes over nodes or relations

T-rules

main article: T-rule

T-rules perform actions and follow the general formalism:

α:=β;

where the left side α is a condition statement, and the right side β is an action to be performed over α.

There are several special types of T-rules:

  • A-rule is a specific type of T-rule used for affixation (prefixation, infixation, suffixation)
  • C-rule is a specific type of T-rule used for composition (word formation in case of compounds and multiword expressions)
  • L-rule is a specific type of T-rule used for handling word order
  • N-rule is a specific type of T-rule used for segmenting sentences and normalizing the input text
  • S-rule is a specific type of T-rule used for handling syntactic structures

Examples of T-rules

  • PLR:=0>"s"; (A-rule: add "s" in case of plural, as in book>books)
  • MTW:=+VA("into account",PP); (C-rule: add the prepositional phrase "into account" as an adjunct to the verbal phrase (VA) in order to form the multiword expression, as in take>take into account)
  • (ART,%x)(QUA,%y):=(%y)(%x); (L-rule: reverse the order ART+QUA to QUA+ART, as in the all>all the)
  • ("don't"):=("do not"); (N-rule: replace the contraction "don't" by "do not")
  • (V,%x)(N,%y):=VC(%x;%y); (S-rule: replace the linear relation between a verb and a noun by the syntactic relation VC between them)
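
The effect of three of the examples above (the A-rule, the N-rule and the L-rule) can be imitated in a few lines of Python. This is only a toy sketch: the function names and the representation of nodes as (tag, word) pairs are assumptions made for illustration, and the code does not reproduce how a UNL grammar engine actually evaluates rules.

```python
from __future__ import annotations

# Toy illustration only: function names and the (tag, word) node
# representation are assumptions, not part of the UNL specification.


def apply_a_rule(word: str, features: set[str]) -> str:
    """Imitates PLR:=0>"s";  (append "s" when the PLR feature is present)."""
    return word + "s" if "PLR" in features else word


def apply_n_rule(text: str) -> str:
    """Imitates ("don't"):=("do not");  (expand the contraction)."""
    return text.replace("don't", "do not")


def apply_l_rule(nodes: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Imitates (ART,%x)(QUA,%y):=(%y)(%x);  (swap an adjacent ART+QUA pair)."""
    result = list(nodes)
    i = 0
    while i < len(result) - 1:
        if result[i][0] == "ART" and result[i + 1][0] == "QUA":
            result[i], result[i + 1] = result[i + 1], result[i]
            i += 2  # move past the pair that was just swapped
        else:
            i += 1
    return result


print(apply_a_rule("book", {"PLR"}))   # books
print(apply_n_rule("I don't know"))    # I do not know
print(apply_l_rule([("ART", "the"), ("QUA", "all"), ("N", "books")]))
# [('QUA', 'all'), ('ART', 'the'), ('N', 'books')]
```

In each function the condition corresponds to the left side α of the rule and the returned value to the result of the action β.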

D-rules

main article: D-rule

D-rules control the action of T-rules. They control dictionary retrieval (in tokenization) and prevent or induce the application of rules during transformation.

D-rules follow the syntax:

α=P;

where the left side α is a statement and the right side P is an integer from 0 to 255 that indicates the probability of occurrence of α. A value of 0 means that α cannot occur.
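
Because T-rules use the separator := and D-rules use =, a rule string can be classified by its separator. The following Python sketch shows one way to do this; the names TRule, DRule and parse_rule are hypothetical and the parsing is deliberately naive (it does not interpret the contents of α or β).

```python
from __future__ import annotations

from typing import NamedTuple


class TRule(NamedTuple):
    condition: str  # left side α
    action: str     # right side β


class DRule(NamedTuple):
    statement: str    # left side α
    probability: int  # right side P, 0 to 255


def parse_rule(text: str) -> TRule | DRule:
    """Split a rule string into its two sides (hypothetical helper)."""
    body = text.strip()
    if not body.endswith(";"):
        raise ValueError("a rule must end with ';'")
    body = body[:-1]
    # Check ':=' first, because a T-rule may also contain a plain '='.
    if ":=" in body:
        condition, action = body.split(":=", 1)
        return TRule(condition.strip(), action.strip())
    # For a D-rule, P follows the last '=' in the string.
    statement, sep, value = body.rpartition("=")
    if not sep or not value.strip().isdigit():
        raise ValueError("expected 'α:=β;' or 'α=P;'")
    probability = int(value)
    if not 0 <= probability <= 255:
        raise ValueError("P must be an integer from 0 to 255")
    return DRule(statement.strip(), probability)


print(parse_rule('PLR:=0>"s";'))
print(parse_rule("(ART)(VER)=0;"))
```

Splitting a D-rule on the last = (rather than the first) keeps statements such as agt(^V,^J;) intact, since the closing ; of the rule is removed before the split.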

Examples of D-rules

  • (ART)(VER)=0; (there cannot be any article before a verb)
  • agt(^V,^J;)=0; (the source node of an agent relation must be either a verb or an adjective)
  • (D)(N)=1; (determiners may come before nouns)
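
As a rough illustration of how the linear D-rules above constrain a sequence of nodes, the Python sketch below rejects any sequence of tags containing an adjacent pair whose probability is 0. The dictionary encoding and the function names are assumptions made for this example, and only the value 0 is interpreted; how a real engine weighs non-zero values is not covered here.

```python
from __future__ import annotations

# Hypothetical encoding of two linear D-rules from the examples:
#   (ART)(VER)=0;   and   (D)(N)=1;
D_RULES: dict[tuple[str, str], int] = {
    ("ART", "VER"): 0,
    ("D", "N"): 1,
}


def blocked_pairs(
    tags: list[str], rules: dict[tuple[str, str], int]
) -> list[tuple[str, str]]:
    """Return the adjacent tag pairs that a D-rule assigns probability 0."""
    return [pair for pair in zip(tags, tags[1:]) if rules.get(pair) == 0]


def is_allowed(tags: list[str], rules: dict[tuple[str, str], int]) -> bool:
    """A sequence is allowed when none of its adjacent pairs is blocked."""
    return not blocked_pairs(tags, rules)


print(is_allowed(["D", "N", "VER"], D_RULES))   # True
print(is_allowed(["ART", "VER"], D_RULES))      # False
```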