Rule

From UNL Wiki

Revision as of 15:54, 16 August 2013

N-rules (normalization rules) and T-rules (transformation rules) follow a very general formalism:

α:=β;

where the left side α is a condition statement, and the right side β is an action to be performed over α.

D-rules (disambiguation rules) follow a slightly different formalism:

α=P;

where the left side α is a statement and the right side P is an integer from 0 to 255 that indicates the probability of occurrence of α.
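The two formalisms can be told apart by their operator. The following is a minimal Python sketch, not part of any UNL tool: the class names, the `parse_rule` helper and the sample rule strings are hypothetical, and the sketch assumes that `:=` never occurs inside α and that a D-rule ends in `=P;`.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class ActionRule:
    """An N-rule or T-rule of the form  α:=β;  (hypothetical class name)."""
    condition: str  # α, the condition statement
    action: str     # β, the action to be performed over α


@dataclass
class DisambiguationRule:
    """A D-rule of the form  α=P;  (hypothetical class name)."""
    statement: str    # α
    probability: int  # P, an integer from 0 to 255


def parse_rule(text: str) -> Union[ActionRule, DisambiguationRule]:
    """Split one rule on its operator. Assumes ':=' never appears inside α."""
    body = text.strip()
    if not body.endswith(";"):
        raise ValueError("a rule must end with ';'")
    body = body[:-1]
    if ":=" in body:
        condition, action = body.split(":=", 1)
        return ActionRule(condition.strip(), action.strip())
    # α itself may contain '=' (for instance POS=NOU), so split on the last '='.
    statement, sep, value = body.rpartition("=")
    if not sep:
        raise ValueError("expected 'α:=β;' or 'α=P;'")
    probability = int(value)  # raises ValueError if P is not an integer
    if not 0 <= probability <= 255:
        raise ValueError("P must be an integer from 0 to 255")
    return DisambiguationRule(statement.strip(), probability)


# Made-up rule strings, used only to exercise the parser.
print(parse_rule("(a):=(b);"))
print(parse_rule("(POS=NOU)=200;"))
```

The sketch only checks that P lies in the stated range; how an engine turns the 0 to 255 value into a ranking is not specified here and is left out on purpose.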


Basic symbols

Basic symbols used in UNL grammar rules

Symbol    Definition                                Example
^         not                                       ^a = not a
{ | }     or                                        {a|b} = a or b
%         index for nodes, attributes and values    %x (see below)
#         index for sub-NLWs                        #01 (see below)
=         attribute-value assignment                POS=NOU
!         rule trigger                              !PLR
&         merge operator                            %x&%y
?         dictionary lookup operator                ?[a]
" "       string                                    "went"
[ ]       natural language entry (headword)         [go]
[[ ]]     UW                                        [[to go(icl>to move)]]
( )       node                                      (a)
/ /       regular expression                        /a{2,3}/ = aa, aaa

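The regular expression example in the table, /a{2,3}/ = aa,aaa, can be checked with Python's `re` module as a stand-in. This is only a sketch: it assumes that the expression must match the whole string, which is consistent with the two results listed, and it does not claim that the UNL regular expression dialect is identical to Python's.

```python
import re

# The expression between the slashes in /a{2,3}/.
pattern = re.compile(r"a{2,3}")

# Candidate strings chosen for illustration.
candidates = ["a", "aa", "aaa", "aaaa"]

# fullmatch() accepts a candidate only if the whole string fits the pattern.
matches = [s for s in candidates if pattern.fullmatch(s)]
print(matches)  # ['aa', 'aaa']
```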
The differences between "", [] and [[]]
Double quotes always represent strings: "a" matches only the string "a".
Single square brackets always represent natural language entries (headwords) in the dictionary: [a] matches the node associated with the entry [a] retrieved from the dictionary, regardless of its current realization, which other rules may have changed (the original [a] may have been replaced by "b", for instance, but the node is still indexed to the entry [a]).
Double square brackets always represent UWs: [[a]] matches the node associated with the UW [[a]].
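The distinction can be illustrated with a small Python model. It is a sketch under stated assumptions: the `Node` class, its three fields and the `matches` helper are hypothetical, and grouping "went", [go] and [[to go(icl>to move)]] in a single node is done here only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical node model, not the data structure of any UNL engine."""
    entry: str    # headword retrieved from the dictionary
    uw: str       # UW associated with the node
    surface: str  # current realization; other rules may change it


def matches(pattern: str, node: Node) -> bool:
    """Compare a pattern with the field selected by its delimiters."""
    if pattern.startswith("[[") and pattern.endswith("]]"):
        return node.uw == pattern[2:-2]       # [[...]] targets the UW
    if pattern.startswith("[") and pattern.endswith("]"):
        return node.entry == pattern[1:-1]    # [...] targets the headword
    if len(pattern) >= 2 and pattern.startswith('"') and pattern.endswith('"'):
        return node.surface == pattern[1:-1]  # "..." targets the string
    raise ValueError("unsupported pattern: " + pattern)


node = Node(entry="go", uw="to go(icl>to move)", surface="went")
print(matches('"went"', node))                  # True
print(matches('"go"', node))                    # False
print(matches("[go]", node))                    # True
print(matches("[[to go(icl>to move)]]", node))  # True

# Changing the realization does not affect the headword match.
node.surface = "gone"
print(matches("[go]", node))                    # True
print(matches('"went"', node))                  # False
```

Checking the double brackets before the single ones matters, because every [[...]] pattern also begins and ends with a single bracket.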
Predefined values (assigned by default)
SCOPE - Scope
SHEAD - Sentence head (the beginning of a sentence)
STAIL - Sentence tail (the end of a sentence)
CHEAD - Scope head (the beginning of a scope)
CTAIL - Scope tail (the end of a scope)
TEMP - Temporary entry (entry not found in the dictionary)
DIGIT - Any sequence of digits (i.e., of the characters 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9)
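As a rough illustration of two of these values, the conditions for DIGIT and TEMP can be written as independent Python checks. The function names and the toy dictionary are hypothetical, and the sketch does not say how an engine combines or prioritizes the two values.

```python
import re


def is_digit(token: str) -> bool:
    """DIGIT: a non-empty sequence made only of the characters 0 to 9."""
    return re.fullmatch(r"[0-9]+", token) is not None


def is_temp(token: str, dictionary: set) -> bool:
    """TEMP: the token has no entry in the dictionary."""
    return token not in dictionary


# Toy dictionary, invented for this example.
dictionary = {"go", "book"}

print(is_digit("2013"))             # True
print(is_digit("20a3"))             # False
print(is_temp("xyzzy", dictionary)) # True
print(is_temp("go", dictionary))    # False
```

`str.isdigit()` is avoided here because it also accepts digit characters outside the ten listed above.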