Grammar Specs

From UNL Wiki

Revision as of 19:29, 16 August 2013


Basic symbols

Basic symbols used in UNL grammar rules

Symbol   Definition                               Example
^        not                                      ^a = not a
{ | }    or                                       {a|b} = a or b
%        index for nodes, attributes and values   %x (see below)
#        index for sub-NLWs                       #01 (see below)
=        attribute-value assignment               POS=NOU
!        rule trigger                             !PLR
&        merge operator                           %x&%y
?        dictionary lookup operator               ?[a]
" "      string                                   "went"
[ ]      natural language entry (headword)        [go]
[[ ]]    UW                                       [[to go(icl>to move)]]
( )      node                                     (a)
//       regular expression                       /a{2,3}/ = aa, aaa
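
The regular-expression row above (/a{2,3}/ = aa, aaa) can be checked with a conventional regex engine. The following is a minimal sketch using Python's re module. It assumes that UNL's // patterns follow the usual {m,n} quantifier semantics and that a pattern must match the whole string; the source does not state which regex dialect is used, and the helper name matches_whole is hypothetical.

```python
import re

# Pattern from the table, without the surrounding slashes.
# Assumption: UNL regexes use conventional {m,n} quantifier semantics.
PATTERN = re.compile(r"a{2,3}")


def matches_whole(candidate: str) -> bool:
    """Return True if the entire candidate string matches the pattern."""
    return PATTERN.fullmatch(candidate) is not None


candidates = ["a", "aa", "aaa", "aaaa"]
accepted = [c for c in candidates if matches_whole(c)]
print(accepted)  # ['aa', 'aaa']
```

Under these assumptions only "aa" and "aaa" are accepted, which agrees with the example given in the table.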

Basic Concepts

The differences between "", [] and [[]]
Double quotes always represent strings: "a" matches only the string "a".
Single square brackets always represent natural language entries (headwords) in the dictionary: [a] matches the node associated with the entry [a] retrieved from the dictionary, regardless of its current realization, which may be affected by other rules. For instance, the original [a] may have been replaced by "b", but the node is still indexed to the entry [a].
Double square brackets always represent UWs: [[a]] matches the node associated with the UW [[a]].
Predefined values (assigned by default)
SCOPE - Scope
SHEAD - Sentence head (the beginning of a sentence)
STAIL - Sentence tail (the end of a sentence)
CHEAD - Scope head (the beginning of a scope)
CTAIL - Scope tail (the end of a scope)
TEMP - Temporary entry (entry not found in the dictionary)
DIGIT - Any sequence of digits (i.e., 0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
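
The distinction between "", [] and [[]] described above can be illustrated with a small model. This is only a sketch in Python, not how a UNL engine is implemented: the Node class, its field names and the matches function are all hypothetical, chosen to show that a string pattern looks at the current realization, a headword pattern looks at the dictionary entry the node is indexed to, and a UW pattern looks at the linked UW.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical node model; the field names are illustrative only."""
    string: str    # current surface realization (may be changed by rules)
    headword: str  # dictionary entry the node stays indexed to
    uw: str        # UW associated with the node


def matches(node: Node, pattern: str) -> bool:
    """Dispatch on the delimiters: "" string, [] headword, [[]] UW."""
    # Check double brackets first, since "[[a]]" also starts with "[".
    if pattern.startswith("[[") and pattern.endswith("]]"):
        return node.uw == pattern[2:-2]
    if pattern.startswith("[") and pattern.endswith("]"):
        return node.headword == pattern[1:-1]
    if pattern.startswith('"') and pattern.endswith('"'):
        return node.string == pattern[1:-1]
    raise ValueError(f"unsupported pattern: {pattern}")


# The entry [go] has been realized as "went" by some earlier rule.
node = Node(string="went", headword="go", uw="to go(icl>to move)")

print(matches(node, '"went"'))                  # True
print(matches(node, '"go"'))                    # False
print(matches(node, "[go]"))                    # True
print(matches(node, "[[to go(icl>to move)]]"))  # True
```

The node still matches [go] after its realization has changed to "went", while the string pattern "go" no longer matches.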