S-rule

Latest revision as of 19:33, 24 June 2014

S-rule (syntactic/semantic rule) is a specific type of transformation rule used for dealing with syntactic relations and semantic relations in the UNL framework.

When to use S-rules

S-rules are used for altering, replacing, creating and deleting non-linear relations.

When not to use S-rules

S-rules are not used for linear relations such as affixation, string manipulation and list manipulation, which must be addressed by A-rules, N-rules and L-rules, respectively.

Types of S-rules

Relations are altered, replaced, created and deleted by S-rules:

Altering nodes in a relation

Elements of nodes in relations are altered through the operators + (add) and - (delete). The operator + may be omitted.

  • rel(%x,A;%y,B):=rel(%x,+C;%y,+D); (add the feature C to %x and D to %y)
  • rel(%x,A;%y,B):=rel(%x,C;%y,D); (the same as above)
  • rel(%x,A;%y,B):=rel(%x,-A;%y); (delete the feature A from %x)

"strings", [headwords] and [[UWs]] are considered to be features (but a single node may have only one of each)

  • rel(%x;%y):=rel(%x,"a";%y); (replace the existing string in %x, if any, by "a")
  • rel(%x;%y):=rel(%x,[A];%y); (replace the existing headword in %x, if any, by [A])
  • rel(%x;%y):=rel(%x,[[A]];%y); (replace the existing UW in %x, if any, by [[A]])
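
The effect of + and - on a node can be pictured as operations on a set of features. The following Python sketch is only an illustration, not the UNL engine's implementation: the names `kind` and `apply_ops`, and the choice to model a node as a set of feature strings, are assumptions made here.

```python
def kind(feature):
    """Classify a feature as a string, a UW, a headword or a plain tag."""
    if feature.startswith('"') and feature.endswith('"'):
        return "string"
    if feature.startswith("[[") and feature.endswith("]]"):
        return "uw"
    if feature.startswith("[") and feature.endswith("]"):
        return "headword"
    return "tag"


def apply_ops(features, ops):
    """Return a new feature set after applying ops such as "+C", "C" or "-A"."""
    result = set(features)
    for op in ops:
        if op.startswith("-"):
            result.discard(op[1:])  # "-A" deletes the feature A
            continue
        new = op[1:] if op.startswith("+") else op  # "+" may be omitted
        if kind(new) != "tag":
            # A node holds at most one string, one headword and one UW,
            # so the existing one of the same kind is replaced.
            result = {f for f in result if kind(f) != kind(new)}
        result.add(new)
    return result


added = apply_ops({"A"}, ["+C"])
implicit = apply_ops({"A"}, ["C"])
removed = apply_ops({"A"}, ["-A"])
replaced = apply_ops({'"b"', "A"}, ['"a"'])
```

Here `added` and `implicit` are the same set, mirroring the statement that the operator + may be omitted.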

Creating nodes in a relation

Nodes are created when they are not co-indexed to any node in the left side (see Indexation):

  • rel(%x,A;%y,B):=rel(%x;%y;%z,+A); (the node %z, with the feature A, is created as a new argument of the relation rel)

Deleting nodes in a relation

Nodes are deleted when they are not co-indexed to any node in the right side (see Indexation):

  • rel(%x,A;%y,B;%z,C):=rel(%x;%y); (the node %z is deleted as an argument of the relation rel)

Nodes are completely deleted if, and only if, they are not part of any other relation.
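
The difference between removing a node from one relation and deleting it altogether can be sketched in Python. This is a minimal illustration under assumptions: relations are modelled as `(name, arguments)` tuples, and `drop_argument` and `live_nodes` are hypothetical helper names, not part of any UNL tool.

```python
def drop_argument(relations, index, node):
    """Remove `node` from the arguments of relations[index]; return a new list."""
    name, args = relations[index]
    updated = list(relations)
    updated[index] = (name, tuple(a for a in args if a != node))
    return updated


def live_nodes(relations):
    """Nodes that still take part in at least one relation."""
    return {arg for _, args in relations for arg in args}


# %z also belongs to a second relation, so it survives the deletion.
shared = [("rel", ("x", "y", "z")), ("other", ("z", "w"))]
shared_after = drop_argument(shared, 0, "z")

# %z belongs to no other relation, so it disappears completely.
alone = [("rel", ("x", "y", "z"))]
alone_after = drop_argument(alone, 0, "z")
```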

Creating relations

Relations are created by the operator + (add) before the relation to be created. This operator may not be omitted.

  • rel(%x;%y):=+rel2(%x;%z); (a new relation rel2 is created between the nodes %x and %z; the original relation is not altered)

Creating relations is a possible source of infinite loops. To prevent the rule from applying indefinitely, the condition must exclude the case in which the relation to be created already exists:

  • rel(%x;%y)^rel2(%x;%z):=+rel2(%x;%z);
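
Why the negated condition matters can be seen in a small Python simulation. It is a sketch under assumptions, not the actual rule engine: relations are `(name, first, second)` tuples, the rule is applied in repeated passes until nothing changes, and the names `step`, `run` and the generated node names `z1`, `z2`, etc. are hypothetical.

```python
def step(relations, guarded):
    """One pass of rel(%x;%y)[^rel2(%x;%z)]:=+rel2(%x;%z); over a set of relations."""
    out = set(relations)
    for name, x, _y in relations:
        if name != "rel":
            continue
        has_rel2 = any(n == "rel2" and a == x for n, a, _ in out)
        if guarded and has_rel2:
            continue  # the negated condition ^rel2(%x;%z) blocks the rule
        out.add(("rel2", x, f"z{len(out)}"))  # create rel2 with a fresh node
    return out


def run(relations, guarded, max_steps=5):
    """Count the passes that change the graph, giving up after max_steps."""
    current = set(relations)
    for steps in range(max_steps):
        nxt = step(current, guarded)
        if nxt == current:
            return steps
        current = nxt
    return max_steps


start = {("rel", "x", "y")}
guarded_steps = run(start, guarded=True)     # stabilises after one change
unguarded_steps = run(start, guarded=False)  # still changing at the cut-off
```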

Deleting relations

Relations are deleted when they are not repeated in the right side, unless the right side uses the operator +:

  • rel(%x;%y):=; (the relation rel between the nodes %x and %y is deleted)
  • rel(%x;%y):=rel2(%x;%y); (the relation rel between %x and %y is deleted and a new relation rel2 is created in its place) (replacement)
  • rel(%x;%y):=+rel2(%x;%y); (the relation rel is preserved and a new relation rel2 is created) (creation)

Replacing relations

Relations in the left side are replaced by relations in the right side, except in case of +:

  • rel(%x;%y):=rel2(%x;%y); (the relation rel between %x and %y is deleted and a new relation rel2 is created in its place)
  • rel1(%x;%y)rel2(%y;%z):=rel3(%x;%z); (the relations rel1 and rel2 are deleted and a new relation rel3 is created in their place) (merge)
  • rel(%x;%y):=rel1(%x;%y)rel2(%y;%z); (the relation rel is deleted and two new relations rel1 and rel2 are created in its place) (divide)
  • (%x)(%y):=rel(%x;%y); (the linear relation between the nodes %x and %y is replaced by the non-linear relation rel between the same nodes)
  • L(%x;%y):=rel(%x;%y); (the same as above)
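
The merge case can be sketched in Python as follows. This is an illustrative reading of the rule rel1(%x;%y)rel2(%y;%z):=rel3(%x;%z); under the assumption that relations are `(name, first, second)` tuples; the function name `merge` is hypothetical.

```python
def merge(relations):
    """Apply rel1(%x;%y)rel2(%y;%z):=rel3(%x;%z); to every matching pair."""
    result = set(relations)
    for n1, x, y in relations:
        for n2, y2, z in relations:
            # %y is co-indexed: it must be the same node in both relations.
            if n1 == "rel1" and n2 == "rel2" and y == y2:
                result -= {(n1, x, y), (n2, y2, z)}  # left side is deleted
                result.add(("rel3", x, z))           # right side is created
    return result


merged = merge({("rel1", "a", "b"), ("rel2", "b", "c"), ("rel9", "c", "d")})
untouched = merge({("rel1", "a", "b"), ("rel2", "x", "c")})
```

Relations that are not mentioned in the left side, such as `rel9` above, are left as they are.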

Properties

  1. S-rules always end in ";"
    • rel("a");
    • rel("a") (incorrect: the final ";" is missing)
  2. Relations are n-ary, i.e., they may have as many arguments as necessary, isolated by semicolon (";")
    • rel("a"); (relation with one argument)
    • rel("a";"b"); (relation with two arguments)
    • rel("a";"b";"c"); (relation with three arguments)
    • etc.
    Syntactic relations are not predefined, although we have been using a set of binary relations based on the X-bar theory.
    Semantic relations constitute a predefined and closed set that can be found in the Universal Relations.
  3. Inside each relation, nodes may be referenced by any of its elements, isolated by comma (,):
    VC(%a;%b) - syntactic relation between a node where index = %a and another node where index = %b
    agt("a",[a],[[a]],A;"b",[b],[[b]],B) - semantic relation between a node having the feature A where string = "a" AND headword = [a] AND UW = [[a]] AND another node having the feature B where string = "b" AND headword = [b] AND UW = [[b]]
  4. The arguments of a relation may be empty in case they are not affected by S-rules.
    rel(;):=rel2(;); (replace all relations rel by rel2, regardless of their arguments)
  5. Relations may be conjoined through juxtaposition:
    agt(%x;%y)obj(%x;%z) - two semantic relations: one between (%x) and (%y) AND other between (%x) and (%z)
    VC([a];[b]),VC([a];[c]) - incorrect: conjoined relations must not be isolated by comma
  6. Relations may be disjoined through {braces}
    {("a")|("b")}("c") - either ("a")("c") or ("b")("c")
    {agt(%x;%y)|exp(%x;%y)}obj(%x;%z) - either agt(%x;%y)obj(%x;%z) or exp(%x;%y)obj(%x;%z)
  7. Order is not important between relations, but essential between arguments of the same relation
    rel1("b")rel2("c")rel3("d") = rel2("c")rel3("d")rel1("b") = rel3("d")rel2("c")rel1("b")
    rel1("a";"b"); ≠ rel1("b";"a");
  8. Relations may be replaced by regular expressions
    /.{2,3}/(%x;%y) - any relation made of two or three characters between %x and %y
  9. Arguments of relations may be expressed by A-rules, but only in the right side of rules
    rel("a"):=rel("an"); or rel("a"):=rel(0>"n");
  10. S-rules do not affect nodes unless explicitly informed
    rel("a",[a],[[a]],A,%x;"b",[b],[[b]],B,%y):=rel2(%x;%y); (the nodes %x and %y do not undergo any change)
    rel("a",[a],[[a]],A,%x;"b",[b],[[b]],B,%y):=rel2(%x); (the node %x does not undergo any change; the node %y is deleted)
    rel("a",[a],[[a]],A,%x;"b",[b],[[b]],B,%y):=rel2(%x,-A;%y); (the feature A is removed from the node %x; all the rest, including the node %y, does not undergo any change)
  11. "^" is used for negation
    rel1(%x;%y)^rel2(%y;%z):=+rel2(%y;%z); (if there is a rel1 between the nodes %x and %y and there is no relation rel2 between the nodes %y and %z, create a new relation rel2 between the nodes %y and %z)
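
Properties 6 and 8 can be illustrated with a short Python sketch. It is not the engine's matcher: the names `expand_disjunction` and `name_matches` are hypothetical, only one non-nested {…} group is handled, and treating the regular expression as a match over the whole relation name is an assumption based on the wording "any relation made of two or three characters".

```python
import re


def expand_disjunction(pattern):
    """Expand the first {a|b|...} group into the alternatives it stands for."""
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[:match.start()], pattern[match.end():]
    return [head + option + tail for option in match.group(1).split("|")]


def name_matches(regex, relation_name):
    """True if the whole relation name matches, as in /.{2,3}/(%x;%y)."""
    return re.fullmatch(regex, relation_name) is not None


alternatives = expand_disjunction("{agt(%x;%y)|exp(%x;%y)}obj(%x;%z)")
three_chars = name_matches(".{2,3}", "agt")
five_chars = name_matches(".{2,3}", "agent")
```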

Indexes

See Indexation

Examples

Examples of S-rules:

  • composition
    • VA("into account",PP); (add the PP "into account" as the adjunct of the verb)
  • subcategorization
    • VC(PH([in])); (the complement of the verb is a prepositional phrase headed by the preposition "in")
  • agreement
    • VS(ANUM,APER); (the specifier of the verb assigns number (ANUM) and person (APER) to its head)
  • case marking
    • VS(NOM); (the specifier of the verb receives the nominative case (NOM))
  • distribution
    • VA(>>); (the adjunct of the verb comes at the right side of the verb after a blank space)
  • adjacency
    • VA(AJ2); (the adjunct of the verb integrates the second projection of the head)
  • periphrasis
    • VH(%vh,FUT):=+IC([will];%vh,+INF);
  • projection
    • VS(%head;%spec)VB(%head;%comp):=VP(VB(%head;%comp);%spec); (integrate the two relations on the left side into a single relation)
  • mapping
    • agt(%source;%target):=VS(%source;%target); (the agent relation is mapped into a VS relation)

Formal Syntax

S-rules comply with the following formal syntax:

<S-RULE>                ::= <CONDITION> ":=" (<RELATION>)+";"
<CONDITION>             ::= <TAG>(","<TAG>)* | (<RELATION>)*
<RELATION>              ::= <SYNTACTIC RELATION> | <SEMANTIC RELATION>
<SEMANTIC RELATION>     ::= <UNL RELATION> "(" <NODE> ";" <NODE> ")"
<SYNTACTIC RELATION>    ::= <NL RELATION> "(" (<NODE>";")? <NODE> ")"
<UNL RELATION>          ::= {one of the head-driven semantic relations defined in the UNL Specs}  
<NL RELATION>           ::= {one of the head-driven syntactic relations defined in the UNDLF Tagset} 
<NODE>                  ::= <FEATURE>(","<FEATURE>)* 
<FEATURE>               ::= <ID>|<TAG>|"""<STRING>"""|"["<STRING>"]"|<DIRECTION>|<SYNTACTIC RELATION>|<ACTION>
<ID>                    ::= "%"[a-zA-Z_0-9]+
<TAG>                   ::= {one of the tags defined in the UNDLF Tagset}
<STRING>                ::= [a..Z]+
<DIRECTION>             ::= ">"|">>"|"<"|"<<"
<ACTION>                ::= <PREFIXATION> | <SUFFIXATION> | <INFIXATION> | <REPLACEMENT> (cf. A-rule)

where
<a> = a is a non-terminal symbol
"a" = a is a constant
a | b = a or b
(a)? = a can occur 0 or 1 time
(a)* = a can be repeated 0 or more times
(a)+ = a can be repeated 1 or more times
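
As a rough check against this grammar, a rule can be split into its two sides and its indexes extracted with a few lines of Python. This is a sketch, not a full parser: `split_rule` and `ids` are hypothetical names, only the final ";", the ":=" separator and the <ID> pattern are checked, and accepting a rule without ":=" as having an empty condition is an assumption made to accommodate examples such as VA(>>);.

```python
import re

# <ID> ::= "%"[a-zA-Z_0-9]+
ID = re.compile(r"%[a-zA-Z_0-9]+")


def split_rule(rule):
    """Split 'CONDITION:=ACTION;' into (condition, action); raise if no final ';'."""
    rule = rule.strip()
    if not rule.endswith(";"):
        raise ValueError("an S-rule must end in ';'")
    body = rule[:-1]
    if ":=" in body:
        condition, action = body.split(":=", 1)
    else:
        condition, action = "", body  # assumption: condition omitted
    return condition, action


def ids(text):
    """Return the indexes (%name) found in a condition or action."""
    return ID.findall(text)


condition, action = split_rule("agt(%source;%target):=VS(%source;%target);")
indexes = ids(condition)
```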
