S-rule

From UNL Wiki

Revision as of 18:52, 23 March 2010

S-rule (syntactic rule) is the formalism used for describing syntactic structures and syntactic operations in the UNLarium framework.

When to use S-rules

S-rules are used for:

  • composition, i.e., creating compounds out of the base forms (such as "take">"take into account");
  • periphrasis, i.e., generating analytic grammatical structures (such as "love">"will love");
  • subcategorization, i.e., defining the number and the type of arguments of a given base form;
  • case marking, i.e., defining the grammatical cases of the arguments of a given base form;
  • agreement, i.e., concord between different parts of a phrase;
  • distribution, i.e., defining the order and adjacency of word forms; and
  • projection, i.e., projecting syntactic structures out of the constituents.

When not to use S-rules

S-rules are not used for affixation (prefixation, infixation, suffixation) or spelling changes, which must be addressed by A-rules and Ph-rules, respectively.

Types of S-rules

There are four types of S-rules:

  • Head extension extends a given head;
  • Specification creates a specifier (determiner) to the head;
  • Complementation creates a complement (object) to the head; and
  • Adjunction creates an adjunct (modifier) to the head.

For further information on the constituents "head", "specifier", "complement" and "adjunct", see Syntax.

Syntax

S-rules comply with the following syntax:

CONDITION := RELATION(HEAD; ARGUMENT);

Where

  • CONDITION (optional) is a tag or list of tags, extracted from the UNDLF Tagset, that indicates when the rule should be applied. It may also be a relation or a list of relations in case of projection rules. The condition must be omitted in case of general rules (i.e., when the rule is always applied).
  • RELATION is the syntactic relation, extracted from the syntactic roles, between the head and its argument. An S-rule may comprise several different relations.
  • HEAD (optional) is the head of the syntactic structure, which is to be omitted when it does not undergo any change;
  • ARGUMENT (optional in case of head-only relations) is the argument (the specifier, the complement or the adjunct) of the head.
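As a rough illustration of how the four parts fit together, the following Python sketch splits a flat S-rule into its optional condition and its relations. The function name split_rule and the regular expression are assumptions made for this example and are not part of any UNL tool; nested relations (as in projection rules) and strings containing ";" or parentheses are deliberately out of scope.

```python
import re

# Hypothetical helper for illustration only. It handles flat rules such as
# FUT:=IH([will]); and does not attempt nested relations.
RELATION = re.compile(r'([A-Z]+)\(([^()]*)\)')

def split_rule(rule):
    """Return (condition, [(relation, head, argument), ...]) for a flat S-rule."""
    body = rule.strip().rstrip(';')
    condition, sep, right = body.partition(':=')
    if not sep:
        # No ":=" means a general rule: the condition is omitted.
        condition, right = '', body
    relations = []
    for name, inside in RELATION.findall(right):
        parts = [p.strip() for p in inside.split(';')]
        # Assumes at most two items; the head is omitted when it is unchanged.
        head = parts[0] if len(parts) > 1 else None
        relations.append((name, head, parts[-1]))
    return condition.strip() or None, relations

print(split_rule('FUT:=IH([will]);'))
print(split_rule('VA("a";"b");'))
```

Under these assumptions, the first call yields the condition FUT with a head-less IH relation, and the second yields no condition with "a" as head and "b" as argument.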

The HEAD and the ARGUMENT may be expressed as:

  • a "string" (strings must come between quotation marks);
  • a [lemma] (lemmas must come between square brackets);
  • a feature or a set of features, separated by commas, and extracted from the UNDLF Tagset;
  • a direction (">",">>","<","<<");
  • an index between the left and the right side of the rule (to be specified by the syntax %name);
  • an action, to be performed through an A-rule; and
  • a syntactic relation itself.
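The kinds of items listed above can be told apart by their surface form. The sketch below is one possible classifier; the function name classify and the returned labels are hypothetical, the index pattern follows the %name convention, and the test for actions (any remaining item containing ">" or "<") is only a heuristic, since the A-rule syntax is defined elsewhere.

```python
import re

def classify(item):
    """Return a coarse label for one HEAD/ARGUMENT item (illustrative only)."""
    f = item.strip()
    if len(f) >= 2 and f[0] == '"' and f[-1] == '"':
        return 'string'            # e.g. "into account"
    if len(f) >= 2 and f[0] == '[' and f[-1] == ']':
        return 'lemma'             # e.g. [will]
    if f in ('>', '>>', '<', '<<'):
        return 'direction'
    if re.fullmatch(r'%[a-zA-Z_0-9]+', f):
        return 'index'             # e.g. %head
    if re.fullmatch(r'[A-Z]+\(.*\)', f):
        return 'relation'          # e.g. PP([in])
    if '>' in f or '<' in f:
        return 'action'            # heuristic: looks like an A-rule, e.g. 0>"a"
    return 'feature'               # assumed to be a tag such as NOM

for item in ['"into account"', '[will]', '>>', '%head', 'PP([in])', '0>"a"', 'NOM']:
    print(item, '->', classify(item))
```

The checks run from the most specific form to the least specific one, so a quoted string that happens to contain ">" is still reported as a string.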

Examples

Examples of S-rules:

  • composition
    • VA("into account"); (add the string "into account" as the adjunct of the verb)
  • periphrasis
    • FUT:=IH([will]); (add the lemma "will" as the head of the inflectional phrase in case of future)
  • subcategorization
    • VC(PP([in])); (the complement of the verb is a prepositional phrase headed by the preposition "in")
  • agreement
    • VS(ANUM,APER); (the specifier of the verb assigns number (ANUM) and person (APER) to its head)
  • case marking
    • VS(NOM); (the specifier of the verb receives the nominative case (NOM))
  • distribution
    • VA(>>); (the adjunct of the verb comes at the right side of the verb after a blank space)
  • projection
    • VS(%head;%spec)VB(%head;%comp):=VP(VB(%head;%comp);%spec); (integrate the two relations on the left side into a single relation)

X-bar Structure

According to the S-rule syntax, the basic x-bar structure can be represented as follows:

XP(XB(XB(head;complement);adjunct);spec)

For simplicity, the same structure may be represented by five head-driven relations, as follows:

XS(head;specifier), which describes the relation between the head of the structure and its specifier
XA(head;adjunct), which describes the relation between the head of the structure and its adjuncts
XC(head;complement), which describes the relation between the head of the structure and its complements
XH(head), which describes the head of the structure
XP(head), which describes the head of the structure without any reference to its internal structure

This is to say that:

XP(XB(XB(head;complement);adjunct);spec) := XS(head;specifier)XA(head;adjunct)XC(head;complement)
XS(head;specifier)XA(head;adjunct)XC(head;complement) := XP(XB(XB(head;complement);adjunct);spec) 

Where X must be replaced by one of the eight possible heads (N, P, V, A, J, C, D, I).
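The correspondence between the nested and the flat notation can be sketched in Python as follows. The function name xbar_forms and the sample constituents are assumptions made for this illustration; the code only builds the two strings and does not model how a UNL engine applies them.

```python
# The eight possible heads listed above.
HEADS = ('N', 'P', 'V', 'A', 'J', 'C', 'D', 'I')

def xbar_forms(x, head, complement, adjunct, spec):
    """Return (nested, flat) notations for category x; purely illustrative."""
    if x not in HEADS:
        raise ValueError(f'unknown head category: {x}')
    # XP(XB(XB(head;complement);adjunct);spec)
    nested = f'{x}P({x}B({x}B({head};{complement});{adjunct});{spec})'
    # XS(head;specifier)XA(head;adjunct)XC(head;complement)
    flat = f'{x}S({head};{spec}){x}A({head};{adjunct}){x}C({head};{complement})'
    return nested, flat

nested, flat = xbar_forms('V', '%head', '%comp', '%adjt', '%spec')
print(nested)  # VP(VB(VB(%head;%comp);%adjt);%spec)
print(flat)    # VS(%head;%spec)VA(%head;%adjt)VC(%head;%comp)
```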

Observations

Relations must not be separated by "," in complex S-rules
VS("b")VC("c")VA("d"); (valid)
VS("b"),VC("c"),VA("d"); (invalid)
Order is not important between relations in complex S-rules
VS("b")VC("c")VA("d") is the same as VC("c")VA("d")VS("b") or VA("d")VC("c")VS("b")
Order is essential between arguments of the same relation
VA("a";"b"); is not the same as VA("b";"a");
Relations are always binary (but the head may be omitted if it does not undergo any change)
VA("a";"b"); (valid)
VA("a"); (valid, head omitted)
VA("a";"b";"c"); (invalid)
Arguments of relations may be expressed by the right side of A-rules (i.e., by prefixation, infixation or suffixation).
VA(0>"a"); (the verbal adjuncts, if any, receive an "a" as suffix)
Strings are used to create new nodes, in case of no indexation, or to replace the existing ones, in case of indexation between the left and the right side of the rule
VA("a"); (creates the node "a" as the adjunct of a verb)
VA("c",%anylabel):=VA("a",%anylabel); (the node "c" is replaced by the node "a")
Relations may be deleted through "-" or by replacement by nothing
VC("a";"b"):=-VC("a";"b"); (deletes the VC relation between "a" and "b"; the nodes "a" and "b" are preserved, if part of any other relation, or deleted, otherwise)
VC("a";"b"):=; (the same)
Arguments may be deleted through replacement by head-only relations
VC("a";"b"):=VH("a"); (the node "b" is deleted, if not part of any other relation)
Heads may be deleted through "-" or by replacement by nothing
VH("a"):=-VH("a");
VH("a"):=;
Strings are represented between quotes whereas lemmas are represented between brackets
VA("into account"); (add the string "into account" as a verbal adjunct, take > take into account)
VC([love]); (add the lemma "love" as a verbal complement, such as in make > make love)

The difference between strings and lemmas has to do with their dictionary status. Lemmas, but not strings, are expected to be defined as dictionary entries. In the examples above, "into account" is unlikely to exist as a single entry, whereas "love" is probably already in the dictionary.
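The observations on ordering and arity can be made concrete with a small Python sketch. The names normalize and arity_ok are hypothetical; the idea is simply to store the relations of a flat rule in an unordered set while keeping the arguments of each relation in an ordered tuple. One side effect of using a set, which is an assumption of this sketch rather than a statement about the formalism, is that repeated identical relations collapse into one.

```python
import re

# Matches flat relations only, e.g. VA("a";"b"); nested relations are out of scope.
RELATION = re.compile(r'([A-Z]+)\(([^()]*)\)')

def normalize(rule):
    """Relations as a frozenset (order ignored); arguments as tuples (order kept)."""
    body = rule.strip().rstrip(';')
    return frozenset(
        (name, tuple(p.strip() for p in inside.split(';')))
        for name, inside in RELATION.findall(body)
    )

def arity_ok(rule):
    """True when every relation has at most two arguments (head and argument)."""
    return all(len(args) <= 2 for _, args in normalize(rule))

# Order between relations does not matter.
print(normalize('VS("b")VC("c")VA("d");') == normalize('VA("d")VC("c")VS("b");'))  # True
# Order between arguments of the same relation does matter.
print(normalize('VA("a";"b");') == normalize('VA("b";"a");'))  # False
# Relations are at most binary.
print(arity_ok('VA("a";"b";"c");'))  # False
```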

Formal Syntax

S-rules comply with the following formal syntax:

<S-RULE>                ::= (<CONDITION> ":=")? (<SYNTACTIC RELATION>)+ ";"
<CONDITION>             ::= <TAG>(","<TAG>)* | (<SYNTACTIC RELATION>)*
<SYNTACTIC RELATION>    ::= <HEAD-DRIVEN RELATION> "(" (<NODE>";")? <NODE> ")"
<HEAD-DRIVEN RELATION>  ::= {one of the head-driven syntactic relations defined in the UNDLF Tagset} 
<NODE>                  ::= <FEATURE>(","<FEATURE>)* 
<FEATURE>               ::= <ID>|<TAG>|"""<STRING>"""|"["<STRING>"]"|<DIRECTION>|<SYNTACTIC RELATION>|<ACTION>
<ID>                    ::= "%"[a-zA-Z_0-9]+
<TAG>                   ::= {one of the tags defined in the UNDLF Tagset}
<STRING>                ::= [a..Z]+
<DIRECTION>             ::= ">"|">>"|"<"|"<<"
<ACTION>                ::= <PREFIXATION> | <SUFFIXATION> | <INFIXATION> | <REPLACEMENT> (cf. A-rule)

where
<a> = a is a non-terminal symbol
"a" = a is a constant
a | b = a or b
(a)? = a can occur 0 or 1 times
(a)* = a can be repeated 0 or more times
(a)+ = a can be repeated 1 or more times
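To show how the grammar above could be turned into a checker, here is a minimal recursive-descent sketch in Python. It covers only the relation list of a rule, i.e. one or more syntactic relations followed by ";", with nodes made of strings, lemmas, indexes, directions, tags and nested relations. Conditions, deletions with "-" and A-rule actions are not handled. All function names are assumptions made for this example, and tags are accepted as any alphanumeric token rather than being checked against the UNDLF Tagset.

```python
import re

TOKEN = re.compile(r'\s*("[^"]*"|\[[^\]]*\]|%[A-Za-z_0-9]+|>>|<<|>|<|[A-Za-z0-9]+|[();,])')

def tokenize(text):
    """Split a rule into strings, lemmas, indexes, directions, names and punctuation."""
    text, tokens, pos = text.strip(), [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise ValueError(f'unexpected character at {pos}: {text[pos]!r}')
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def peek(tokens, i):
    return tokens[i] if i < len(tokens) else None

def parse_relation(tokens, i):
    """NAME "(" (NODE ";")? NODE ")" -> ((name, [node, ...]), next_index)"""
    name = peek(tokens, i)
    if name is None or not name.isalnum() or peek(tokens, i + 1) != '(':
        raise ValueError(f'expected a relation at token {i}')
    first, i = parse_node(tokens, i + 2)
    nodes = [first]
    if peek(tokens, i) == ';':
        second, i = parse_node(tokens, i + 1)
        nodes.append(second)
    if peek(tokens, i) != ')':
        raise ValueError(f'expected ")" at token {i}')
    return (name, nodes), i + 1

def parse_node(tokens, i):
    """FEATURE ("," FEATURE)* -> ([feature, ...], next_index)"""
    features = []
    while True:
        tok = peek(tokens, i)
        if tok is None or tok in ('(', ')', ';', ','):
            raise ValueError(f'expected a feature at token {i}')
        if peek(tokens, i + 1) == '(':
            feature, i = parse_relation(tokens, i)   # a nested relation
        else:
            feature, i = tok, i + 1
        features.append(feature)
        if peek(tokens, i) != ',':
            return features, i
        i += 1

def parse_rule(rule):
    """Parse (RELATION)+ ";" and return the list of relations."""
    tokens, relations, i = tokenize(rule), [], 0
    while peek(tokens, i) not in (';', None):
        relation, i = parse_relation(tokens, i)
        relations.append(relation)
    if not relations or peek(tokens, i) != ';' or i != len(tokens) - 1:
        raise ValueError('expected one or more relations followed by ";"')
    return relations

print(parse_rule('VC(PP([in]));'))
print(parse_rule('VP(VB(%head;%comp);%spec);'))
```

Because a relation accepts at most two nodes and nothing is allowed between relations, this sketch rejects both VA("a";"b";"c"); and VS("b"),VC("c"); with a ValueError.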

Software