Distribution

From UNL Wiki
(Difference between revisions)
(Natural Language)
 
(27 intermediate revisions by one user not shown)
Line 1: Line 1:
'''Distribution''' (or word order) refers to the study of the order of the syntactic constituents of a language.
+
'''Distribution''' or '''precedence''' refers to the study of the order of the syntactic constituents of a language. In the UNL<sup>arium</sup> framework, the distribution is informed in the grammar, if general, or in the dictionary, in case of exceptions or categories that do not follow a regular distributional pattern (such as English adverbs). Distribution is not informed in UNL.
  
== Natural Language ==
+
== Values ==
In the UNLarium framework, the distribution of a form is informed in the grammar, if general, or in the dictionary, in case of compounds and modifiers (articles, determiners, adjectives and adverbs) that do not follow a regular pattern.
+
In the UNL<sup>arium</sup> framework, distribution may assume the following values:
  
In English, for instance, articles are always premodifiers. Therefore, distribution of articles must not be informed in the dictionary, but stated through a rule in the grammar. The same applies to determiners (such as "this") and ordinary adjectives (such as "beautiful"), whose behaviour is default: adjectives and determiners are normally premodifiers. Only exceptions to the general rule - free order adjectives such as "possible":
 
"it is the only solution possible" or "it is the only possible solution" - must be treated in the dictionary. However, adverbs, in English,
 
may be premodifiers or postmodifiers, and it is quite difficult to predict their behaviour. Therefore, distribution of adverbs must be informed in the dictionary, and not in the grammar.
 
 
Distribution is also to be informed in the dictionary in case of [[composition|compounds]] such as "bring home the bacon" which are generated from a base form ("bring").
 
 
=== Representing distribution in the dictionary ===
 
 
{{#tree:id=DIS|openlevels=0|root=Distribution (DIS)|
 
{{#tree:id=DIS|openlevels=0|root=Distribution (DIS)|
*order (PSN)
+
*front (FRT): at the beginning of the clause
**front (FRT): at the beginning of the clause
+
*before (BEF): at the left side, before a blank space
**premodifier (BEF): coming before the modified
+
*after (AFT): at the right side, after a blank space
**postmodifier (AFT): coming after the modified
+
*immediately before (IBEF): at the left side, without any blank space
**middle position (MID): coming in the middle of the modified
+
*immediately after (IAFT): at the right side, without any blank space
**end (END): at the end of the clause
+
*middle (MID): coming in the middle
**free (FRE): coming in any position
+
*free (FRE): coming either before or after
*adjacency (PXM)
+
*end (END): at the end of the clause
**immediate (IMM): right after or right before (priority = 0)
+
**near (NEA): precedence over other constituents (except IMM) (priority = 1)
+
**distant (FAR): no precedence over other constituents (priority > 1)
+
 
}}
 
}}
  
==== Examples ====
+
== Dictionary ==
*Order
+
Distribution is to be included in the dictionary in two cases:
**very = BEF - In English, the intensifier "very" is a premodifier: ''He is very rich'' (<strike>''He is rich very''</strike>)
+
*Exceptions to the general distribution rules, such as in some free order adjectives:
**well = AFT - In English, the adverb of manner "well" is a postmodifier: ''He speaks well'' (<strike>''He well speaks''</strike>)
+
**"it is the only solution '''possible'''" or "it is the only '''possible''' solution"
**yesterday = FRE - In English, the adverb of time "yesterday" may come either before or after the modified: ''Yesterday I went'' or ''I went yesterday''.
+
*Categories with irregular distribution, such as adverbs:
*Adjacency
+
**'''Usually''' I get up early.
**the = FAR (In English, the article "the" has no precedence over other modifiers: ''the small round black leather handbag'' (<strike>''small the round black leather handbag''</strike>).
+
**I '''often''' get headaches.
**after (in "look after") = IMM (In English, the preposition "after" must come right after the base form "look" in order to form the compound "look after": ''We look after them'' (<strike>''We look them after''</strike>))
+
**She speaks English '''well'''.
**down (in "put down") = NEA (In English, the adverb "down" must come right after the base form "put" in order to form the compound "put down", except for the complement: ''Put down that'' (<strike>''Put down it''</strike>)
+
  
==== Observations ====
+
=== Examples ===
;Middle position should be used only for words to be inserted inside others (i.e., between the prefix and the root, or the root and the suffix).
+
*the
::Adverbs coming between auxiliaries and verbs must be defined as premodifiers.
+
**No distribution to be informed in the dictionary because, in English, articles are always premodifiers. The distribution of articles must be informed in the grammar.
;Distribution values are not exclusive:
+
*beautiful
:BEF&AFT means that the word may occur both as a premodifier or as postmodifier;
+
**No distribution to be informed in the dictionary because, in English, adjectives are normally premodifiers. The distribution of adjectives must be informed in the grammar. Only exceptions must be informed in the dictionary.
:BEF&MID means that the word may occur both as a premodifier or as a middle modifier.
+
*very = BEF
;Order and adjacency may be combined to express specific distributions:
+
**In English, the distribution of adverbs is quite variable, and must be informed in the dictionary. The intensifier "very" is a premodifier: ''He is very rich'' (<strike>''He is rich very''</strike>)
:BEF&IMM means that the word occurs right before the modified (as with English intensifiers)
+
*well = AFT
;Order must be informed only when required:
+
**In English, the distribution of adverbs is quite variable, and must be informed in the dictionary. The adverb of manner "well" is a postmodifier: ''He speaks well'' (<strike>''He well speaks''</strike>)
::English intensifiers must come right before the intensified ("very well"), therefore BEF&IMM;
+
*yesterday = FRE
::Adverbs of manner normally come after the complements ("She read the letter slowly"), therefore "AFT&FAR";
+
**In English, the distribution of adverbs is quite variable, and must be informed in the dictionary. The adverb of time "yesterday" may come either before or after the modified: ''Yesterday I went'' or ''I went yesterday''.
  
=== Representing distribution in the grammar ===
+
=== Observations ===
 +
;Middle position is used only for words to be inserted inside others (i.e., between the prefix and the root, or the root and the suffix).
 +
:Adverbs coming between auxiliaries and verbs must be defined as premodifiers.
 +
 
 +
== Grammar ==
 
In the grammar, distribution is defined through [[S-rule]]s in the following format:
 
In the grammar, distribution is defined through [[S-rule]]s in the following format:
  
  <SYNTACTIC ROLE>(<ORDER>,<ADJACENCY>);
+
  <SYNTACTIC ROLE>(+<DISTRIBUTION>);
  
 
Where:<br />
 
Where:<br />
*<SYNTACTIC ROLE> is the [[Syntax#Syntactic_Roles|syntactic role]] (VA, VC, VS, VH, etc) of the constituent in relation to the head; and
+
*<SYNTACTIC ROLE> is the [[Syntactic roles|syntactic role]] (VA, VC, VS, VH, etc) of the constituent in relation to the head; and
*<ORDER> is the position of the constituent in relation to the head. It may assume one of the following values:
+
*<DISTRIBUTION> is the position of the constituent in relation to the head. It may assume one of the distribution values indicated above ("FNT","BEF",">>",etc).
**'''FNT''' in the beginning of the sentence
+
**'''END''' in the end of the sentence
+
**'''BEF''' or '''<<'''  to the left before a blank space
+
**'''AFT''' or '''>>'''  to the right after a blank space
+
**'''>'''  immediately to the right (i.e., without any blank space)
+
**'''<'''  immediately to the left (i.e., without any blank space)
+
*<ADJACENCY> is the precedence of the constituent in relation to other constituents of the same phrase. It may assume one of the following values:
+
**'''IMM''' immediately: right after or right before
+
**'''NEA''' precedence over other constituents (except IMM)
+
**'''FAR''' distant: no precedence over other constituents
+
  
==== Examples ====
+
=== Examples ===
;VS(<<,IMM);
+
;VS(+BEF);
:the specifier must be generated to the left of the verb before a blank space with precedence over any other constituent
+
:the specifier must be generated to the left of the verb
;VC(>>,FAR);
+
;VC(+AFT);
:the complement must be generated to the right of the verb after a blank space without any precedence over other constituents
+
:the complement must be generated to the right of the verb
  
==== Observations ====
+
=== Observations ===
;Order and adjacency may be represented in different rules:
+
:VS(<<); (the specifier must be generated to the left of the verb before a blank space)
+
:VS(IMM); (the specifier must be generated with precedence over any other constituent)
+
 
;Complex distribution
 
;Complex distribution
 
:A single distribution rule may contain several distribution operations:
 
:A single distribution rule may contain several distribution operations:
:VS(<<)VS(IMM); (the same as "VS(<<,IMM);")
+
:*VS(+BEF)VC(+AFT); (the specifier will be generated to the left and the complement to the right of the head)
*VS(<<)VC(>>); (the specifier will be generated to the left and the complement to the right of the head)
+
*VS(FAR)VC(FAR); (both the specifier and the complement of the verb have no precedence over other constituents)
+
 
;Conditional distribution
 
;Conditional distribution
:Conditional case-marking may be stated by defining the left side of the s-rule and coindexing it to the right side:
+
:Conditional distribution may be stated by defining the left side of the s-rule and coindexing it to the right side:
*VC(>>); (unconditional distribution: the complement will be always generated to the right of the verb);
+
:*VC(+AFT); (unconditional distribution: the complement will be always generated to the right of the verb);
*VC(PPR):=VC(<<); (conditional distribution: the complement will be generated to the left of the verb if it is a personal pronoun (PPR));
+
:*VC(PPR):=VC(+BEF); (conditional distribution: the complement will be generated to the left of the verb if it is a personal pronoun (PPR));
;Adjacency
+
;Use of "+"
:Adjacency must be informed when two constituents are to be generated in the same direction (otherwise, the system will simply follow the order of application of rules defined in the grammar)
+
:As rules are conservative (i.e., features are preserved unless explicitly deleted), the use of "+" is actually optional:
*VC(>>)VA(>>,FAR); (or "VC(>>)VA(>>)VA(FAR);", i.e., the complement comes nearer the head than the adjunct)
+
:*VC(AFT); is the same as VC(+AFT);
Adjacency states a gradient of proximity and should be assigned only to differentiate the priority of generation
+
*VC(>>,NEA)VA(>>); or VC(>>)VA(>>,FAR); but there's no need for <strike>VC(>>,IMM)VA(>>,FAR);</strike>
+
*VS(>>,IMM)VC(>>,NEA)VA(>>); or VS(>>,IMM)VC(>>)VA(>>,FAR); but there's no need for <strike>VS(>>,IMM)VC(>>,NEA)VA(>>,FAR);</strike>
+
Adjacency is limited to three values (IMM, NEA, FAR) because of the binary nature of branching in the [[Syntax|X-bar approach]]. More complex structures should be reorganized as intermediary projections and only then related one another. See [[projection]] for further information.
+
 
;Reordering
 
;Reordering
:Reordering can be done in two different ways:
+
:Reordering can be done in three different ways:
*By [[Ph-rule]]s, if the process involves neighbour items and affects only the surface structure of the phrase;
+
:*By [[L-rule]]s, if the process involves neighbour constituents and affects only the surface structure of the phrase;
*By attribute change (i.e., deleting and adding distribution features), such as in "VC(->>,<<);" (i.e.,delete the "after" attribute and add the "before" attribute)
+
:*By attribute change (i.e., deleting and adding distribution features), such as in "VC(-AFT,+BEF);" (i.e.,delete the "after" attribute and add the "before" attribute), in case of neighbour constituents or neighbour projections
;The symbol '''^''' is used for negation and to control infinite recursion
+
:*By [[movement]], in case of more complex inversions and extraction of constituents
*VC(^>>):=VC(>>); (assign the "after" attribute to the complement of the verb if it does not have it yet)
+
;The symbol '''^''' is used for negation and to control infinite recursion:
 
+
:*VC(^AFT):=VC(AFT); (assign the "after" attribute to the complement of the verb if it does not have it yet)
== UNL ==
+
Word order is not informed in UNL.
+

Latest revision as of 13:38, 20 May 2010

Distribution, or precedence, refers to the order of the syntactic constituents of a language. In the UNLarium framework, distribution is specified in the grammar when it follows a general rule, and in the dictionary for exceptions and for categories that do not follow a regular distributional pattern (such as English adverbs). Distribution is not represented in UNL itself.

Values

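In the UNLarium framework, distribution may assume the following values:

  • front (FRT): at the beginning of the clause
  • before (BEF): at the left side, before a blank space
  • after (AFT): at the right side, after a blank space
  • immediately before (IBEF): at the left side, without any blank space
  • immediately after (IAFT): at the right side, without any blank space
  • middle (MID): coming in the middle
  • free (FRE): coming either before or after
  • end (END): at the end of the clause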

Dictionary

Distribution is to be included in the dictionary in two cases:

  • Exceptions to the general distribution rules, such as in some free order adjectives:
    • "it is the only solution possible" or "it is the only possible solution"
  • Categories with irregular distribution, such as adverbs:
    • Usually I get up early.
    • I often get headaches.
    • She speaks English well.

Examples

  • the
    • No distribution is specified in the dictionary because, in English, articles are always premodifiers. The distribution of articles is stated in the grammar.
  • beautiful
    • No distribution is specified in the dictionary because, in English, adjectives are normally premodifiers. The distribution of adjectives is stated in the grammar; only exceptions are specified in the dictionary.
  • very = BEF
    • In English, the distribution of adverbs is quite variable and must be specified in the dictionary. The intensifier "very" is a premodifier: "He is very rich" (not *"He is rich very").
  • well = AFT
    • In English, the distribution of adverbs is quite variable and must be specified in the dictionary. The adverb of manner "well" is a postmodifier: "He speaks well" (not *"He well speaks").
  • yesterday = FRE
    • In English, the distribution of adverbs is quite variable and must be specified in the dictionary. The adverb of time "yesterday" may come either before or after the modified: "Yesterday I went" or "I went yesterday".

Observations

Middle position is used only for words to be inserted inside others (i.e., between the prefix and the root, or the root and the suffix).
Adverbs coming between auxiliaries and verbs must be defined as premodifiers.
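
The division of labour between dictionary and grammar described above can be pictured as a lookup with a fallback. The following is only a minimal sketch in Python, not UNLarium code: the tables GRAMMAR_DEFAULTS and DICTIONARY, the category labels and the function distribution_of are hypothetical names introduced for illustration.

```python
# Hypothetical stand-ins: category defaults play the role of grammar rules,
# per-word entries play the role of dictionary exceptions.
GRAMMAR_DEFAULTS = {"article": "BEF", "adjective": "BEF"}
DICTIONARY = {"very": "BEF", "well": "AFT", "yesterday": "FRE"}


def distribution_of(word, category):
    """Return the dictionary value if the word has one, else the grammar default."""
    if word in DICTIONARY:
        return DICTIONARY[word]
    # Categories without a general rule (e.g. adverbs) yield None here.
    return GRAMMAR_DEFAULTS.get(category)


print(distribution_of("the", "article"))   # resolved by the grammar default
print(distribution_of("well", "adverb"))   # resolved by the dictionary entry
```

In this sketch an adverb with no dictionary entry gets no value at all, which mirrors the requirement that adverbs be described word by word.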

Grammar

In the grammar, distribution is defined through S-rules in the following format:

<SYNTACTIC ROLE>(+<DISTRIBUTION>);

Where:

  • <SYNTACTIC ROLE> is the syntactic role (VA, VC, VS, VH, etc) of the constituent in relation to the head; and
  • <DISTRIBUTION> is the position of the constituent in relation to the head. It may assume one of the distribution values indicated above ("FNT","BEF",">>",etc).

Examples

VS(+BEF);
the specifier must be generated to the left of the verb
VC(+AFT);
the complement must be generated to the right of the verb
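
As a rough illustration of what the two rules above ask for, the sketch below linearizes a head and its constituents in Python. It is an assumption-laden toy, not the UNLarium generator: RULES and linearize are hypothetical names, and only the BEF and AFT values are handled.

```python
# Mirrors the rules VS(+BEF); and VC(+AFT); as a role -> value table.
RULES = {"VS": "BEF", "VC": "AFT"}


def linearize(head, constituents, rules=RULES):
    """Place each (role, text) constituent to the left or right of the head."""
    before, after = [], []
    for role, text in constituents:
        value = rules[role]
        if value == "BEF":
            before.append(text)
        elif value == "AFT":
            after.append(text)
        else:
            # Other distribution values are outside the scope of this sketch.
            raise ValueError(f"unsupported distribution value: {value}")
    return " ".join(before + [head] + after)


print(linearize("speaks", [("VS", "She"), ("VC", "English")]))
# prints: She speaks English
```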

Observations

Complex distribution
A single distribution rule may contain several distribution operations:
  • VS(+BEF)VC(+AFT); (the specifier will be generated to the left and the complement to the right of the head)
Conditional distribution
Conditional distribution may be stated by defining the left side of the s-rule and coindexing it to the right side:
  • VC(+AFT); (unconditional distribution: the complement will always be generated to the right of the verb);
  • VC(PPR):=VC(+BEF); (conditional distribution: the complement will be generated to the left of the verb if it is a personal pronoun (PPR));
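
The contrast between the unconditional and the conditional rule can be sketched as a function that inspects the features of the complement. This is a hedged Python illustration only: complement_distribution is a hypothetical helper, the feature tag "NOU" is invented for the example, and the assumption that the conditional rule takes priority over the general one is mine, not the source's.

```python
def complement_distribution(features):
    """Return the distribution value for a verb complement (VC)."""
    if "PPR" in features:
        # Conditional case, as in VC(PPR):=VC(+BEF);
        return "BEF"
    # General case, as in VC(+AFT);
    return "AFT"


print(complement_distribution({"PPR"}))
print(complement_distribution({"NOU"}))
```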
Use of "+"
As rules are conservative (i.e., features are preserved unless explicitly deleted), the use of "+" is actually optional:
  • VC(AFT); is the same as VC(+AFT);
Reordering
Reordering can be done in three different ways:
  • By L-rules, if the process involves neighbour constituents and affects only the surface structure of the phrase;
  • By attribute change (i.e., deleting and adding distribution features), such as in "VC(-AFT,+BEF);" (i.e., delete the "after" attribute and add the "before" attribute), in case of neighbour constituents or neighbour projections;
  • By movement, in case of more complex inversions and extraction of constituents
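
The conservative behaviour of rules, the optional "+", and reordering by attribute change can all be pictured as operations on a set of features. The Python sketch below is an illustration under that assumption; apply_operations is a hypothetical helper and says nothing about how the real rule engine stores features.

```python
def apply_operations(features, operations):
    """Apply operations such as ["-AFT", "+BEF"] to a set of features."""
    result = set(features)  # conservative: existing features are preserved
    for op in operations:
        if op.startswith("-"):
            result.discard(op[1:])      # explicit deletion
        else:
            result.add(op.lstrip("+"))  # "AFT" and "+AFT" are equivalent
    return result


# Reordering by attribute change, as in VC(-AFT,+BEF);
print(apply_operations({"AFT"}, ["-AFT", "+BEF"]))
```

Because untouched features survive, adding a distribution value never erases unrelated information about the constituent.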
The symbol ^ is used for negation and to control infinite recursion:
  • VC(^AFT):=VC(AFT); (assign the "after" attribute to the complement of the verb if it does not have it yet)
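
Why the negated condition stops recursion can be seen in a small Python sketch. It assumes, purely for illustration, that a rule is re-applied for as long as its condition matches; apply_guarded_rule and max_passes are hypothetical names, and the real engine may schedule rules differently.

```python
def apply_guarded_rule(features, max_passes=10):
    """Simulate VC(^AFT):=VC(AFT); applied repeatedly until it no longer matches."""
    features = set(features)
    passes = 0
    # The condition ^AFT matches only while the feature is absent.
    while "AFT" not in features and passes < max_passes:
        features.add("AFT")  # the action assigns the feature
        passes += 1
    return features, passes


print(apply_guarded_rule(set()))    # the rule fires once, then stops matching
print(apply_guarded_rule({"AFT"}))  # the rule never fires
```

Once the action has added the feature, the condition is false, so the rule cannot fire again on the same constituent.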
Software