Inflectional paradigms

From UNL Wiki
(Difference between revisions)
(Syntax)
Line 3: Line 3:
 
== When to use inflectional paradigms ==
 
== When to use inflectional paradigms ==
  
Inflectional paradigms must be used in the case of inflectional words (such as nouns, adjectives and verbs), regardless of whether they are regular or not.
+
Inflectional paradigms must be created in case of regular or almost regular inflectional behavior, i.e., whenever a regular pattern can be stated for inflecting words, such as nouns, adjectives and verbs.
  
 
== When not to use inflectional paradigms ==
 
== When not to use inflectional paradigms ==
  
Inflectional paradigms should not be used in the case of non-inflectional words (such as adverbs) or words that are already inflected (such as personal pronouns).
+
Inflectional paradigms should not be used in case of non-inflectional words (such as adverbs) or words that are already inflected (such as personal pronouns).
 +
Inflectional paradigms should also be avoided in case of irregular behavior, which should be described rather by [[inflectional rules]].  
  
== Syntax ==  
+
== Example ==
  
Inflectional paradigm rules follow the UNL syntactic general formalism:
+
The plural of English nouns is considerably regular and can be treated, in most cases, by the following '''inflectional paradigms''':
  
<DICTIONARY ATTRIBUTE VALUES> “:=” <ACTION> [“,” <ACTION>]* ";"
+
{|
 
+
| Paradigm
where
+
| Rule
:<DICTIONARY ATTRIBUTE VALUES> is a set of dictionary tags extracted from the [[UNL Dictionary Tagset]]
+
| Description
:<ACTION> is the action to be performed in the event of the dictionary value (see below)
+
| Example
:“ “ = constant
+
:[ ] = optional
+
:<nowiki>*</nowiki> to be repeated zero or more times
+
 
+
== Dictionary Attribute Values ==
+
The dictionary attribute values should comply with the [[UNL Dictionary Tagset]]. They can be used in isolation or conjoined by “&”.
+
PLR (= PLURAL)
+
1PS&ET1&IND (= FIRST PERSON SINGULAR [1PS] + PRESENT [ET1] + INDICATIVE [IND])
+
 
+
== Actions ==
+
There are three different types of actions that can be performed over the entries. The syntax for each of them is depicted below:
+
{| border="1" align="center" cellpadding="5"
+
!Type
+
!Syntax
+
 
|-
 
|-
|right appending
+
| 1
|<RIGHT DELETION>”>”<RIGHT ADDITION>
+
| PLR:=0>"s";
 +
| Add "s" to the end of the word
 +
| boy > boys
 +
|-
 +
| 2
 +
| PLR:="y">"ies";
 +
| Replace "y" by "ies" at the end of the word
 +
| city > cities
 
|-
 
|-
|left appending
+
| 3
|<LEFT ADDITION>”<”<LEFT DELETION>
+
| PLR:=0>"es";
 +
| Add "es" to the end of the word
 +
| kiss > kisses
 
|-
 
|-
|replacement
+
| 4
|<SOURCE>”:”<TARGET>
+
| PLR:="f">"ves";
 +
| Replace "f" by "ves" at the end of the word
 +
| wolf > wolves
 
|}
 
|}
  
where
+
However, there are several special cases that, being very limited, should be treated by '''inflectional rules''' instead of inflectional paradigms:
;<LEFT DELETION>
+
:the string or the number of characters from the beginning of the entry to be deleted before the addition of the LEFT ADDITION.
+
;<LEFT ADDITION>
+
:the string to be added to the beginning of the entry along with its corresponding features
+
;<RIGHT DELETION>
+
:the string or the number of characters from the end of the entry to be deleted before the addition of the RIGHT ADDITION.
+
;<RIGHT ADDITION>
+
:the string to be added to the end of the entry along with its corresponding features
+
;<SOURCE>
+
:the string to be replaced (if empty, it means that the whole string will be replaced).
+
;<TARGET>
+
:the string to be used instead of the source (if empty, it means that the whole entry should be deleted)
+
  
=== Observations ===
+
{|
: Strings must come between double quotes.
+
| Rule
: <LEFT ADDITION> and <RIGHT ADDITION> must come between parentheses.
+
| Description
: <LEFT ADDITION> and <RIGHT ADDITION> may have as many features as necessary, provided that they are separated by ",".
+
| Case
: Features must comply with the values defined in the [[UNL Dictionary Tagset]].
+
: <LEFT ADDITION> and <RIGHT ADDITION> may be split into several different nodes, each of which enclosed between parentheses.
+
: <LEFT DELETION> and <RIGHT DELETION> may be empty (or equal to 0) if nothing is to be deleted.
+
: <SOURCE> may also be the interval of characters to be replaced. In this case, the positions of the first and last characters should be given between square brackets, separated by a semicolon.
+
: Blank spaces are not inserted automatically. They can be inserted either as a string (" ") or as a feature (BLK).
+
: [Square brackets] may be used to indicate optional elements: a[b]c = ac, abc
+
: {braces} may be used to indicate alternative elements: a{b,c}d = abd, acd
+
: Phrase types (NP, PP, VP, CP, AP, JP, SP) may be used to indicate embedded phrases in separable words or multiword expressions.
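
The bracket and brace notations in the observations above can be illustrated with a small Python sketch. It is not part of the UNL tools: the function name <code>expand</code> is hypothetical, and the sketch assumes that brackets and braces are not nested.

```python
import re
from itertools import product
from typing import List

# One token is either [optional], {alt1,alt2,...} or a run of plain characters.
# Assumption: no nesting of brackets or braces.
TOKEN = re.compile(r'\[([^\[\]{}]*)\]|\{([^\[\]{}]*)\}|([^\[\]{}]+)')

def expand(pattern: str) -> List[str]:
    """Return every string described by a pattern such as a[b]c or a{b,c}d."""
    choices = []
    pos = 0
    for match in TOKEN.finditer(pattern):
        if match.start() != pos:
            raise ValueError(f"unbalanced bracket or brace in {pattern!r}")
        pos = match.end()
        optional, alternatives, literal = match.groups()
        if optional is not None:
            # [b] means "nothing" or "b"
            choices.append(["", optional])
        elif alternatives is not None:
            # {b,c} means "b" or "c"
            choices.append(alternatives.split(","))
        else:
            choices.append([literal])
    if pos != len(pattern):
        raise ValueError(f"unbalanced bracket or brace in {pattern!r}")
    return ["".join(parts) for parts in product(*choices)]
```

With this sketch, <code>expand("a[b]c")</code> yields <code>ac</code> and <code>abc</code>, and <code>expand("a{b,c}d")</code> yields <code>abd</code> and <code>acd</code>, matching the two observations.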
+
 
+
== Examples ==
+
{| border="1" align="center" cellpadding="5"
+
!Type
+
!Rule
+
!Behavior
+
!Examples
+
 
|-
 
|-
|right appending
+
| PLR:="men";
|PLR:="y">"ies"
+
| Replace the whole word by "men"
|in case of the feature “PLR” (=plural), the rightmost "y" will be deleted and the "ies" string will be added to the right of the entry
+
| man > men
|baby>babies, lady>ladies
+
|-
 +
| PLR:="mice";
 +
| Replace the whole word by "mice"
 +
| mouse > mice
 
|-
 
|-
|right appending
+
| PLR:="feet";
|PLR:=1>"ies"
+
| Replace the whole word by "feet"
|in case of the feature “PLR” (=plural), the rightmost character will be deleted and the "ies" string will be added to the right of the entry
+
| foot > feet
|baby>babies, lady>ladies
+
 
|-
 
|-
|left appending
+
| PLR:="children";
|NOT:="un"<
+
| Replace the whole word by "children"
|in case of the feature NOT (=negation), the string "un" will be added to the left of the entry, and nothing will be deleted
+
| child > children
|dress>undress
+
 
|-
 
|-
|left appending
+
| ...
|NOT:="un"<0
+
| ...
|in case of the feature NOT (=negation), the string "un" will be added to the left of the entry, and nothing will be deleted
+
| ...
|dress>undress
+
|-
+
|replacement
+
|PLR:="oo":"ee"
+
|in case of the feature "PLR” (=plural), the "oo" string will be replaced by "ee"
+
|foot>feet, tooth>teeth
+
|-
+
|replacement
+
|PLR:=[2;3]:"ee"
+
|in case of the feature "PLR” (=plural), the string "ee" will replace the string that goes from the second to the third character
+
|foot>feet, tooth>teeth
+
|-
+
|replacement
+
|1PS&ET1&IND:="am"
+
|in case of the features “1PS” (=first person singular) AND “ET1” (=present tense) AND “IND” (=indicative), the whole string will be replaced by “am”
+
|be>am
+
 
|}
 
|}
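
The three action types illustrated in the table above can be sketched in Python as follows. This is only an illustration, not the UNL implementation: the function names are hypothetical, the rules are passed in already split into their parts (no rule parsing is attempted), and replacing only the first occurrence of the source string is an assumption. The interval is read as 1-based and inclusive, as suggested by the "second to the third character" example.

```python
def right_append(entry, deletion, addition):
    """<RIGHT DELETION> > <RIGHT ADDITION>: deletion is a string or a character count."""
    if isinstance(deletion, int):
        # A number: drop that many characters from the end (0 = nothing).
        stem = entry[: len(entry) - deletion]
    else:
        if not entry.endswith(deletion):
            raise ValueError(f"{entry!r} does not end with {deletion!r}")
        stem = entry[: len(entry) - len(deletion)]
    return stem + addition

def left_append(entry, addition, deletion=0):
    """<LEFT ADDITION> < <LEFT DELETION>: deletion is a string or a character count."""
    if isinstance(deletion, int):
        # A number: drop that many characters from the beginning (0 = nothing).
        rest = entry[deletion:]
    else:
        if not entry.startswith(deletion):
            raise ValueError(f"{entry!r} does not start with {deletion!r}")
        rest = entry[len(deletion):]
    return addition + rest

def replace(entry, source, target):
    """<SOURCE> : <TARGET>: source is a string, a (start, end) interval, or None."""
    if source is None:
        # Empty source: the whole entry is replaced.
        return target
    if isinstance(source, tuple):
        start, end = source  # 1-based, inclusive
        return entry[: start - 1] + target + entry[end:]
    if source not in entry:
        raise ValueError(f"{source!r} not found in {entry!r}")
    # Assumption: only the first occurrence is replaced.
    return entry.replace(source, target, 1)
```

For instance, <code>right_append("baby", "y", "ies")</code> and <code>right_append("baby", 1, "ies")</code> both give <code>babies</code>, <code>left_append("dress", "un")</code> gives <code>undress</code>, and <code>replace("tooth", (2, 3), "ee")</code> gives <code>teeth</code>.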
 +
 +
Choosing between inflectional paradigms and inflectional rules is mainly a question of range. If a rule is applicable to several different words, it should be defined as a general inflectional paradigm; if it is applicable to a single word or to a very limited number of cases, it should be defined as an inflectional rule inside the entry itself.
 +
 +
== Syntax ==
 +
 +
Inflectional paradigm rules (as well as inflectional rules) should comply with '''[[f-mor]]''', the formalism for writing morphological rules in the UNL framework.
 +
 +
== Predefined paradigms ==
 +
 +
There are two predefined paradigms in the UNLarium:
 +
;INVARIANT
 +
: If the word is not inflectional (case of adverbs in English, for instance) or does not accept any inflectional variant (case of "clothes", used only in plural, or "species", which has the same form in singular and plural). In this latter case, the field "Descriptive Morphology" should make the value of the lemma explicit.
 +
;IRREGULAR
 +
: If the word does not follow (entirely) an existing paradigm, as in irregular forms (such as "man", "mouse", "foot" and "child" listed above). In this case, the corresponding inflectional rules should be provided in the field "Inflectional Rules".

Revision as of 11:22, 14 September 2009

Inflectional paradigms are used to generate the inflected forms out of the lemma.

When to use inflectional paradigms

Inflectional paradigms must be created in case of regular or almost regular inflectional behavior, i.e., whenever a regular pattern can be stated for inflecting words, such as nouns, adjectives and verbs.

When not to use inflectional paradigms

Inflectional paradigms should not be used in case of non-inflectional words (such as adverbs) or words that are already inflected (such as personal pronouns). Inflectional paradigms should also be avoided in case of irregular behavior, which should instead be described by inflectional rules.

Example

The plural of English nouns is considerably regular and can be treated, in most cases, by the following inflectional paradigms:

Paradigm | Rule | Description | Example
1 | PLR:=0>"s"; | Add "s" to the end of the word | boy > boys
2 | PLR:="y">"ies"; | Replace "y" by "ies" at the end of the word | city > cities
3 | PLR:=0>"es"; | Add "es" to the end of the word | kiss > kisses
4 | PLR:="f">"ves"; | Replace "f" by "ves" at the end of the word | wolf > wolves
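
As an illustration only, the four paradigm rules above can be applied with a few lines of Python. The parser below is a hypothetical sketch that handles just the right-appending shape seen in the table (a feature, then 0 or a quoted string to delete, then a quoted string to add); it is not the f-mor implementation and covers none of the other rule forms.

```python
import re

# Hypothetical pattern for rules shaped like  PLR:=0>"s";  or  PLR:="y">"ies";
RULE = re.compile(
    r'^(?P<feature>[A-Z0-9&]+):=(?P<deletion>0|"[^"]*")>"(?P<addition>[^"]*)";$'
)

def apply_right_appending(rule: str, lemma: str) -> str:
    """Apply one right-appending paradigm rule to a lemma."""
    match = RULE.match(rule)
    if match is None:
        raise ValueError(f"unsupported rule: {rule!r}")
    deletion = match.group("deletion")
    addition = match.group("addition")
    if deletion != "0":
        # A quoted string: it must be the ending of the lemma and is removed.
        suffix = deletion.strip('"')
        if not lemma.endswith(suffix):
            raise ValueError(f"{lemma!r} does not end with {suffix!r}")
        lemma = lemma[: len(lemma) - len(suffix)]
    return lemma + addition
```

Called as `apply_right_appending('PLR:="y">"ies";', "city")`, the sketch returns `cities`; a lemma that does not end in the string to be deleted is rejected rather than silently changed.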

However, there are several special cases that, being very limited, should be treated by inflectional rules instead of inflectional paradigms:

Rule | Description | Case
PLR:="men"; | Replace the whole word by "men" | man > men
PLR:="mice"; | Replace the whole word by "mice" | mouse > mice
PLR:="feet"; | Replace the whole word by "feet" | foot > feet
PLR:="children"; | Replace the whole word by "children" | child > children
... | ... | ...

Choosing between inflectional paradigms and inflectional rules is mainly a question of range. If a rule is applicable to several different words, it should be defined as a general inflectional paradigm; if it is applicable to a single word or to a very limited number of cases, it should be defined as an inflectional rule inside the entry itself.

Syntax

Inflectional paradigm rules (as well as inflectional rules) should comply with f-mor, the formalism for writing morphological rules in the UNL framework.

Predefined paradigms

There are two predefined paradigms in the UNLarium:

INVARIANT
If the word is not inflectional (case of adverbs in English, for instance) or does not accept any inflectional variant (case of "clothes", used only in plural, or "species", which has the same form in singular and plural). In this latter case, the field "Descriptive Morphology" should make the value of the lemma explicit.
IRREGULAR
If the word does not follow (entirely) an existing paradigm, as in irregular forms (such as "man", "mouse", "foot" and "child" listed above). In this case, the corresponding inflectional rules should be provided in the field "Inflectional Rules".
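
How the two predefined paradigms interact with ordinary ones can be sketched as below. Everything in this Python fragment is an assumption made for illustration: the `Entry` class, the `PARADIGMS` table and the `inflect` function are hypothetical names, the `inflectional_rules` mapping merely stands in for the "Inflectional Rules" field, and the UNLarium may store and resolve these values differently.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical shared paradigms: paradigm name -> feature -> inflecting function.
PARADIGMS: Dict[str, Dict[str, Callable[[str], str]]] = {
    "1": {"PLR": lambda lemma: lemma + "s"},
    # Assumes the lemma ends in "y", as in the paradigm table.
    "2": {"PLR": lambda lemma: lemma[:-1] + "ies"},
}

@dataclass
class Entry:
    lemma: str
    paradigm: str
    # Stand-in for the "Inflectional Rules" field: feature -> whole-word form.
    inflectional_rules: Dict[str, str] = field(default_factory=dict)

def inflect(entry: Entry, feature: str) -> str:
    if entry.paradigm == "INVARIANT":
        # No inflectional variant: the form never changes.
        return entry.lemma
    if entry.paradigm == "IRREGULAR":
        # Irregular words carry their own rules inside the entry.
        return entry.inflectional_rules[feature]
    # Regular words share a general paradigm.
    return PARADIGMS[entry.paradigm][feature](entry.lemma)
```

Under these assumptions, `inflect(Entry("man", "IRREGULAR", {"PLR": "men"}), "PLR")` returns `men`, while `inflect(Entry("species", "INVARIANT"), "PLR")` returns the lemma unchanged.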