Lemma

From UNL Wiki

Revision as of 10:40, 26 January 2010

Lemma is the canonical (citation) form of a lexeme.

Lexemes, as sets of different word forms with different inflectional affixes but the same stem, are normally referred to by a citation (default) word form called the lemma. The lemma, more generally referred to as the headword, is essentially an abstract representation subsuming all the formal lexical variations that may apply within the same lexeme. For instance, the lexeme comprising the word forms "die", "dies", "died", "dying" is normally referred to, in English, by the lemma "die".

How to create a lemma

Lemmas may vary from language to language. In English, for instance, the lemma of a verbal lexeme is the infinitive form ("love", "be"); in Latin, it is the first person singular of the present indicative ("amo", "sum"). In the UNLarium framework, the lemma is expected to be the most common citation form of a given lexeme in the lexicographical tradition of the working language (i.e., the infinitive in English, the first person singular in Latin, and so on), provided that:

The lemma should be a word form (i.e., not a root or an affix)
    The lemma of the lexeme "die, dies, died, dying" should be "die" and not "d-".
The lemma should be as complete as possible
    The lemma of the lexeme "kick the bucket, kicks the bucket, kicking the bucket, kicked the bucket" should be "kick the bucket" and not "kick" or "bucket".
    The lemma of the lexeme "me souviens, te souviens, se souvient, etc" (French, "remember") should be "se souvenir" (and not "souvenir").
The lemma must include obligatory (and only obligatory) variables
    The lemma of the lexeme "behind someone's back" should be "behind <person>'s back" (there is no "behind back");
    However, the lemma of the lexeme "take something into account, taking something into account, etc" should be "take into account" (because there can be "take into account", as in "take into account that ...").
        Obligatory variables, if any, must be expressed by the corresponding value between < >. The values must be expressed in the working language in lower case letters: "person", "personne", "pessoa", etc.
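
The purely formal parts of these rules can be checked mechanically. The following Python sketch is only an illustration and is not part of the UNLarium: the function name check_lemma and the error messages are hypothetical, and treating a leading or trailing hyphen as the sign of a root or affix is an assumption based on the "d-" example. Whether a lemma is complete, or whether a variable is really obligatory, is a lexicographic judgement that this sketch does not attempt.

```python
import re

# Hypothetical pattern: one variable slot written between < >, e.g. <person>.
VARIABLE = re.compile(r"<([^<>]*)>")


def check_lemma(lemma: str) -> list:
    """Return a list of formal problems found in a candidate lemma."""
    stripped = lemma.strip()
    if not stripped:
        return ["lemma is empty"]

    problems = []

    # Assumption: a leading or trailing hyphen (as in "d-") marks a root
    # or an affix rather than a word form.
    if stripped.startswith("-") or stripped.endswith("-"):
        problems.append("looks like a root or affix, not a word form")

    # Variable values must be written between < > in lower case letters.
    for value in VARIABLE.findall(stripped):
        if not value:
            problems.append("empty variable")
        elif value != value.lower():
            problems.append(f"variable <{value}> is not lower case")

    # Any angle bracket left after removing well-formed variables is stray.
    remainder = VARIABLE.sub("", stripped)
    if "<" in remainder or ">" in remainder:
        problems.append("unbalanced angle brackets")

    return problems
```

Under these assumptions, check_lemma("behind <person>'s back") returns an empty list, while check_lemma("d-") reports a probable root or affix.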

Examples

lexeme | word forms | lemma
1 | here | here
2 | happy | happy
3 | unhappy | unhappy
4 | table, tables | table
5 | love, loves, loving, loved | love
6 | am, be, is, are, was, were, being, been | be
7 | fireman, firemen | fireman
8 | kick the bucket, kicks the bucket, kicking the bucket, etc | kick the bucket
9 | take into account, takes into account, taking into account, etc | take into account
10 | behind one's back | behind <person>'s back
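
The relation between word forms and lemma shown in the table can be sketched as a simple lookup. The Python snippet below is a minimal illustration, not UNLarium code: the names LEXEMES, build_index and lemma_of are hypothetical, and the data is copied from rows 4 to 9 of the table, listing only the word forms spelled out there (the forms implied by "etc" are left out).

```python
# Each key is a lemma; each value lists the word forms of its lexeme.
# Data copied from rows 4-9 of the table above.
LEXEMES = {
    "table": ["table", "tables"],
    "love": ["love", "loves", "loving", "loved"],
    "be": ["am", "be", "is", "are", "was", "were", "being", "been"],
    "fireman": ["fireman", "firemen"],
    "kick the bucket": [
        "kick the bucket", "kicks the bucket", "kicking the bucket",
    ],
    "take into account": [
        "take into account", "takes into account", "taking into account",
    ],
}


def build_index(lexemes):
    """Invert the lemma -> word forms mapping into word form -> lemma."""
    index = {}
    for lemma, forms in lexemes.items():
        for form in forms:
            index[form] = lemma
    return index


INDEX = build_index(LEXEMES)


def lemma_of(word_form):
    """Return the lemma for a listed word form, or None if it is unknown."""
    return INDEX.get(word_form.lower())
```

For example, lemma_of("were") gives "be" and lemma_of("firemen") gives "fireman". A word form that is not listed yields None.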
Software