How to create a UW

From UNL Wiki
Revision as of 15:32, 14 February 2014

The UNL Dictionary is never complete. It is expected to contain all the concepts that are lexicalized in at least one language. These include:

  • local concepts (such as "ilunga", from Tshiluba, which means "a person who is ready to forgive any transgression a first time and then to tolerate it for a second time, but never for a third time");
  • local named entities (rivers, mountains, beaches, cities, states, neighborhoods, brands, companies, people, etc.);
  • local products and practices (food, clothing, rituals, festivities, etc.).

All these concepts, if lexicalized in at least one language, must be included in the UNL Dictionary as Universal Words.

Universal Word

A Universal Word is a concept endowed with semantic accessibility. The semantic accessibility is granted when the concept is introduced in the UNL Knowledge Base, i.e., when we connect the concept to the other existing concepts. Thereafter, the concept may be handled even by languages that do not have it yet.[1]




General Principles

UWs must comply with the following principles:

Non-redundancy
There must be no synonymy in the UNL Dictionary. Do not create UWs that have the same meaning as existing UWs. For instance, the English words "to die", "to croak", "to decease", "to drop dead", "to buy the farm", "to cash in one's chips", "to give up the ghost", "to kick the bucket", "to pass away", "to perish", "to snuff it", "to pop off", "to expire", "to conk", "to exit", "to choke", "to go" and "to pass", when conveying the meaning of "to pass from physical life and lose all bodily attributes and functions necessary to sustain life", must be represented by a single UW: "to die(icl>to change state)". The same applies to cross-language synonyms: the French words "mourir", "décéder", "périr", "s'éteindre" and "finir de vivre" must also be linked to the same UW, "to die(icl>to change state)", because they convey the same meaning as the English words.
Non-ambiguity
UWs cannot be ambiguous. The UW is made of two parts: the UCL (Uniform Concept Locator) and the UCN (



Non-Ambiguity and Non-Redundancy
A given sense may not be represented by more than one UW, and one UW may not have more than one sense. There is no homonymy, synonymy or polysemy in UNL.
Arbitrariness
Simple UWs are names (and not definitions) for senses. A simple UW carries little or no information about its sense; it is just a label. Any information concerning the sense is expected to be provided by the three lexical databases available inside the UNL framework: the UNL Dictionary, the UNL Knowledge Base and the UNL Memory.
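
The non-redundancy and non-ambiguity principles can be read as two checks over a table of links. The following Python fragment is only a minimal sketch under stated assumptions: the tuple layout, the sense identifier "sense:die", the function names and the second UW "to decease(icl>to change state)" are hypothetical illustrations and are not part of the UNL framework. Only the UW "to die(icl>to change state)" and the sample words come from the text above.

```python
from collections import defaultdict

# UW taken from the text above.
DIE = "to die(icl>to change state)"

# Hypothetical dictionary entries: (language, lexical item, UW).
ENTRIES = [
    ("en", "to die", DIE),
    ("en", "to pass away", DIE),
    ("en", "to kick the bucket", DIE),
    ("fr", "mourir", DIE),
    ("fr", "décéder", DIE),
]


def group_by_uw(entries):
    """Group lexical items of any language under the UW they are linked to."""
    index = defaultdict(set)
    for lang, word, uw in entries:
        index[uw].add((lang, word))
    return dict(index)


def check_principles(pairs):
    """Check a list of (UW, sense identifier) pairs.

    Returns two dicts:
      ambiguous: UWs linked to more than one sense (violates non-ambiguity)
      redundant: senses named by more than one UW (violates non-redundancy)
    """
    senses_of = defaultdict(set)
    uws_of = defaultdict(set)
    for uw, sense in pairs:
        senses_of[uw].add(sense)
        uws_of[sense].add(uw)
    ambiguous = {uw: sorted(s) for uw, s in senses_of.items() if len(s) > 1}
    redundant = {s: sorted(u) for s, u in uws_of.items() if len(u) > 1}
    return ambiguous, redundant


# A hypothetical violation: a second UW created for the same sense.
PAIRS = [
    (DIE, "sense:die"),
    ("to decease(icl>to change state)", "sense:die"),
]
```

In this sketch, group_by_uw(ENTRIES) places all five English and French items under the single UW, which is the intended situation, while check_principles(PAIRS) reports "sense:die" as redundant because two UWs name it.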

Notes

  1. Consider, for instance, the case of "ilunga". "Ilunga" is a word of Tshiluba, a language spoken in the Democratic Republic of the Congo. The concept conveyed by "ilunga" is not lexicalized in English or French, for instance. In this sense, "ilunga" is not directly translatable into these languages, i.e., we cannot simply replace "ilunga" with an English or French word. But this does not mean that English and French speakers cannot understand the idea conveyed by "ilunga". The only difference is that they will have to decompose the concept into several other discrete concepts (as in "person who is ready to forgive any transgression a first time and then to tolerate it for a second time, but never for a third time"). This is the role of the UNL Knowledge Base: to interconnect concepts in order for them to be "universally" understandable.