Universal Words

From UNL Wiki
Revision as of 10:31, 14 January 2010 by Admin (Talk | contribs)

Universal Words, or simply UWs, are the words of UNL. They correspond to the nodes of a UNL graph, which are interlinked by relations or modified by attributes. UWs are labels for relatively stable units of knowledge (the concepts) that can be associated with the open lexical categories of natural languages (noun, verb, adjective and adverb). The syntax of UWs is defined by the UNL Specs, but the set of UWs is relatively open and is listed in the UNL Dictionary. Additionally, UWs are organized in a hierarchy (the UNL Ontology), defined in the UNL Knowledge Base and explained in the UNL Encyclopedia, which are the lexical databases for UNL.

Syntax

UWs can be either simple (atomic) or complex (made out of other UWs). In the latter case, they are represented as hypernodes, i.e., subhypergraphs, and follow the syntax for UNL Sentences. A simple UW is an integer which can also be represented, for better readability, as a unique character-string split into two different parts: a root and a suffix. The root can be a word, an expression, a phrase or even an entire sentence in any language. It should be interpreted as a label for a concept. The suffix is used to disambiguate the root.

The syntax for UWs is defined as follows:

UNL REPRESENTATION

<UW> ::= <integer>


NL REPRESENTATION

<UW> ::= <root> <suffix>
<root> ::= <character>+
<suffix> ::= <relation> { ">" , "<" } <root>
<relation> ::= {"agt", "and", "aoj", ...}


where:
+ to be repeated 1 or more times
< > variable
" " terminal symbol
::= ... is defined as ...
| or
[ ] optional element
{ } alternative element
... to be repeated more than 0 times
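
As an illustration of the NL representation, the following Python sketch splits a simple UW string into its root and suffix parts. It is not part of the UNL Specs: the names SimpleUW and parse_uw are hypothetical, the parentheses around the suffix are assumed from the examples on this page (the grammar above does not show them), and because the list of relations is elided here, any lowercase label is accepted as a relation.

```python
import re
from typing import NamedTuple


class SimpleUW(NamedTuple):
    """Parts of a simple UW in NL representation (hypothetical helper type)."""
    root: str         # label for the concept, e.g. "table"
    relation: str     # relation used in the suffix, e.g. "icl"
    direction: str    # ">" or "<"
    suffix_root: str  # root that disambiguates the first one, e.g. "furniture"


# Assumption: the suffix is wrapped in parentheses, as in "table(icl>furniture)".
# Assumption: a relation is any run of lowercase ASCII letters, since the full
# relation list is not given on this page.
_UW_PATTERN = re.compile(
    r"""^(?P<root>[^()<>]+)
        \(
        (?P<relation>[a-z]+)
        (?P<direction>[<>])
        (?P<suffix_root>[^()<>]+)
        \)$""",
    re.VERBOSE,
)


def parse_uw(text: str) -> SimpleUW:
    """Split a simple UW such as 'table(icl>furniture)' into its parts."""
    match = _UW_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a simple UW in NL representation: {text!r}")
    return SimpleUW(**match.groupdict())


print(parse_uw("table(icl>furniture)"))
# SimpleUW(root='table', relation='icl', direction='>', suffix_root='furniture')
```

Complex UWs (hypernodes) follow the syntax for UNL Sentences and are outside the scope of this sketch.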

Examples

UNL Representation    NL Representation
104379964             table(icl>furniture)
                      table(icl>mobilier)
                      mesa(icl>mobiliario)
                      Tisch(icl>Möbel)
                      стол(icl>мебель)
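
The relationship between the two representations can be sketched as a lookup table in Python. This is a minimal sketch, assuming that all five labels listed above are alternative NL representations of the single integer shown; the names UW_LABELS and build_label_index are hypothetical and do not come from the UNL Dictionary.

```python
from typing import Dict, List

# Assumption: every label below stands for the same concept, 104379964.
UW_LABELS: Dict[int, List[str]] = {
    104379964: [
        "table(icl>furniture)",
        "table(icl>mobilier)",
        "mesa(icl>mobiliario)",
        "Tisch(icl>Möbel)",
        "стол(icl>мебель)",
    ],
}


def build_label_index(labels_by_id: Dict[int, List[str]]) -> Dict[str, int]:
    """Invert the table so that each NL label maps back to its integer UW."""
    index: Dict[str, int] = {}
    for uw_id, labels in labels_by_id.items():
        for label in labels:
            # A label must be a unique character string, so it may not
            # point to two different integers.
            if label in index and index[label] != uw_id:
                raise ValueError(f"label {label!r} maps to more than one UW")
            index[label] = uw_id
    return index


LABEL_TO_ID = build_label_index(UW_LABELS)
print(LABEL_TO_ID["mesa(icl>mobiliario)"])  # 104379964
```

Keeping the integer as the key reflects the idea that the integer is the UW itself, while the character strings are readable labels for it.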


Semantics

Like natural language words, UWs represent concepts (or sets of concepts). Although these concepts may look very similar from culture to culture, they are generally said to be culture-dependent, in that each culture leads to a particular way of perceiving and categorizing the world. In principle, the set of UWs, which is the UNL Dictionary, is supposed to be as comprehensive as the set of individual concepts depicted by different cultures, no matter how specific they are. In that sense, UWs are not to be considered semantic primitives, nor should they represent only common concepts. They must include culture-dependent information and every relevant variation among similar concepts. Furthermore, the UNL Dictionary constitutes an open set, subject to permanent growth with new UWs, as UNL is supposed to continually incorporate new cultures and cultural changes.

Software