Semantic network

From UNL Wiki
(Difference between revisions)
Line 1: Line 1:
The main goal of the UNL is to represent, in a machine-tractable format, natural language '''meaning''', i.e., the '''information''' conveyed by natural language documents. In the UNL framework, this information is represented by a '''semantic network''', a network which represents semantic relations between concepts. This semantic network, or '''UNL graph''', is made of three different types of discrete semantic entities: [[Universal Words]], [[Universal Relations]] and [[Universal Attributes]]. Universal Words, or simply UW's, are nodes in the network; Universal Relations are arcs linking UW's; and Universal Attributes are used to delimit the use of UW's. This three-layered representation model is the cornerstone of the UNL, and a distinctive feature over other semantic networks, which normally propose only two levels: edges and vertices.
+
The main goal of the UNL is to represent, in a machine-tractable format, natural language '''meaning''', i.e., the '''information''' conveyed by natural language documents. In the UNL framework, this information is represented by a '''semantic network''', a network which represents semantic relations between concepts. This semantic network, or '''UNL graph''', is made of three different types of discrete semantic entities: [[Universal Words]], [[Universal Relations]] and [[Universal Attributes]]. Universal Words, or simply UW's, are the nodes in the semantic network; Universal Relations are arcs linking UW's; and Universal Attributes are used to delimit the use of UW's. This three-layered representation model is the cornerstone of the UNL, and a distinctive feature over other semantic networks, which normally propose only two levels: edges and vertices.
  
 +
However, this three-layered representation poses several problems to the [[UNLization]] as the distinction between what is supposed to be represented by each unit is not always clear. One  difficulty concerns what is to be represented as a UW (i.e., as a node in the UNL graph) and what is to be represented as a relation between UW's. How many UW's are there, for instance, in the sentence "Charles Dickens was the author of Oliver Twist"? Should "author" be represented as a UW or as a relation between "Charles Dickens" and "Oliver Twist"? Should the verb "to be" be represented as a UW or as a relation between "Charles Dickens" and "author"? Should the preposition "of" be represented as a UW or as a relation between "author" and "Oliver Twist"?
  
== Universal Words (UWs) ==
+
Given the difficulty to clearly define the concept of "concept", the UNL assumes the following principles:
UWs are expected to be associated with [[LRU|lexical realisation units]] in the UNL-NL Dictionary, which is a bidirectional bilingual dictionary mapping lexical items between UNL and NL. A single UW may correspond to several different natural language entries (synonymy), and a single open-class natural language entry may correspond to several UWs (homography). Entries from closed classes are not associated with UWs, but with relations or attributes. Numerals (such as "six", "sixth", "6"), formulae (H<sub>2</sub>O) and untranslatable expressions (such as "http://www.unlweb.net") are represented as [[temporary UWs]], i.e., they are not expected to be included in the UNL-NL dictionaries. The same applies to most proper names. Temporary UWs are automatically assigned the feature TEMP, and may be addressed by named entity recognition modules in UNL-based applications.
+
#If the information can be conveyed, in any language, by inflectional affixes, it is represented by attributes;
 +
#Otherwise, if the information can be conveyed, in any language, by open lexical categories (nouns, adjectives, adverbs and verbs), it is represented by UW's;
 +
#Otherwise, the information is represented by relations.
  
== Universal Attributes ==
+
Let's consider, for instance, the case of "Charles Dik
Universal Attributes normally convey information that may be directly associated to grammar categories, such as [[aspect]], [[degree]], [[gender]], [[number]], [[tense]], [[mood]], [[register]], [[voice]] and [[social deixis]]. This association is made through [[D-rule]]s, such as the following:
+
@pl = PLR;
+
@past = PAS;
+
@passive = PSV;
+
@male = MCL;
+
@past,@progressive = PAS,PGS;
+
Some attributes, however, cannot be directly assigned to any value, and are rather treated as features to be addressed by [[L-rule]]s or [[S-rule]]s:
+
(@ellipsis):=("",-@ellipsis); (L-rule)
+
(@square_bracket,%ref):= ("[")(%ref,-@square_bracket)("]"); (L-rule)
+
VC(@emphasis,%comp):=+IS(%comp)VC(%comp,TRACE); (S-rule)
+
 
+
== Universal Relations ==
+
Universal Relations normally convey information that can be associated to [[S-rule]]s:
+
agt(%source;%target):=VS(%source;%target);
+
tim(%source;%target):=VA(%source;PC([in];%target));
+

Revision as of 20:39, 18 September 2013

The main goal of the UNL is to represent, in a machine-tractable format, natural language meaning, i.e., the information conveyed by natural language documents. In the UNL framework, this information is represented by a semantic network, that is, a network of semantic relations between concepts. This semantic network, or UNL graph, is made of three different types of discrete semantic entities: Universal Words, Universal Relations and Universal Attributes. Universal Words, or simply UWs, are the nodes in the semantic network; Universal Relations are arcs linking UWs; and Universal Attributes are used to delimit the use of UWs. This three-layered representation model is the cornerstone of the UNL, and it distinguishes the UNL from other semantic networks, which normally provide only two levels: edges and vertices.
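
The three kinds of entity described above can be pictured as a small data structure. The following Python sketch is only an illustration and not part of any UNL tool: the class names, the node ids ("w1", "w2") and the sample labels ("write", "Dickens", "agt", "@past") are assumptions chosen for the example, and the sample graph is not claimed to be an official UNL encoding of any sentence.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class UniversalWord:
    """A node in the graph: a concept plus the attributes that delimit its use."""
    label: str
    attributes: Set[str] = field(default_factory=set)


@dataclass(frozen=True)
class UniversalRelation:
    """A directed, labelled arc between two nodes, referenced by node id."""
    label: str
    source: str
    target: str


@dataclass
class UNLGraph:
    """Hypothetical container: nodes keyed by id, arcs kept in insertion order."""
    nodes: Dict[str, UniversalWord] = field(default_factory=dict)
    arcs: List[UniversalRelation] = field(default_factory=list)

    def add_word(self, node_id: str, label: str, *attributes: str) -> None:
        self.nodes[node_id] = UniversalWord(label, set(attributes))

    def relate(self, label: str, source: str, target: str) -> None:
        # Arcs may only link nodes that already exist in the graph.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both ends of a relation must be existing nodes")
        self.arcs.append(UniversalRelation(label, source, target))


# Illustrative usage with assumed labels.
graph = UNLGraph()
graph.add_word("w1", "write", "@past")   # attribute attached to the node
graph.add_word("w2", "Dickens")
graph.relate("agt", "w1", "w2")          # arc linking the two nodes
```

Keeping attributes on the node rather than on the arc mirrors the description above, in which attributes delimit the use of a UW while relations only link UWs.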

However, this three-layered representation poses several problems for UNLization, because it is not always clear what each type of unit is supposed to represent. One difficulty concerns what is to be represented as a UW (i.e., as a node in the UNL graph) and what is to be represented as a relation between UWs. How many UWs are there, for instance, in the sentence "Charles Dickens was the author of Oliver Twist"? Should "author" be represented as a UW or as a relation between "Charles Dickens" and "Oliver Twist"? Should the verb "to be" be represented as a UW or as a relation between "Charles Dickens" and "author"? Should the preposition "of" be represented as a UW or as a relation between "author" and "Oliver Twist"?

Given the difficulty of clearly defining the concept of "concept", the UNL adopts the following principles:

  1. If the information can be conveyed, in any language, by inflectional affixes, it is represented by attributes;
  2. Otherwise, if the information can be conveyed, in any language, by open lexical categories (nouns, adjectives, adverbs and verbs), it is represented by UWs;
  3. Otherwise, the information is represented by relations.
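
The three principles above form an ordered decision procedure, which can be sketched as follows. This is a minimal illustration: the function name and its two boolean parameters are hypothetical, and the hard part (deciding whether some language can express the information inflectionally or with an open-class word) is assumed to be a linguistic judgment supplied by the caller.

```python
def choose_unl_entity(inflectional_in_some_language: bool,
                      open_class_in_some_language: bool) -> str:
    """Return which UNL entity type should carry a piece of information.

    The checks run in the order of the principles, so principle 1 wins
    even when principle 2 would also apply.
    """
    if inflectional_in_some_language:      # principle 1
        return "attribute"
    if open_class_in_some_language:        # principle 2
        return "UW"
    return "relation"                      # principle 3


# The order of the checks matters when both conditions hold.
print(choose_unl_entity(True, True))    # attribute
print(choose_unl_entity(False, True))   # UW
print(choose_unl_entity(False, False))  # relation
```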

Let's consider, for instance, the case of "Charles Dik
