Lexica

From UNL Wiki
(Difference between revisions)
(UNL-NL)
Line 1: Line 1:
The UNL framework contains three different types of lexical databases: UNL-only, NL-only and UNL-NL.
+
The [[UNL System]] contains three different types of lexical databases: dictionaries, knowledge bases and example bases.  
  
== UNL ==
+
== Dictionaries ==
The UNL lexical databases are the following:
+
In the UNL System, a dictionary is a flat list of entries with their corresponding features. The dictionaries must comply with the structure defined in the [[Dictionary Specs]] and must contain only tags defined in the [[Tagset]]. They are normally provided through the [[UNLarium]], and are divided into three categories:
*The [[UNL Dictionary]], or simply UNL<sup>dic</sup>, is a flat list of UW's and their semantic features
+
*The [[UNL Dictionary]], or simply UNL<sup>dic</sup>, is a list of UW's and their semantic (language-independent) features
*The [[UNL Knowledge Base]], or simply UNL<sup>kb</sup>, is a network with systematic relations between UW's
+
*The [[NL Dictionary]], or simply NL<sup>dic</sup>, is a list of natural language entries with the corresponding morphological and syntactic (language-dependent) features
*The [[UNL Example Base]], or simply UNL<sup>eb</sup>, is a network with any relations between UW's  
+
*The [[UNL-NL Dictionary]], or simply UNL-NL<sup>dic</sup>, is a list of systematic lexical mappings between UW's and natural language entries
These three databases are nested. The UNL<sup>dic</sup> contains UW's and their basic semantic features (such as the information that the UW corresponding to "table" is a nominal concrete concept that belongs to the class of artifacts). The UNL<sup>kb</sup> contains the UNL<sup>dic</sup> and the set of relations that are '''necessary''' to define a UW (such as the information that "table" is a piece of furniture with vertical legs and a flat horizontal surface). The UNL<sup>eb</sup> contains the UNL<sup>kb</sup> and the set of relations that are '''often''' found between UW's (such as the information that tables are normally round or square, that they are made of hard materials, etc.). In general, the difference between the UNL<sup>kb</sup> and the UNL<sup>eb</sup> is that the former is dictionary-based (i.e., it tries to represent the information that is normally given in dictionary definitions) whereas the latter is corpus-based (i.e., it tries to represent the concept as it appears in the corpus).
+
The UNL Dictionary and the NL Dictionary are monolingual databases, whose entries are interlinked by the UNL-NL Dictionary.  
  
== NL ==  
+
== UNL Knowledge Base (UNL<sup>KB</sup>) ==
*The [[NL Dictionary]], or simply NL<sup>dic</sup>, is a list of natural language entries with the corresponding morphological and syntactic features (such as part of speech, gender, number, case, subcategorization frame, etc.).
+
The [[UNL Knowledge Base]], or UNL<sup>KB</sup>, is a semantic network with relations that are '''necessary''' to define UW's.
 +
Unlike the UNL Dictionary, which provides only very general semantic features (such as lexical category, semantic class, abstractness, cardinality, etc.), the UNL<sup>KB</sup> contains relations between UW's.
  
== UNL-NL ==
+
== Example Bases ==
*The [[UNL-NL Dictionary]], or simply UNL-NL<sup>dic</sup>, is a list of systematic lexical mappings between UW's and natural language entries
+
In the UNL System, there are two different types of example bases:
 +
*The [[UNL Example Base]], or simply UNL<sup>eb</sup>, is a network with any relations between UW's  
 
*The [[UNL-NL Memory]], or UNL Memory Base, or simply UNL-NL<sup>MB</sup>, is a list of mappings between UNL and a given natural language
 
The main difference between the UNL-NL<sup>dic</sup> and the UNL-NL<sup>MB</sup> is that the former involves only lexical units (i.e., entries defined as such in the UNL and the NL dictionaries) whereas the latter involves translation units, which may include several lexical units.
 

Revision as of 18:34, 17 September 2012

The UNL System contains three different types of lexical databases: dictionaries, knowledge bases and example bases.

Dictionaries

In the UNL System, a dictionary is a flat list of entries with their corresponding features. The dictionaries must comply with the structure defined in the Dictionary Specs and must contain only tags defined in the Tagset. They are normally provided through the UNLarium, and are divided into three categories:

  • The UNL Dictionary, or simply UNLdic, is a list of UW's and their semantic (language-independent) features
  • The NL Dictionary, or simply NLdic, is a list of natural language entries with the corresponding morphological and syntactic (language-dependent) features
  • The UNL-NL Dictionary, or simply UNL-NLdic, is a list of systematic lexical mappings between UW's and natural language entries

The UNL Dictionary and the NL Dictionary are monolingual databases, whose entries are interlinked by the UNL-NL Dictionary.
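
The relationship between the three dictionaries can be pictured with a small Python sketch. It is only an illustration, not the format defined in the Dictionary Specs: the class names, the UW string `table(icl>furniture)`, and the feature labels are all placeholders invented here and are not taken from the Tagset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One flat dictionary entry: a key (a UW or an NL lemma) and its features."""
    key: str
    features: frozenset


@dataclass(frozen=True)
class Mapping:
    """One UNL-NL link between a UW and a natural language lemma."""
    uw: str
    lemma: str


# UNL Dictionary: UWs with semantic (language-independent) features.
unl_dic = [Entry("table(icl>furniture)", frozenset({"nominal", "concrete", "artifact"}))]

# NL Dictionary: lemmas with morphological and syntactic (language-dependent) features.
nl_dic = [
    Entry("table", frozenset({"noun", "singular"})),
    Entry("desk", frozenset({"noun", "singular"})),
]

# UNL-NL Dictionary: the links between the two monolingual lists.
unl_nl_dic = [Mapping("table(icl>furniture)", "table")]


def lemmas_for(uw, mappings, nl_entries):
    """Return the lemmas linked to a UW, keeping only lemmas present in the NL list."""
    known = {entry.key for entry in nl_entries}
    return [m.lemma for m in mappings if m.uw == uw and m.lemma in known]
```

With these placeholder entries, `lemmas_for("table(icl>furniture)", unl_nl_dic, nl_dic)` returns `["table"]`: the two monolingual lists never refer to each other directly, and only the mapping list connects them.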

UNL Knowledge Base (UNLKB)

The UNL Knowledge Base, or UNLKB, is a semantic network with relations that are necessary to define UW's. Unlike the UNL Dictionary, which provides only very general semantic features (such as lexical category, semantic class, abstractness, cardinality, etc.), the UNLKB contains relations between UW's.
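
To contrast a semantic network with a flat feature list, the following Python sketch stores labelled relations between UWs as edges. It is a simplified illustration under stated assumptions: the `KnowledgeBase` class is hypothetical, UWs are reduced to bare headwords, and the relation labels (`icl`, `pof`) are used only as examples.

```python
from collections import defaultdict


class KnowledgeBase:
    """A tiny semantic network: labelled, directed relations between UWs."""

    def __init__(self):
        # Maps a source UW to the set of (relation, target UW) pairs leaving it.
        self._outgoing = defaultdict(set)

    def add(self, relation, source, target):
        """Record one relation from source to target."""
        self._outgoing[source].add((relation, target))

    def definition(self, uw):
        """Return the relations leaving a UW, sorted for stable output."""
        return sorted(self._outgoing.get(uw, ()))


kb = KnowledgeBase()
kb.add("icl", "table", "furniture")  # illustrative: a table is a kind of furniture
kb.add("pof", "leg", "table")        # illustrative: a leg is part of a table
```

Here `kb.definition("table")` returns `[("icl", "furniture")]`, which is information about how UWs relate to each other rather than a feature attached to a single entry.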

Example Bases

In the UNL System, there are two different types of example bases:

  • The UNL Example Base, or simply UNLeb, is a network with any relations between UW's
  • The UNL-NL Memory, or UNL Memory Base, or simply UNL-NLMB, is a list of mappings between UNL and a given natural language
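
A memory entry differs from a dictionary entry in that it may pair a whole UNL fragment with a natural language string. The Python sketch below assumes, purely for illustration, that a fragment is a set of (relation, source, target) triples over bare headwords; the `mod` relation, the sample entry, and the `recall` helper are hypothetical.

```python
# UNL-NL Memory sketch: each key is a UNL fragment (a frozenset of triples),
# each value is the natural language string stored for that fragment.
memory = {
    frozenset({("mod", "table", "round")}): "round table",
}


def recall(fragment, stored):
    """Return the stored string for an exactly matching fragment, else None."""
    return stored.get(frozenset(fragment))
```

Calling `recall([("mod", "table", "round")], memory)` returns `"round table"`, while a fragment with no stored counterpart returns `None`. A real memory base would presumably need fuzzier matching than this exact lookup.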