UNL-NL Memory

From UNL Wiki

Revision as of 18:30, 14 December 2010

The UNLization Memory, also called the UNL Memory Base or simply UNLMB, is a set of mappings between a given natural language and UNL. Because it contains segments that have already been UNLized, it improves and normalizes the results of the UNLization process.

The UNLMB may be provided in two different formats:

*Extended, in TMX; or
*Simplified, as a set of network disambiguation rules

Extended format

UNLMB entries in extended format must comply with the Translation Memory eXchange Specs, as follows:

   <tu>
       <tuv xml:lang="en"><seg>a good deal</seg></tuv>
       <tuv xml:lang="unl"><seg>400059171</seg></tuv>
   </tu>
    

Where:
<tu> is the beginning of the translation unit
</tu> is the end of the translation unit
<tuv> is the beginning of the translation unit variant
</tuv> is the end of the translation unit variant
<seg> is the beginning of the translation segment
</seg> is the end of the translation segment
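
As an illustration of how such an entry can be read, the following is a minimal sketch in Python using the standard xml.etree.ElementTree module. The function name read_translation_unit and the constant SAMPLE_TU are hypothetical and chosen only for this example; the sample is the entry shown above, with each variant closed by </tuv>.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the predefined xml: prefix under this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# The entry from this section (hypothetical constant name).
SAMPLE_TU = """
<tu>
    <tuv xml:lang="en"><seg>a good deal</seg></tuv>
    <tuv xml:lang="unl"><seg>400059171</seg></tuv>
</tu>
"""


def read_translation_unit(tu_xml):
    """Return a dict mapping each variant's language code to its segment text."""
    tu = ET.fromstring(tu_xml.strip())
    variants = {}
    for tuv in tu.findall("tuv"):
        lang = tuv.get(XML_LANG)
        seg = tuv.find("seg")
        if lang is None or seg is None:
            # Skip variants that lack a language code or a segment.
            continue
        variants[lang] = seg.text or ""
    return variants


if __name__ == "__main__":
    print(read_translation_unit(SAMPLE_TU))
```

The sketch handles a single translation unit only; a complete TMX file wraps its units in additional header and body elements that are not described in this section.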

Simplified format

UNLMB entries in simplified format must be represented as a set of network disambiguation rules, as follows:

equ(SOURCE;TARGET)=DC;

Where:
equ is the UNL relation for "equivalent";
SOURCE is the source segment;
TARGET is the target segment;
DC is the degree of certainty (i.e., the likelihood of the relation between the SOURCE and the TARGET)
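
The template above can be filled in and read back mechanically. The following is a minimal sketch in Python; the helper names format_rule and parse_rule are hypothetical. This section does not say how SOURCE and TARGET are quoted or which scale DC uses, so the sketch inserts all three values verbatim and assumes, for parsing only, that none of them contains a semicolon.

```python
import re

# Matches the template equ(SOURCE;TARGET)=DC; as given in this section.
# Assumption: SOURCE, TARGET and DC contain no semicolon.
RULE_PATTERN = re.compile(
    r"^equ\((?P<source>[^;]*);(?P<target>[^;]*)\)=(?P<dc>[^;]+);$"
)


def format_rule(source, target, dc):
    """Fill the equ(SOURCE;TARGET)=DC; template with the given values, verbatim."""
    return f"equ({source};{target})={dc};"


def parse_rule(rule):
    """Split a rule back into its (source, target, dc) parts as strings."""
    match = RULE_PATTERN.match(rule.strip())
    if match is None:
        raise ValueError(f"not an equ rule: {rule!r}")
    return match.group("source"), match.group("target"), match.group("dc")


if __name__ == "__main__":
    rule = format_rule("SOURCE", "TARGET", "DC")
    print(rule)
    print(parse_rule(rule))
```

Keeping DC as a string avoids committing to a numeric range that this section does not define.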

Software