UNL-NL Memory

From UNL Wiki

Revision as of 17:57, 17 September 2012

The UNLization Memory, or UNL-NL Memory Base, or simply UNL-NLMB, is a set of mappings between a given natural language and UNL. It improves and normalizes the results of the UNLization process, as it contains segments that have been previously UNLized.

The UNL-NLMB may be provided in two different formats:

* Extended, in TMX; or
* Simplified, as a set of network disambiguation rules

Extended format

UNL-NLMB entries in extended format must comply with the Translation Memory eXchange Specs, as follows:

   <tu>
       <tuv xml:lang="en"><seg>a good deal</seg></tuv>
       <tuv xml:lang="unl"><seg>400059171</seg></tuv>
   </tu>
    

Where:
<tu> is the beginning of the translation unit
</tu> is the end of the translation unit
<tuv> is the beginning of the translation unit variant
</tuv> is the end of the translation unit variant
<seg> is the beginning of the translation segment
</seg> is the end of the translation segment
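
As an illustration of how such an entry could be read, the following is a minimal Python sketch that parses a single translation unit with the standard library and returns its segments keyed by language. The function name read_tu and the constant names are hypothetical, and the sketch assumes each variant carries an xml:lang attribute and a single seg child, as in the example above; a complete TMX file also has header and body elements that are not handled here.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the predefined xml: prefix under this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# One translation unit, shaped like the example entry above.
SAMPLE_TU = """
<tu>
    <tuv xml:lang="en"><seg>a good deal</seg></tuv>
    <tuv xml:lang="unl"><seg>400059171</seg></tuv>
</tu>
"""


def read_tu(tu_xml):
    """Return a {language: segment text} dict for one <tu> element (hypothetical helper)."""
    tu = ET.fromstring(tu_xml)
    pairs = {}
    for tuv in tu.findall("tuv"):
        seg = tuv.find("seg")
        if seg is not None:
            # Key each segment by the language declared on its variant.
            pairs[tuv.get(XML_LANG)] = (seg.text or "").strip()
    return pairs


entry = read_tu(SAMPLE_TU)
print(entry["en"], "->", entry["unl"])
```

Keying the result by language keeps the natural language segment and its UNL counterpart together, which mirrors the pairing that a translation unit expresses.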

Simplified format

UNL-NLMB entries in simplified format must be represented as a set of network disambiguation rules, as follows:

equ(SOURCE;TARGET)=DC;

Where:
equ is the UNL relation for "equivalent";
SOURCE is the source segment;
TARGET is the target segment;
DC is the degree of certainty (i.e., the likelihood of the relation between the SOURCE and the TARGET).
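
The rule shape above can be sketched in Python as a small parser and formatter. This is only an illustration: the names RULE_PATTERN, parse_rule and format_rule are hypothetical, the sketch assumes that neither segment contains a semicolon, and it treats DC as an opaque string because the permitted values are not stated here.

```python
import re

# Matches the generic shape equ(SOURCE;TARGET)=DC; shown above.
# Assumption: SOURCE and TARGET contain no semicolons.
RULE_PATTERN = re.compile(
    r"^equ\((?P<source>[^;]*);(?P<target>[^;]*)\)=(?P<dc>[^;]+);$"
)


def parse_rule(line):
    """Split one simplified-format rule into (source, target, dc)."""
    match = RULE_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"not a simplified-format rule: {line!r}")
    return match.group("source"), match.group("target"), match.group("dc")


def format_rule(source, target, dc):
    """Build a rule string from its three parts, without validating them."""
    return f"equ({source};{target})={dc};"


source, target, dc = parse_rule("equ(SOURCE;TARGET)=DC;")
print(format_rule(source, target, dc))
```

Keeping DC as text avoids guessing at a numeric range; a caller that knows the valid values can convert and validate it separately.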

Software