UNL-NL Memory

From UNL Wiki

Revision as of 11:46, 8 December 2010 by Martins (Talk | contribs)


The UNLization Memory, or simply UM, is a set of mappings between a given natural language and UNL. It is claimed to improve the results of the UNLization process by supplying the extralinguistic information normally required to resolve ambiguity, anaphora and co-reference in natural language analysis and generation. The UNL KB should be provided as an XML table whose schema is presented below.

The UNL UM may be provided in two different formats:

Extended, in TMX; or
Simplified, as a set of network disambiguation rules

Extended format

UNL UM entries in extended format must comply with the [Translation Memory eXchange Specs], as follows:

   <tu>
       <tuv xml:lang="en"><seg>a good deal</seg></tuv>
       <tuv xml:lang="unl"><seg>400059171</seg></tuv>
   </tu>

Where:
<tu> is the beginning of the translation unit
</tu> is the end of the translation unit
<tuv> is the beginning of the translation unit variant
</tuv> is the end of the translation unit variant
<seg> is the beginning of the translation segment
</seg> is the end of the translation segment
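
As an illustration of how such an entry might be consumed, the following is a minimal sketch in Python that reads a single translation unit and returns its segments keyed by language. It uses only the standard library; the names SAMPLE_TU and read_translation_unit are hypothetical and not part of any UNL tool, and a complete TMX file would also contain header and body elements that are not handled here.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predefined "xml:" prefix to this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample: the translation unit shown above, with closed <tuv> tags.
SAMPLE_TU = """
<tu>
    <tuv xml:lang="en"><seg>a good deal</seg></tuv>
    <tuv xml:lang="unl"><seg>400059171</seg></tuv>
</tu>
"""


def read_translation_unit(tu_xml):
    """Return a dict mapping each variant's language code to its segment text."""
    tu = ET.fromstring(tu_xml.strip())
    variants = {}
    for tuv in tu.findall("tuv"):
        lang = tuv.get(XML_LANG)
        seg = tuv.find("seg")
        # A variant without a <seg> child is recorded with no text.
        variants[lang] = seg.text if seg is not None else None
    return variants


entry = read_translation_unit(SAMPLE_TU)
print(entry)
```

Keying the result by the xml:lang value keeps the natural language segment and its UNL counterpart together, which is the pairing a translation unit expresses.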

Simplified format

UNL UM entries in simplified format must have the following structure:

equ(SOURCE;TARGET)=DC;

Where:
equ is the UNL relation for "equivalent";
SOURCE is the source segment;
TARGET is the target segment;
DC is the degree of certainty (i.e., the likelihood of the relation between the SOURCE and the TARGET).
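
A rule in this form can be split into its three parts with a short parser. The sketch below is only an illustration under stated assumptions: it assumes that the SOURCE segment contains no semicolon and that the TARGET segment contains no closing parenthesis, and it keeps DC as a string because the value scale is not specified here. The function name parse_rule and the DC value 255 in the sample are hypothetical.

```python
import re

# Pattern for equ(SOURCE;TARGET)=DC;
# Assumption: SOURCE has no ";" and TARGET has no ")".
RULE = re.compile(
    r"^equ\((?P<source>[^;]*);(?P<target>[^)]*)\)=(?P<dc>[^;]+);$"
)


def parse_rule(line):
    """Split one simplified UM rule into (source, target, dc) strings."""
    match = RULE.match(line.strip())
    if match is None:
        raise ValueError("not a simplified UM rule: %r" % line)
    return (
        match.group("source").strip(),
        match.group("target").strip(),
        match.group("dc").strip(),  # left as text; the DC scale is not specified
    )


# Hypothetical sample rule; the DC value 255 is invented for illustration.
parsed = parse_rule("equ(a good deal;400059171)=255;")
print(parsed)
```

Lines that do not match the pattern raise a ValueError instead of being skipped silently, so malformed entries are noticed when a rule set is loaded.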
