IAN

From UNL Wiki

Revision as of 00:50, 23 July 2012

IAN is a natural language analysis system. It represents natural language sentences as semantic networks in the UNL format. In its current release, it is a web application developed in Java and available at the UNLdev.

The name

IAN is an acronym for Interactive ANalysis system.

Requirements

As a universal engine, IAN must be parameterized to the source languages with the following files, to be provided through IAN's interface:

  • The input natural language document, i.e., the document to be UNL-ized
  • The NL-UNL (analysis) dictionary, i.e., a lexical database where UWs are mapped into natural language entries, along with the corresponding features, to be provided according to the UNL Dictionary Specs
  • The NL-UNL (analysis) transformation grammar, i.e., a set of transformation rules used to convert natural language sentences into UNL graphs, to be provided according to the UNL Grammar Specs
  • The NL-UNL (analysis) disambiguation grammar, i.e., a set of disambiguation rules used to improve the results of the tokenization and of the transformation

Functioning

IAN performs the following three operations on the input file, in order:

  • Segmentation, i.e., the division of the input document into a series of processing units (sentences), which are processed one at a time
  • Tokenization, i.e., the identification of the tokens (lexical items) of each sentence of the input document
  • Transformation, i.e., the application of the transformation rules of the grammar over each tokenized sentence in order to represent it as a UNL graph
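
The three steps listed above can be illustrated with a minimal sketch. This is not IAN's implementation (IAN is a Java web application driven by the dictionary and grammars described under Requirements); it is a toy Python pipeline under stated assumptions. The names TOY_DICTIONARY, segment, tokenize, transform and analyze are hypothetical, the dictionary layout does not follow the UNL Dictionary Specs, and the single hard-coded rule stands in for a real transformation grammar.

```python
import re

# Hypothetical toy dictionary: lowercase surface form -> (lexical item, category).
# This layout is an assumption for illustration, not the UNL Dictionary Specs format.
TOY_DICTIONARY = {
    "john": ("John", "N"),
    "mary": ("Mary", "N"),
    "sleeps": ("sleep", "V"),
    "runs": ("run", "V"),
}


def segment(document):
    """Segmentation: split the document into sentences at ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", document.strip())
    return [part for part in parts if part]


def tokenize(sentence, dictionary):
    """Tokenization: look each word up in the dictionary; unknown words get the category UNKNOWN."""
    tokens = []
    for word in re.findall(r"\w+", sentence):
        item, category = dictionary.get(word.lower(), (word, "UNKNOWN"))
        tokens.append((item, category))
    return tokens


def transform(tokens):
    """Transformation: apply one toy rule, a noun directly followed by a verb yields
    a relation triple (label, verb, noun). The label "agt" is used only as an example."""
    relations = []
    for (left_item, left_cat), (right_item, right_cat) in zip(tokens, tokens[1:]):
        if left_cat == "N" and right_cat == "V":
            relations.append(("agt", right_item, left_item))
    return relations


def analyze(document, dictionary=TOY_DICTIONARY):
    """Run the three steps in order, one sentence at a time."""
    return [transform(tokenize(sentence, dictionary)) for sentence in segment(document)]
```

Calling analyze("John sleeps. Mary runs.") processes each sentence separately and returns one list of relation triples per sentence, which mirrors the one-sentence-at-a-time processing described above.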

Software