RC-A1: Difference between revisions

From UNLwiki
imported>Martins
No edit summary
imported>Henokmeskele2007@gmail.com
No edit summary
 
(116 intermediate revisions by one other user not shown)
Line 1: Line 1:
The Corpus<sup>500</sup> is an experimental corpus used to prepare the initial versions of the grammar for sentence-based [[UNLization]] and [[NLization]], using [[IAN]] and [[EUGENE]], respectively. It comprises a list of 500 sentences in English and their corresponding graphs in UNL, and is supposed to cover very basic linguistic phenomena.  
The UC-A1 is an experimental corpus used to prepare the initial versions of the grammar for sentence-based [[UNLization]] and [[NLization]], using [[IAN]] and [[EUGENE]], respectively. It comprises a list of 50 structures in UNL, and is supposed to cover very basic linguistic phenomena.  


== The corpus<sup>500</sup> ==  
== The corpus ==
 
The corpus UCA1 was extracted from a simplified and translated version of "The Hare and the Tortoise", by Aesop.
*The whole corpus in one single file
*[http://www.unlweb.net.br/resources/UCA1/uca1_eng.txt UC-A1 in English], to be translated (manually) in order to be used as the input for the UNLization process (with [[IAN]])
**[http://www.unlweb.net.br/resources/geneva2012/corpus_eng.txt Corpus 500] Experimental corpus in English (500 sentences), to be manually translated to the target languages, in order to be used as the input for IAN
*[http://www.unlweb.net.br/resources/UCA1/uca1_unl.txt UC-A1 in UNL], to be used "as is", as the input for the NLization process (with [[EUGENE]])
*UNL
**[http://www.unlweb.net.br/resources/geneva2012/corpus_unl.txt Corpus 500], Experimental corpus in UNL (500 sentences), to be used as the input for EUGENE
*Corpus 500 according to the complexity of the graphs (the same as above, but split in different files)
{| border="1" cellpadding="2" align=center
|+Corpus
!Order
!Description
!Analysis (English original)
!Generation (UNL)
|-
|0
|Training Corpus (Corpus 50)
|[http://www.unlweb.net.br/resources/mumbai2012/corpus50_eng.txt Corpus 50]
|[http://www.unlweb.net.br/resources/mumbai2012/corpus50_unl.txt Corpus 50]
|-
|1
|Temporary entries
|[http://www.unlweb.net.br/resources/geneva2012/temp_org.txt temp_org.txt]
|[http://www.unlweb.net.br/resources/geneva2012/temp_unl.txt temp_unl.txt]
|-
|2
|Entries with no attribute or relation
|[http://www.unlweb.net.br/resources/geneva2012/attribute0_org.txt attribute0_org.txt]
|[http://www.unlweb.net.br/resources/geneva2012/attribute0_unl.txt attribute0_unl.txt]
|-
|3
|one-attribute entries
|[http://www.unlweb.net.br/resources/geneva2012/attribute1_org.txt attribute1_org.txt]
|[http://www.unlweb.net.br/resources/geneva2012/attribute1_unl.txt attribute1_unl.txt]
|-
|4
|two-attribute entries
|[http://www.unlweb.net.br/resources/geneva2012/attribute2_org.txt attribute2_org.txt]
|[http://www.unlweb.net.br/resources/geneva2012/attribute2_unl.txt attribute2_unl.txt]
|-
|5
|three-attribute entries
|[http://www.unlweb.net.br/resources/geneva2012/attribute3_org.txt attribute3_org.txt]
|[http://www.unlweb.net.br/resources/geneva2012/attribute3_unl.txt attribute3_unl.txt]
|-
|6
|one-relation entries
|[http://www.unlweb.net.br/resources/geneva2012/relation1_org.txt relation1_org.txt]
|[http://www.unlweb.net.br/resources/geneva2012/relation1_unl.txt relation1_unl.txt]
|-
|7
|two-relation entries
|[http://www.unlweb.net.br/resources/geneva2012/relation2_org.txt relation2_org.txt]
|[http://www.unlweb.net.br/resources/geneva2012/relation2_unl.txt relation2_unl.txt]
|-
|8
|three-relation entries
|[http://www.unlweb.net.br/resources/geneva2012/relation3_org.txt relation3_org.txt]
|[http://www.unlweb.net.br/resources/geneva2012/relation3_unl.txt relation3_unl.txt]
|-
|9
|four-relation entries
|[http://www.unlweb.net.br/resources/geneva2012/relation4_org.txt relation4_org.txt]
|[http://www.unlweb.net.br/resources/geneva2012/relation4_unl.txt relation4_unl.txt]
|-
|10
|five-relation entries
|[http://www.unlweb.net.br/resources/geneva2012/relation5_org.txt relation5_org.txt]
|[http://www.unlweb.net.br/resources/geneva2012/relation5_unl.txt relation5_unl.txt]
|-
|11
|six-relation entries
|[http://www.unlweb.net.br/resources/geneva2012/relation6_org.txt relation6_org.txt]
|[http://www.unlweb.net.br/resources/geneva2012/relation6_unl.txt relation6_unl.txt]
|-
|12
|numbers and numerals
|[http://www.unlweb.net.br/resources/geneva2012/numbers.txt numbers_org.txt]
|[http://www.unlweb.net.br/resources/geneva2012/numbers.txt numbers_unl.txt]
|-
|13
|expressions of time
|[http://www.unlweb.net.br/resources/geneva2012/time.txt time_org.txt]
|[http://www.unlweb.net.br/resources/geneva2012/time.txt time_unl.txt]
|-
|14
|relative clauses
|[http://www.unlweb.net.br/resources/geneva2012/relatives.txt relatives_org.txt]
|[http://www.unlweb.net.br/resources/geneva2012/relatives.txt relatives_unl.txt]
|-
|15
|special issues
|[http://www.unlweb.net.br/resources/geneva2012/problems.txt problems_org.txt]
|[http://www.unlweb.net.br/resources/geneva2012/problems.txt problems_unl.txt]
|}

Latest revision as of 11:56, 2 June 2018

The UC-A1 is an experimental corpus used to prepare the initial versions of the grammar for sentence-based UNLization and NLization, using IAN and EUGENE, respectively. It comprises a list of 50 structures in UNL and is intended to cover very basic linguistic phenomena.

The corpus

The UC-A1 corpus was extracted from a simplified and translated version of Aesop's "The Hare and the Tortoise".

  • UC-A1 in English, to be manually translated into the target language and then used as the input for the UNLization process (with IAN)
  • UC-A1 in UNL, to be used "as is" as the input for the NLization process (with EUGENE)