IX UNL School
From UNL Wiki
Revision as of 19:33, 29 May 2012
Goals
- To build the basic modules of a NL-UNL (analysis) grammar
- To build the basic modules of a UNL-NL (generation) grammar
Methodology
- Corpus
  - Translate the 50 sentences of Corpus50_eng.txt into your native language. Keep the translation as close as possible to the original.
  - Save the translated text (without the English original) in a plain text (.txt) file with UTF-8 encoding and upload it to UNLWEB>UNLDEV>PROJECTS>IAN>NL FILES.
  - Upload the file Corpus50_unl.txt to UNLWEB>UNLDEV>PROJECTS>EUGENE>UNL DOCUMENTS.
- NL-UNL Dictionary (Analysis)
  - Extract the word list (i.e., the set of all distinct word forms) appearing in your translation of the Corpus 50.
  - Create the NL-UNL dictionary for all the word forms, following the English model available at English Analysis Dictionary 50. Use only the tags available in the tagset. For further information on the dictionary structure, see Dictionary Specs.
  - Save the NL-UNL dictionary in a plain text (.txt) file with UTF-8 encoding and upload it to UNLWEB>UNLDEV>PROJECTS>IAN>DICTIONARIES.
- UNL-NL Dictionary (Generation)
  - Localize the UNL-NL dictionary available at English Generation Dictionary 50. The localized version must reflect the word list of your translated corpus. Use only the tags available in the tagset. For further information on the dictionary structure, see Dictionary Specs.
  - Save the UNL-NL dictionary in a plain text (.txt) file with UTF-8 encoding and upload it to UNLWEB>UNLDEV>PROJECTS>EUGENE>DICTIONARIES.
- Morphology
  - Export the inflectional grammar of your language from UNLARIUM>GRAMMAR>INFLECTIONAL PARADIGMS. If the grammar of your language is not available yet, create the inflectional paradigms for the inflectional entries appearing in the UNL-NL dictionary, following the model available at English Inflectional Grammar. For further information, see Inflectional paradigms.
  - Save the inflectional grammar in a plain text (.txt) file with UTF-8 encoding and upload it to UNLWEB>UNLDEV>PROJECTS>EUGENE>RULES.
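
The word-list extraction step in the methodology can be sketched in Python as follows. This is only an illustration, not a tool provided by the workshop: the function names and file names are hypothetical, and the tokenization (split on whitespace, then strip punctuation and symbols from the edges of each token) is an assumption that may need adjusting for a given language. It deliberately avoids a `\w+` regular expression, because in Python that pattern splits words at combining vowel signs in many Indic scripts.

```python
import unicodedata
from pathlib import Path


def is_edge_noise(ch: str) -> bool:
    """True for Unicode punctuation (P*) and symbol (S*) characters."""
    return unicodedata.category(ch)[0] in ("P", "S")


def strip_edges(token: str) -> str:
    """Remove punctuation/symbols from both ends of a token, keep the inside."""
    start, end = 0, len(token)
    while start < end and is_edge_noise(token[start]):
        start += 1
    while end > start and is_edge_noise(token[end - 1]):
        end -= 1
    return token[start:end]


def extract_word_forms(text: str) -> list[str]:
    """Return the sorted set of distinct word forms in the text.

    No case folding is applied, so "The" and "the" count as two forms.
    """
    forms = {strip_edges(tok) for tok in text.split()}
    forms.discard("")  # tokens that were pure punctuation
    return sorted(forms)


def write_word_list(corpus_path: str, out_path: str) -> None:
    """Read a UTF-8 corpus file and write one word form per line, in UTF-8."""
    text = Path(corpus_path).read_text(encoding="utf-8")
    forms = extract_word_forms(text)
    Path(out_path).write_text("\n".join(forms) + "\n", encoding="utf-8")
```

A call such as `write_word_list("corpus50_translated.txt", "wordlist.txt")` (both file names are placeholders) would produce a starting list for the NL-UNL dictionary. Passing `encoding="utf-8"` explicitly on both read and write matters because the platform default encoding is not always UTF-8.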
Corpus
- English
  - Corpus 50: Training corpus in English (50 sentences), to be manually translated into the target languages and used as the input for IAN
  - Corpus 500: Training corpus in English (500 sentences), to be manually translated into the target languages and used as the input for IAN (to be provided after the workshop)
- UNL
  - Corpus 50: Training corpus in UNL (50 sentences), to be used as the input for EUGENE
  - Corpus 500: Training corpus in UNL (500 sentences), to be used as the input for EUGENE (to be provided after the workshop)
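
Before uploading a translated corpus, a quick sanity check can catch encoding and alignment problems. The sketch below is hypothetical and rests on one assumption not stated on this page: that the corpus files hold one sentence per line, so a translation of the Corpus 50 should contain exactly 50 non-empty lines.

```python
from pathlib import Path

EXPECTED_SENTENCES = 50  # Corpus 50; assumes one sentence per line


def check_corpus_file(path: str, expected: int = EXPECTED_SENTENCES) -> list[str]:
    """Return a list of problems found in the file (empty list means OK)."""
    raw = Path(path).read_bytes()
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        # Without a valid decoding, the line count cannot be checked.
        return [f"not valid UTF-8: {exc}"]

    problems = []
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) != expected:
        problems.append(
            f"expected {expected} non-empty lines, found {len(lines)}"
        )
    return problems
```

For the larger corpus, the same function could be called with `expected=500`.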
Dictionary
- Analysis
  - Corpus 50: Sample of the English analysis dictionary for the entries appearing in the Corpus 50
- Generation
  - Corpus 50 (http://www.unlweb.net/resources/mumbai2012/eng50_dic_gen.txt): Sample of the English generation dictionary for the entries appearing in the Corpus 50
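
Since the dictionaries must cover the word list of the translated corpus, a coverage check can be useful while building them. The sketch below is hypothetical and assumes that each dictionary entry sits on its own line and begins with the natural-language headword in square brackets; the Dictionary Specs remain the authoritative description of the entry format, and the pattern should be adapted if it differs.

```python
import re

# Assumption: an entry line starts with the headword in square brackets.
HEADWORD = re.compile(r"^\s*\[([^\]]+)\]")


def dictionary_headwords(dictionary_text: str) -> set[str]:
    """Collect the bracketed headwords found at the start of each line."""
    heads = set()
    for line in dictionary_text.splitlines():
        match = HEADWORD.match(line)
        if match:
            heads.add(match.group(1))
    return heads


def missing_entries(word_forms: list[str], dictionary_text: str) -> list[str]:
    """Return the word forms that have no headword in the dictionary."""
    heads = dictionary_headwords(dictionary_text)
    return sorted(w for w in set(word_forms) if w not in heads)
```

An empty result from `missing_entries` would suggest that every word form in the list has at least one entry; it says nothing about whether the entries themselves are correct.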
Participants
- Aadil Kak (Kashmiri)
- Arulmozi Selvaraj (Tamil)
- Balaji Jagan (Tamil)
- Laishram Rishikanta Meitei (Manipuri)
- Navanath Saharia (Assamese)
- Niladri Sekhar Dash (Bengali)
- Parameswarappa S (Kannada)
- Parteek Kumar (Punjabi)
- Pinkey Nainwani (Sindhi)
- Ranjan Das (Oriya)
- Renuka Devi (Telugu)
- Sachin Pawar (Marathi)
- Shailendra Kumar (Hindi)
- Trupti Nisar (Gujarati)
Schedule
- Jun 11th, 2012 - Monday
  - 09:00-10:00 Introduction
  - 10:00-12:00 I – Corpus
  - 14:00-17:00 II – UNL-NL dictionary
- Jun 12th, 2012 - Tuesday
  - 09:00-12:00 III – Morphology (inflectional paradigms)
  - 14:00-17:00 IV – NL dictionary
- Jun 13th, 2012 - Wednesday
  - 09:00-12:00 V – UNL-NL grammar (I)
  - 14:00-17:00 V – UNL-NL grammar (II)
- Jun 14th, 2012 - Thursday
  - 09:00-12:00 VI – NL-UNL grammar (I)
  - 14:00-17:00 VI – NL-UNL grammar (II)
- Jun 15th, 2012 - Friday
  - 09:00-12:00 Evaluation
  - 14:00-17:00 Discussion