CORE

From UNL Wiki
Revision as of 14:35, 16 November 2012

The UNL Core Dictionary is expected to contain UWs that are lexicalized in more than two language groups.

Goal

The main goal of the UNL Core Dictionary Project is to choose and classify UWs that are lexicalized in several language groups.

Methodology

The first candidate entries for the UNL Core Dictionary have been extracted from the intersection of WordNet 3.0 and the following English lexical databases

These entries are expected to be validated and analyzed according to the following categories:

  • Degree of Universality, from 1 to 4, according to the number of language groups in which the entry is lexicalized
  • Headword, which is the citation form of the UW in English
  • Semantic root, which is the UW that defines the set to which the UW belongs
  • Hypernym, which is the UW that defines the class of which the UW is a type
  • Definition, which is the definition of the UW, in English
  • Examples, which are examples of the UW, in English
  • Abstractness, which refers to whether the UW is concrete or abstract
  • Alienability, which refers to whether the UW is alienable or not
  • Animacy, which refers to whether the UW is animate or inanimate
  • Cardinality, which refers to whether the UW is countable or not
  • Gender, which refers to the semantic gender of the UW, if any
  • Polarity, which refers to whether the UW conveys a positive or a negative concept
  • Semantic class, which refers to the semantic class of the UW
  • Semantic frame, which refers to the semantic frame of the UW
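
The categories above can be pictured as the fields of a single record per UW. The following is only a minimal sketch: the class name `CoreEntry`, the field names and the `missing_fields` helper are hypothetical and do not reflect the actual UNLWEB data model, which is not described on this page.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class CoreEntry:
    """Hypothetical record for one UW under review (names are assumptions)."""
    headword: str                                  # citation form in English
    degree_of_universality: Optional[int] = None   # 1 to 4
    semantic_root: Optional[str] = None
    hypernym: Optional[str] = None
    definition: Optional[str] = None
    examples: List[str] = field(default_factory=list)
    abstractness: Optional[str] = None             # "concrete" / "abstract"
    alienability: Optional[str] = None
    animacy: Optional[str] = None                  # "animate" / "inanimate"
    cardinality: Optional[str] = None
    gender: Optional[str] = None
    polarity: Optional[str] = None
    semantic_class: Optional[str] = None
    semantic_frame: Optional[str] = None

    def missing_fields(self) -> List[str]:
        """Return the names of the categories that are still empty."""
        empty = []
        for f in fields(self):
            value = getattr(self, f.name)
            if value is None or value == "" or value == []:
                empty.append(f.name)
        return empty


entry = CoreEntry(headword="cow", animacy="animate", gender="feminine")
print(entry.missing_fields())
```

An empty field is not necessarily an error: some categories apply only to nominal UWs, and others are meant to be left empty in case of doubt, so a helper like this would only serve as a review aid.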

Instructions

  1. Join the project CORE at UNLWEB>PROJECT
  2. Create a VERIFICATION assignment at UNLWEB>ASSIGNMENT for the language UNL and project UNL CORE Dictionary
  3. Verify the corresponding entries according to the instructions below:
    1. Assign, to each UW, a "degree of universality", according to the lexicalization in different language groups (consider "English", "French" and your native language as different language groups: if the concept is lexicalized in all three languages, the UW must receive the value 1; if it is lexicalized only in two, it must receive the value 2; if it is lexicalized only in English, assign the value 3; if the UW is not lexicalized even in English, assign the value 4).
    2. Choose only one English word as the headword (the most frequent one). Normally, the headword field contains all the English words belonging to the same synset; keep only one of them as the headword for the UW.
    3. Choose a semantic root for the UW. The semantic root is a classeme, i.e., an existing UW that defines a general class to which the UW belongs.
    4. Choose a hypernym for the UW. The hypernym is an existing UW that defines a general class of which the UW is a type.
    5. Definition. Provide the definition in case this field is empty.
    6. Examples. Provide examples, in English, in case this field is empty.
    7. Abstractness. This field must be used only in case of nominal UWs. A UW is considered "concrete" if it conveys a concept that can be perceived by touch, and "abstract" otherwise. In case of doubt, leave this field empty.
    8. Animacy. This field must be used only in case of nominal UWs. A UW is considered "animate" if it conveys a concept that may work as an agent, and "inanimate" otherwise. In case of doubt, leave this field empty.
    9. Cardinality. This field must be used only in case of nominal UWs. It is associated with the "semantic number" of the UW, which is language-independent, and may not be the same as the "grammatical number", which is language-dependent. For instance: "glasses", in English, is plural, but its semantic number is singular, because it corresponds to a single object.
    10. Gender. This field must be used only in case of nominal UWs that refer to animate beings. Note that it corresponds to the "semantic gender", which is language-independent, and not to the "grammatical gender". The UW corresponding to "cow", for instance, is feminine; the one corresponding to "bull" is masculine. Many UWs do not have any semantic gender ("cattle", for instance). In this case, leave this field empty.
    11. Polarity. This field refers to the sentiment conveyed by the word. There are UWs that convey positive senses ("win", "good", "birth") and UWs that convey negative senses ("lose", "bad", "death"). Many UWs do not convey any sentiment at all ("computer", for instance).
    12. Semantic class. This field has already been filled in from WordNet. Change it only if it is empty.
    13. Semantic frame. Associate a semantic frame with each UW. The semantic frame is the semantic valency of a UW: it indicates which relations the UW "necessarily" assigns (i.e., the necessary specifiers or complements of the UW). Consider only the necessary arguments, not optional adjuncts. For instance, the UW corresponding to the concept "to die" necessarily assigns one relation: exp (semantic frame = exp();); the UW corresponding to "to kill" necessarily assigns two relations: agt and obj (semantic frame = agt()obj();); the UW "to give" necessarily assigns three relations: agt, obj and adr (semantic frame = agt()obj()adr();). Note that this applies not only to verbal UWs but to any UW: there are nominal UWs (such as "construction") and adjectival UWs (such as "necessary") that also assign necessary semantic relations. For the relations, use the extended set defined at relations. Some examples have been provided at UNLARIUM>GRAMMAR>UNL>SEMANTIC FRAME.
    14. Source language. For the time being, this is English, by default, because all the entries have been extracted from English lexical databases.
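
Two of the instructions above are mechanical enough to be sketched in code: the degree of universality (step 1) and the semantic frame notation (step 13). The sketch below is illustrative only. The function names are hypothetical, relation names are assumed to be lowercase letters, and a concept that is not lexicalized in English is assumed to receive the value 4 regardless of the other two languages, which is one reading of step 1. The instructions do not say how a UW with no necessary relations should be written, so that case is rejected rather than guessed.

```python
import re
from typing import List

# Assumed shape of a frame: one or more "rel()" groups followed by ";".
_FRAME_RE = re.compile(r"^(?:[a-z]+\(\))+;$")


def degree_of_universality(english: bool, french: bool, native: bool) -> int:
    """Map lexicalization in the three language groups to a value from 1 to 4."""
    if not english:
        return 4  # assumption: missing English lexicalization takes precedence
    count = 1 + int(french) + int(native)
    return {3: 1, 2: 2, 1: 3}[count]


def build_frame(relations: List[str]) -> str:
    """Write the necessary relations in the frame notation, e.g. agt()obj();"""
    if not relations:
        raise ValueError("notation for an empty frame is not specified")
    return "".join(f"{rel}()" for rel in relations) + ";"


def parse_frame(frame: str) -> List[str]:
    """Return the relation names of a frame string, in order."""
    if not _FRAME_RE.match(frame):
        raise ValueError(f"not a well-formed semantic frame: {frame!r}")
    return re.findall(r"([a-z]+)\(\)", frame)


print(degree_of_universality(english=True, french=True, native=False))  # 2
print(build_frame(["agt", "obj", "adr"]))  # agt()obj()adr();
print(parse_frame("exp();"))               # ['exp']
```

A check like `parse_frame` could be used to catch typing errors in the semantic frame field, but it does not verify that the relation names belong to the extended set of relations.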

Observations

Think about concepts rather than about words
Note that we are dealing with concepts and not with English words. The same English word may belong to several different synsets. So, be careful when analyzing a UW: you should not think about the English word in general, but only about the instance of the English word that is captured by the UW.
Do not delete any UW
Although the delete button is available in the interface, do not use it.
Software