II UNL Panel

From UNL Wiki
Revision as of 15:55, 13 February 2014 by Martins (Talk | contribs)

The II UNL Panel, an event associated with LREC 2014, will be devoted to the nature and role of relations and attributes in the UNL.

Goal

The main purpose of the UNL Panel is to collect the opinions of specialists, from inside and outside the UNL Community, on technical issues of the UNL, so as to prepare the ground for an in-depth revision of the current specifications.

Rationale

Originally proposed more than 15 years ago, the UNL Specs have not escaped the effects of time and have yet to incorporate several recent advances in natural language processing. Additionally, there have been calls for better standardization practices in the UNL framework, especially after the results of the large-scale development inside the UNLweb. To organize this discussion, the UNDL Foundation divided the subjects into three chapters, to be addressed in three different meetings:

  • Universal Words (the set, notation and properties of UWs), which were addressed at the I UNL Panel (COLING 2012), and whose results are available in MARTINS, R. (ed). (2013). Lexical Issues of UNL: Universal Networking Language 2012 Panel. Cambridge: Cambridge Scholars Publishing;
  • Relations and Attributes (the set, notation and properties of relations and attributes), which are the subject of this II UNL Panel; and
  • Document structure (format, encoding, schema and validation).

Issues

The questions below illustrate some theoretical and practical issues concerning relations and attributes that have received several different answers within the UNL framework. The main goal of the II UNL Panel is to discuss which answers would be most appropriate and feasible, considering the state of the art in the theory and technology of natural language processing. We would ask participants to use these questions as starting points for their presentations, but we would also expect them to suggest general procedures to be adopted in similar cases.

Considering the commitments, assumptions and properties defined at the Introduction to UNL, how would you represent, as a language-independent semantic graph, the following English sentences?

1. The book is on the table
  • a) place(book,table), i.e., as a general "place" relation between "book" and "table", without any reference to the idea that the book is "on" (and not "in", "above", "under" etc.) the table;
  • b) on(book,table), i.e., as a specific "on" relation between "book" and "table", without any reference to the idea that "on" is actually a possible value of "place";
  • c) rel1(book,on)rel2(on,table), i.e., as two different relations "rel1" and "rel2" (which ones?), as if there were no direct relation between "book" and "table";
  • d) place(book, table.@on), i.e., as a general "place" relation between "book" and "table", with an attribute "@on" assigned to "table" in order to specify its role;
  • e) place.@on(book, table), i.e., as a general "place" relation between "book" and "table", with an attribute "@on" assigned to the relation itself;
  • f) other
In order to answer this question, consider also the following:
  • The resulting semantic graph must be suitable for languages that do not lexicalize the copula, i.e., where "the book is on the table" is translated as "the book on the table";
  • The resulting semantic graph must be suitable for languages that do not lexicalize the difference between "on" and "above", i.e., where "the book is on the table" is translated in the same way as "the book is above the table";
  • The resulting semantic graph must be suitable for languages that do not lexicalize place relations, i.e., where "the book is on the table" is translated as "the book is tableon", where "on" is a locative case marker and not an adposition.
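To make the differences between options a) to e) easier to compare, the candidates can be written down as data. The following is only a minimal sketch in Python: the Node and Arc classes, the render function and the candidates table are hypothetical names introduced for illustration and are not part of the UNL Specs. It shows where the information "on" ends up in each option: nowhere (a), in the relation label (b), in a node of its own (c), in an attribute of a node (d), or in an attribute of the relation (e).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A concept node; attrs holds attributes such as "@on" (hypothetical model)."""
    label: str
    attrs: tuple = ()


@dataclass(frozen=True)
class Arc:
    """A directed binary relation; attrs holds attributes assigned to the relation itself."""
    relation: str
    source: Node
    target: Node
    attrs: tuple = ()


def render(arc):
    """Print one arc in the relation(source,target) notation used in the options above."""
    def show(label, attrs):
        return label + "".join("." + a for a in attrs)
    return "{}({},{})".format(
        show(arc.relation, arc.attrs),
        show(arc.source.label, arc.source.attrs),
        show(arc.target.label, arc.target.attrs),
    )


def render_graph(arcs):
    """Concatenate the rendering of every arc in a candidate graph."""
    return "".join(render(a) for a in arcs)


book, table, on = Node("book"), Node("table"), Node("on")

# One entry per option; "rel1" and "rel2" are left unspecified, as in option c).
candidates = {
    "a": [Arc("place", book, table)],
    "b": [Arc("on", book, table)],
    "c": [Arc("rel1", book, on), Arc("rel2", on, table)],
    "d": [Arc("place", book, Node("table", ("@on",)))],
    "e": [Arc("place", book, table, attrs=("@on",))],
}

for key, arcs in candidates.items():
    print(key, render_graph(arcs))
```

In this sketch only options d) and e) keep both the general "place" relation and the specific value "on" in a single arc, which is relevant to the three constraints listed above.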
2.



  • Your representation


In English, some prepositions (such as "between") may have valence equal to 3 ("John is between Mary and Peter"). How should we represent that?

  • In English, prepositions can be modified by adverbs (such as in "the book is right on the table"). How should these modifications be represented?
  • Several languages do not use adpositions to represent place relations.
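For the "between" case, one option among others is to reify the preposition, i.e., to treat it as a node and to link each of its three arguments to that node by a binary arc. The sketch below, in Python, only illustrates this general technique; the function name reify and the role labels "arg1", "arg2" and "arg3" are placeholders introduced here and are not UNL relations.

```python
def reify(relation, roles_to_args):
    """Turn an n-ary relation into binary arcs attached to one node standing for the relation.

    Each arc is a (role, relation_node, argument) triple.
    """
    return [(role, relation, arg) for role, arg in roles_to_args.items()]


# "John is between Mary and Peter": a relation with valence 3.
arcs = reify("between", {"arg1": "John", "arg2": "Mary", "arg3": "Peter"})

for role, head, arg in arcs:
    print("{}({},{})".format(role, head, arg))
```

Every arc remains binary, at the cost of introducing "between" as a node, which is the situation described in option c) for "the book is on the table".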




1) "The book is on the table"

In a semantic network, semantic relations may have different levels of granularity. The sentence "Peter killed Mary" can be represented either as kill(Peter,Mary) or as agent(kill,Peter)patient(kill,Mary). The latter approach, which is associated with the idea of semantic case (or semantic role), is said to be more productive (i.e., more generalizable) than the former, but it poses some practical problems. Consider, for instance, the case of "the book is on the table". How should we represent the relation between "book" and "table" in the sentence "the book is on the table"?
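The two levels of granularity can be related mechanically. The following minimal sketch in Python rewrites the predicate form kill(Peter,Mary) into the role-based form; the function names to_roles and render_roles are hypothetical, and the default role inventory ("agent", "patient") is simply the one used in the example above, not a proposal for the set of relations.

```python
def to_roles(predicate, args, roles=("agent", "patient")):
    """Rewrite predicate(arg1, arg2, ...) as a list of (role, predicate, arg) triples."""
    if len(args) > len(roles):
        raise ValueError("not enough role labels for the arguments")
    return [(role, predicate, arg) for role, arg in zip(roles, args)]


def render_roles(triples):
    """Print triples in the role(predicate,arg) notation used above."""
    return "".join("{}({},{})".format(r, p, a) for r, p, a in triples)


print(render_roles(to_roles("kill", ["Peter", "Mary"])))
# agent(kill,Peter)patient(kill,Mary)
```

The same rewriting does not apply as easily to "the book is on the table", where it is unclear which item would play the part of the predicate; this is the practical problem raised above.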




1) How many UW's should be recognized in the sentence below?

"Charles Dickens is generally regarded as the most important English novelist of the Victorian period"

The basic assumption of the UNL approach is that the information conveyed by natural languages can be formally and usefully represented through semantic networks composed of three different types of discrete semantic entities: UW's, relations and attributes. UW's are nodes in the UNL graph; relations are arcs between nodes; and attributes are specifiers that restrict the extension of nodes. This three-layered representation poses several problems for UNLization, as the distinction between these three entities is not always clear. Consider, for instance, the sentence above:

  • How many UW's (either permanent or temporary) should be recognized in this sentence?
  • Should "Victorian period" be represented as a single UW ("Victorian period") or as two different UW's ("Victorian" and "period")?
  • Should the verb "to be" be represented as a UW or as a relation between "Charles Dickens" and "the most important English novelist of the Victorian period"? (Consider also the options "was" and "has been" in the same context.)
  • Should the preposition "of" be represented as a UW or as a relation between "the most important novelist" and "the Victorian period"? (Consider also the options "since", "from ... on", "in" or "during" instead of "of".)
  • Should "generally regarded as" be represented by UW's ("generally", "regarded", "as", for instance) or as an attribute (a downtoner, which lowers the truth effect of the declaration) to be assigned to the whole proposition "Charles Dickens is the most important English novelist of the Victorian period"?
  • Should the adverb "most" be represented as a UW or as a superlative marker (i.e., as an attribute to be assigned to the adjective "important")? (Consider also "greatest English novelist" instead of "most important English novelist".)
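The effect of these choices on the number of UW's can be illustrated with a small model of the three entity types. The sketch below, in Python, is only an illustration under stated assumptions: the Graph class is hypothetical, the relation name "rel" is a placeholder for whichever relation would be chosen, and "@superlative" is an invented attribute name, not one taken from the UNL Specs. It encodes a fragment of the sentence in two ways, a coarse one (one UW for "Victorian period", "most" as an attribute) and a fine one (two UW's for "Victorian period", "most" as a UW).

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """Nodes are UW's (label -> set of attributes); arcs are (relation, source, target)."""
    nodes: dict = field(default_factory=dict)
    arcs: list = field(default_factory=list)

    def add_uw(self, label, *attrs):
        """Register a UW and attach any attributes to it."""
        self.nodes.setdefault(label, set()).update(attrs)

    def relate(self, relation, source, target):
        """Add a binary arc, registering both end points as UW's."""
        for label in (source, target):
            self.add_uw(label)
        self.arcs.append((relation, source, target))


# Coarse analysis: "Victorian period" is one UW; "most" is an attribute of "important".
coarse = Graph()
coarse.relate("rel", "novelist", "Victorian period")
coarse.relate("rel", "novelist", "important")
coarse.add_uw("important", "@superlative")

# Fine analysis: "Victorian" and "period" are separate UW's; "most" is a UW.
fine = Graph()
fine.relate("rel", "novelist", "period")
fine.relate("rel", "period", "Victorian")
fine.relate("rel", "novelist", "important")
fine.relate("rel", "important", "most")

print(len(coarse.nodes), len(coarse.arcs))  # 3 2
print(len(fine.nodes), len(fine.arcs))      # 5 4
```

The same fragment yields three UW's under one analysis and five under the other, which is why a general procedure for drawing the line between UW's, relations and attributes is needed.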



1) Is there a set of semantic relations that can be said to be shared by all human languages?

Semantic networks have been used in language description at least since Charles S. Peirce (Manuscript 514, 1909), and as an interlingua for machine translation since 1956 (cf. Sowa 1987). In a semantic network, information is represented as a graph structure composed of nodes (concepts) and arcs between nodes (binary semantic relations between concepts). The idea of binary semantic relations has been proposed in numerous linguistic approaches, the most famous being the “structural syntax” developed by Lucien Tesnière in the 1930s and the “semantic case” presented by Charles Fillmore in 1968. The set of semantic relations, however, is rather controversial, even though it involves some basic common concepts, such as “agent”, “patient”, “instrument”, “place”, “time”, etc.

Consider, as an example, the semantic relation between "make" and "cake" in a sentence like "John made a cake". In different approaches, this relation has been described as "complement", "patient" or "result". Considering the nature and the role of natural language processing, what is the most productive way of representing such relations, if any? Is there a uniform set of relations that may be used to describe the semantic dependencies within any natural language?



How should we represent the relation between "make" and "cake" in "John made a cake"?

2)

Notes

Software