Relations

From UNL Wiki
Revision as of 20:12, 19 August 2013 by Martins (Talk | contribs)

In order to form a natural language sentence or a UNL graph, nodes are inter-related by relations. In the UNL framework, there can be three different types of relations between nodes:

  • the linear relation L, which defines the order of the elements in a list
  • syntactic relations (such as adjunct of the noun phrase, complement of the verbal phrase, specifier of the adjective phrase, etc.)
  • semantic relations (such as agent, object, manner, instrument, etc.)



Basic Symbols

Basic symbols used in the UNL framework

Symbol    Definition                                Example
( )       node                                      (%a)
" "       string                                    "went"
[ ]       natural language entry (headword)         [go]
[[ ]]     UW                                        [[to go(icl>to move)]]
/ /       regular expression                        /a{2,3}/ = aa,aaa
rel(x;y)  relation                                  agt(kill;Peter)
^         not                                       ^a = not a
{ | }     or                                        {a|b} = a or b
%         index for nodes, attributes and values    %x
:         scope ID                                  :01
#         index for sub-NLWs                        #01
=         attribute-value assignment                POS=NOU
!         rule trigger                              !PLR
&         merge operator                            %x&%y
?         dictionary lookup operator                ?[a]

Basic Concepts

Node
A node is the most elementary unit in the graph. It is the result of the tokenization process, and corresponds to the notion of "lexical item". At the surface level, a natural language sentence is considered a list of nodes, and a UNL graph a set of relations between nodes.
Relation
In order to form a natural language sentence or a UNL graph, nodes are inter-related by relations. In the UNL framework, there are three different types of relations: the linear (list) relation, syntactic relations and semantic relations.
Hyper-Node
A hyper-node is a sub-graph, i.e., a scope: a node containing relations between nodes.
Hyper-Relation
A hyper-relation is a relation between relations.
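
The four concepts above can be modelled with a few immutable records. This is only a minimal sketch in Python: the class names Node, Relation and HyperNode and their fields are assumptions made for illustration and are not part of the UNL specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A lexical item produced by tokenization (hypothetical model)."""
    string: str
    features: tuple = ()


@dataclass(frozen=True)
class Relation:
    """A named, ordered relation; args are Nodes, or Relations for a hyper-relation."""
    name: str
    args: tuple


@dataclass(frozen=True)
class HyperNode:
    """A scope: a node that contains relations between nodes."""
    scope_id: str
    relations: frozenset


kill = Node("kill")
peter = Node("Peter")

# At the surface level, a sentence is a list of nodes.
sentence = [Node("a"), Node("b")]

# A UNL graph is a set of relations between nodes.
graph = {Relation("agt", (kill, peter))}

# A hyper-node wraps a sub-graph under a scope ID.
sub_graph = HyperNode("01", frozenset(graph))
```

Because the arguments are stored as an ordered tuple, Relation("agt", (kill, peter)) and Relation("agt", (peter, kill)) compare as different objects.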

Notation

Relations are represented by the general syntax

rel(arg1;arg2;...;argn)

where

  • rel is the name of the relation; and
  • arg1, arg2, ..., argn are the arguments of the relation, i.e., nodes.
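
The general syntax can be read mechanically. The following Python sketch splits a relation into its name and its arguments; the function names split_top_level and parse_relation are hypothetical, and the sketch assumes that the relation name itself contains no opening parenthesis (so a regular expression with a group in place of the name is not handled).

```python
def split_top_level(text, sep):
    """Split text on sep, ignoring separators inside quotes, brackets or parentheses."""
    parts, depth, in_string, current = [], 0, False, []
    for ch in text:
        if ch == '"':
            in_string = not in_string
        elif not in_string:
            if ch in "([":
                depth += 1
            elif ch in ")]":
                depth -= 1
            elif ch == sep and depth == 0:
                parts.append("".join(current))
                current = []
                continue
        current.append(ch)
    parts.append("".join(current))
    return parts


def parse_relation(text):
    """Return (name, [arg1, ..., argn]) for a string such as 'agt(kill;Peter)'."""
    text = text.strip()
    open_paren = text.find("(")
    if open_paren == -1 or not text.endswith(")"):
        raise ValueError(f"not a relation: {text!r}")
    name = text[:open_paren]
    body = text[open_paren + 1:-1]
    # Arguments (nodes) are separated by semicolons.
    return name, split_top_level(body, ";")


name, args = parse_relation("agt(kill;Peter)")
print(name, args)
```

The arguments are returned as raw strings; the elements inside each node are left untouched at this stage.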

Types

In the UNL framework, there can be three different types of relations:

  • the linear relation L expresses the surface (list) structure of natural language sentences
  • syntactic relations express the syntactic (tree) structure of natural language sentences
  • semantic relations express the semantic (graph) structure of UNL graphs

Examples

Examples of relations:

  • ("a")("b") (a linear relation between two nodes: one having the string "a" and the other having the string "b")
  • L("a";"b") (the same as above)
  • VC(V;NP) (a syntactic relation VC between two nodes: one having the feature V and the other having the feature NP)
  • VC("a",V;"b",[[b]],LEX=N,NP) (a syntactic relation VC between two nodes: one having the string "a" and the feature V; and the other having the string "b", the UW b and the features LEX=N and NP)
  • agt("kill";N) (a semantic relation between two nodes: one having the string "kill" and the other having the feature N)

Properties

The linear relation is always binary and can be written in two equivalent formats:
  • L(%x;%y) or
  • (%x)(%y)

where L is the invariant name of the linear relation, and %x and %y are nodes.

Syntactic relations are not predefined, although we have been using a set of binary relations based on the X-bar theory.
Semantic relations constitute a predefined and closed set.
Arguments of relations are not commutative: the order of the elements in a relation affects the result.
  • (%x)(%y) is different from (%y)(%x)
  • relation(%x;%y) is different from relation(%y;%x)
Linear and semantic relations are always binary; syntactic relations may be n-ary.
  • L(%x;%y) - linear relation
  • agt(%x;%y) - semantic relation
  • VH(%x) - unary syntactic relation
  • VC(%x;%y) - binary syntactic relation
  • XX(%x;%y;%z) - possible ternary syntactic relation
Inside each relation, nodes are separated by a semicolon (;).
  • VC(%x;%y) - a relation between two nodes, %x and %y
  • VC(%x,%y) - does not relate two nodes: the comma separates the elements of a single node, not nodes
Inside each relation, a node may be referenced by any of its elements, separated by a comma (,).
  • ("a")([b]) - linear relation between a node where string = "a" and another node where headword = [b]
  • L([[c]];D) - linear relation between a node where UW = [[c]] and another node having the feature D
  • VC(%a;%b) - syntactic relation between a node where index = %a and another node where index = %b
  • agt("a",[a],[[a]],A;"b",[b],[[b]],B) - semantic relation between a node having the feature A where string = "a" AND headword = [a] AND UW = [[a]], and another node having the feature B where string = "b" AND headword = [b] AND UW = [[b]]
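
The elements of a single argument can be told apart by their delimiters. The Python sketch below sorts them into string, headword, UW, index and features; the function names split_on_commas and parse_node and the dictionary layout are assumptions made for illustration, and anything that is not a string, headword, UW or index is simply treated as a feature.

```python
def split_on_commas(text):
    """Split on commas that are outside quotes, brackets and parentheses."""
    parts, depth, in_string, current = [], 0, False, []
    for ch in text:
        if ch == '"':
            in_string = not in_string
        elif not in_string and ch in "([":
            depth += 1
        elif not in_string and ch in ")]":
            depth -= 1
        elif not in_string and ch == "," and depth == 0:
            parts.append("".join(current))
            current = []
            continue
        current.append(ch)
    parts.append("".join(current))
    return parts


def parse_node(text):
    """Classify the comma-separated elements of one node."""
    node = {"string": None, "headword": None, "uw": None, "index": None, "features": []}
    for element in split_on_commas(text):
        element = element.strip()
        if len(element) >= 2 and element[0] == '"' and element[-1] == '"':
            node["string"] = element[1:-1]       # "a"
        elif element.startswith("[[") and element.endswith("]]"):
            node["uw"] = element[2:-2]           # [[a]]
        elif element.startswith("[") and element.endswith("]"):
            node["headword"] = element[1:-1]     # [a]
        elif element.startswith("%"):
            node["index"] = element              # %a
        elif element:
            node["features"].append(element)     # A, LEX=N, ...
    return node


print(parse_node('"a",[a],[[a]],A'))
```

The UW test comes before the headword test because [[a]] also starts with a single bracket.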
Relations may be conjoined through juxtaposition.
  • ("a")("b")("c") - two linear relations: one between ("a") and ("b") and another between ("b") and ("c")
  • agt(%x;%y)obj(%x;%z) - two semantic relations: one between (%x) and (%y) and another between (%x) and (%z)
  • VC([a];[b]),VC([a];[c]) - incorrect: conjoined relations must not be separated by a comma
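
A chain of juxtaposed nodes can be unfolded into explicit binary L relations. The following is a minimal Python sketch of that reading, assuming the input contains only parenthesized nodes; the function names split_groups and linear_relations are hypothetical.

```python
def split_groups(text):
    """Return the contents of each top-level (...) group, in order."""
    groups, depth, in_string, start = [], 0, False, None
    for i, ch in enumerate(text):
        if ch == '"':
            in_string = not in_string
        elif in_string:
            continue
        elif ch == "(":
            if depth == 0:
                start = i
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                groups.append(text[start + 1:i])
    return groups


def linear_relations(chain):
    """Rewrite (%x)(%y)(%z)... as the list of L(%x;%y), L(%y;%z), ..."""
    nodes = split_groups(chain)
    # Each node is linked to its immediate successor only.
    return [f"L({left};{right})" for left, right in zip(nodes, nodes[1:])]


print(linear_relations('("a")("b")("c")'))
```

Three juxtaposed nodes therefore yield two linear relations, matching the first example above.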
Relations may be disjoined through {braces}.
  • {("a")|("b")}("c") - either ("a")("c") or ("b")("c")
  • {agt(%x;%y)|exp(%x;%y)}obj(%x;%z) - either agt(%x;%y)obj(%x;%z) or exp(%x;%y)obj(%x;%z)
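
The disjunction can be expanded into the list of alternatives it stands for. The Python sketch below does this recursively; the function name expand_disjunctions is hypothetical, and the sketch assumes that braces and vertical bars appear only as disjunction operators (braces inside regular expressions or quoted strings are not handled).

```python
def expand_disjunctions(pattern):
    """Return every alternative described by {a|b} groups in pattern."""
    start = pattern.find("{")
    if start == -1:
        return [pattern]

    # Find the brace that closes the first group.
    depth, end = 0, None
    for i in range(start, len(pattern)):
        if pattern[i] == "{":
            depth += 1
        elif pattern[i] == "}":
            depth -= 1
            if depth == 0:
                end = i
                break
    if end is None:
        raise ValueError("unbalanced braces")

    # Split the group on '|' characters that are not inside a nested group.
    options, level, current = [], 0, []
    for ch in pattern[start + 1:end]:
        if ch == "{":
            level += 1
        elif ch == "}":
            level -= 1
        if ch == "|" and level == 0:
            options.append("".join(current))
            current = []
        else:
            current.append(ch)
    options.append("".join(current))

    # Substitute each option and expand whatever groups remain.
    results = []
    for option in options:
        results.extend(expand_disjunctions(pattern[:start] + option + pattern[end + 1:]))
    return results


print(expand_disjunctions('{("a")|("b")}("c")'))
```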
Syntactic and semantic relations may be replaced by regular expressions.
  • /.{2,3}/(%x;%y) - any relation whose name is made of two or three characters, between %x and %y
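
Matching a relation name against such a pattern can be sketched with Python's re module. This assumes that the regular expression syntax used between the slashes is compatible with Python's for simple cases like the one above, which the text does not confirm; the function name relation_name_matches is hypothetical.

```python
import re


def relation_name_matches(pattern, name):
    """Compare a relation name with a literal name or a /regular expression/."""
    if len(pattern) >= 2 and pattern.startswith("/") and pattern.endswith("/"):
        # The whole name must match, not just a part of it.
        return re.fullmatch(pattern[1:-1], name) is not None
    return pattern == name


print(relation_name_matches("/.{2,3}/", "agt"))
print(relation_name_matches("/.{2,3}/", "L"))
```

A full match is used because the pattern describes the entire relation name: a one-character name such as L is rejected by /.{2,3}/.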
Unlike nodes, relations do not have elements (strings, headwords, features and indexes).

In rel("a",[a],[[a]],A;"b",[b],[[b]],B), the elements "a", "b", [a], [b], [[a]], [[b]], A and B belong to the arguments of the relation and not to the relation itself.

The scope of the relation is indicated by :XX, where XX is the scope ID (the main scope is 00 by default and is not shown).
  • agt("a";"b") is the same as agt:00("a";"b") (i.e., the relation agt belongs to the main scope)
  • agt:01("a";"b") (the relation agt belongs to the scope :01, i.e., a sub-graph inside the main graph)
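
The default can be made explicit when reading a relation. The following Python sketch returns the relation name together with its scope ID, falling back to 00 when no scope is written; the function name relation_scope is hypothetical.

```python
def relation_scope(text):
    """Return (name, scope_id) for a relation such as 'agt:01("a";"b")'."""
    # Everything before the first parenthesis is the name, optionally with ':XX'.
    head = text[:text.index("(")]
    name, colon, scope = head.partition(":")
    # A missing scope means the main scope, 00.
    return name, (scope if colon else "00")


print(relation_scope('agt("a";"b")'))
print(relation_scope('agt:01("a";"b")'))
```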