UNL2010

The specifications stated here are still experimental and tentative, and have been continuously extended and amended in order to be as comprehensive as possible. They follow the general strategies defined in the [http://www.undl.org UNL 2005 Specifications] (version of June 7th, 2005), but introduce several important changes derived from different UNLization experiences (Cratylus, EOLSS, Le Petit Prince, [[Iglu|IGLU]]). Although formally adopted in the UNDL Foundation tools, projects and certificates, they should not yet be taken as the official specs, as they are still under construction and have not been widely discussed with the UNL Community.
  
*[[Introduction to UNL]]
*[[Universal Words]]
*[[Universal Attributes]]
*[[Universal Relations]]
*[[UNL sentence|UNL sentence structure]]
*[[UNL document|UNL document structure]]

== PREMISES ==
These specifications are derived from three main premises:

Information conveyed by natural language can be represented by a natural-language-independent hyper-graph structure. Texts can be treated as a set of semantic nodes interlinked by semantic relations and modified by semantic attributes.

The UNL representation is an interpretation rather than a translation of a given text. The main goal of the UNLization process is to represent the knowledge structure of the source text, which should be detached from its verbal structure. This means that the UNL representation should not be committed to replicating the lexical and syntactic choices of the original, but should focus on representing, in a language-independent and non-ambiguous format, one of its possible readings, preferably the most conventional one.

The UNL representation should be as semantically complete as possible. Whenever possible, all the semantic valencies of the original text should be saturated, including anaphora, ellipses, presuppositions and implicatures. Pronouns and pro-forms, for instance, are expected to be replaced by their antecedents, and should not be represented in UNL, except in cases of exophoric reference (indefinite pronouns, interrogative pronouns and personal pronouns that are not co-indexed to any existing antecedent).

== THREE-LAYERED REPRESENTATION ==
The basic assumption of the UNL approach is that the meaning conveyed by natural language can be formally represented through three different types of semantic units: UWs, attributes and relations. This three-layered representation model is the cornerstone of UNL and its most distinctive feature compared with other semantic networks, which normally propose only two levels: edges and vertices.

[[Image:Unlgraph.jpg|center]]

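The three layers can be pictured as a small data model. The following is only a minimal sketch under stated assumptions: the class names (<code>Node</code>, <code>Relation</code>, <code>Graph</code>), the sample sentence "John built the house" and the choice of attributes are illustrative and are not part of the specifications.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Layer 1 and 2: a UW together with the attributes that modify it."""
    uw: str
    attributes: set[str] = field(default_factory=set)


@dataclass(frozen=True)
class Relation:
    """Layer 3: a labelled, directed arc from a source node to a target node."""
    label: str
    source: str
    target: str


@dataclass
class Graph:
    """Hypothetical container for nodes and the relations between them."""
    nodes: dict[str, Node] = field(default_factory=dict)
    relations: list[Relation] = field(default_factory=list)

    def add_node(self, uw, *attributes):
        # Create the node on first use, then merge in any new attributes.
        node = self.nodes.setdefault(uw, Node(uw))
        node.attributes.update(attributes)
        return node

    def relate(self, label, source, target):
        # Make sure both ends exist before recording the arc.
        for uw in (source, target):
            self.add_node(uw)
        relation = Relation(label, source, target)
        self.relations.append(relation)
        return relation


# Assumed reading of "John built the house" (illustrative only).
graph = Graph()
graph.add_node("build", "@past", "@entry")
graph.add_node("house", "@def")
graph.relate("agt", "build", "John")
graph.relate("obj", "build", "house")
```

Keeping attributes on the node rather than as separate arcs mirrors the idea that they are one-place predicates, while relations always need two nodes.
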
=== [[UW|UNIVERSAL WORDS (UWs)]] ===
[[Image:Uw.jpg|left]]Universal Words, or simply UWs, are the words of UNL, and correspond to the nodes in a UNL graph, which are interlinked by relations or modified by attributes. They are labels for relatively stable units of knowledge (the concepts) that can be associated with natural language '''open lexical categories (noun, verb, adjective and adverb)'''. The set of UWs is relatively open and is listed in the UNL Dictionary. Additionally, UWs are organized in a hierarchy (the UNL Ontology), defined in the UNL Knowledge Base (UNLKB) and exemplified in the UNL Example Base (UNLEB), which are the lexical databases for UNL.

UWs can be either simple (atomic) or complex (made out of other UWs). In the latter case, they are represented as hyper-nodes (i.e., sub-graphs). A simple UW is an integer which can also be represented, for better readability, as a unique character string split into two parts: a root and a suffix. The root can be a word, an expression, a phrase or even an entire sentence in any language, and should be interpreted as a label for a concept. The suffix, which is always introduced by a UNL relation, is used to disambiguate the root:

{| align=center cellpadding=5
|+UW for the concept of "a piece of furniture with tableware for a meal laid out on it"
!UNL Representation
!
!NL Representation
|-
|align=center|104379964
|
|table(icl>furniture)<br>table(icl>mobilier)<br>mesa(icl>mobiliario)<br>Tisch(icl>Möbel)<br>стол(icl>мебель)<br>...
|}

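The root-plus-suffix notation shown in the table can be split mechanically. The sketch below is an assumption-laden illustration: it only handles the simple <code>root(relation>target)</code> shape seen in the examples above, and the function name <code>split_uw</code> is hypothetical.

```python
import re

# Matches the shape root(relation>target), e.g. table(icl>furniture).
# Assumption: a single suffix with one relation label in lowercase letters.
UW_PATTERN = re.compile(
    r"^(?P<root>[^()]+)\((?P<relation>[a-z]+)>(?P<target>[^()]+)\)$"
)


def split_uw(uw):
    """Return (root, (relation, target)), or (uw, None) when there is no suffix."""
    match = UW_PATTERN.match(uw)
    if match is None:
        return uw, None
    return match["root"], (match["relation"], match["target"])


print(split_uw("table(icl>furniture)"))
print(split_uw("стол(icl>мебель)"))
print(split_uw("table"))
```

Because the root may be written in any language, the pattern deliberately accepts any characters other than parentheses for the root and the target.
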
As language-independent semantic units, UWs are equivalent to the sets of synonyms of a given language, approaching the concept of "synset" devised by WordNet (Fellbaum, 1998). As a matter of fact, the current UNL Dictionary has been automatically extracted from WordNet 3.0, and UWs have been represented as 9-digit strings with the following format:
<POS><WORDNETID>
where <POS> = {1,2,3,4}, with 1 = noun, 2 = verb, 3 = adjective and 4 = adverb; <br />
and <WORDNETID> is the synset ID in WordNet 3.0.

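The 9-digit format can be decoded with a few lines of code. This is a minimal sketch, assuming the first digit is the part-of-speech code and the remaining eight digits are the WordNet synset ID, as described above; the function name <code>decode_uw_id</code> is hypothetical.

```python
# Part-of-speech codes as listed in the format description.
POS_BY_DIGIT = {"1": "noun", "2": "verb", "3": "adjective", "4": "adverb"}


def decode_uw_id(uw_id):
    """Split a 9-digit UW into (part of speech, 8-digit synset ID)."""
    text = str(uw_id)
    if len(text) != 9 or not (text.isascii() and text.isdigit()):
        raise ValueError(f"expected a 9-digit UW, got {uw_id!r}")
    pos_digit, synset_id = text[0], text[1:]
    if pos_digit not in POS_BY_DIGIT:
        raise ValueError(f"unknown part-of-speech digit: {pos_digit}")
    return POS_BY_DIGIT[pos_digit], synset_id


print(decode_uw_id(104379964))  # ('noun', '04379964')
```

The synset ID is kept as a string so that its leading zero is not lost.
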
The current UNL Dictionary is, however, only a starting point, as the set of UWs is supposed to be as comprehensive as the set of different individual concepts depicted by different languages and cultures. In that sense, UWs are not to be considered semantic primitives, nor should they represent only common concepts, nor should they be derived from any particular language. They must include culture-dependent information and every relevant variation among similar concepts. Furthermore, the UNL Dictionary constitutes an open set, subject to permanent growth with new UWs, as UNL is supposed to continually incorporate new cultures and cultural changes.

=== [[Attributes|ATTRIBUTES]] ===
[[Image:Attribute.jpg|left]]Attributes are arcs linking a node to itself. In contrast to relations, they correspond to one-place predicates, i.e., functions that take a single argument. In UNL, attributes are always preceded by "@" and have normally been used to represent information conveyed by '''bound morphemes and closed classes''', such as:

*grammatical categories (gender, number, tense, aspect, mood, voice, etc.);
*determiners (articles and demonstratives);
*adpositions (prepositions, postpositions and circumpositions);
*auxiliary and quasi-auxiliary verbs (auxiliaries, modals, coverbs, preverbs);
*interjections;
*conjunctions;
*adverbs (specifiers);
*text structure (.@entry, .@topic, .@qfocus, .@emphasis, .@relative, etc.);
*speech acts (.@request, .@suggestion, .@offer, etc.);
*other context-dependent information (such as politeness, metaphor, irony, etc.).
The current set of attributes is presented below.

{{#tree:id=tagset|openlevels=0|root=Attributes|
*animacy
**@thing (inanimate)
**@person (human)
*[[aspect]]
**@continuative: continuous
**@experiential: experience
**@habitual: habitual
**@imperfective: uncompleted
**@inceptive: beginning
**@iterative: repetition
**@perfective: completed
**@persistent: persistent
**@progressive: ongoing
**@prospective: imminent
**@result: result
**@terminative: cessation
*[[degree]]
**negative
***@not: negative
***@almost: approximative
**positive
***@again: iterative
***@plus: intensified (very)
***@minus: downtoned (a little)
***@extra: excessively (too)
***@enough: sufficiently (enough)
**comparative
***@more: comparative of superiority
***@less: comparative of inferiority
***@equal: comparative of equality
**superlative
***@most: superlative of superiority
***@least: superlative of inferiority
*[[figure of speech]]
**Schemes
***@parallelism: use of similar structures in two or more clauses
***@antithesis: juxtaposition of opposing or contrasting ideas
***@climax: arrangement of words in order of increasing importance
***@anticlimax: arrangement of words in order of decreasing importance
***@anacoluthon: change in the syntax within a sentence
***@anastrophe: inversion of the usual word order
***@parenthesis: insertion of a clause or sentence in a place where it interrupts the natural flow of the sentence
***@apposition: placing of two elements side by side, in which the second defines the first
***@ellipsis: omission of words
***@asyndeton: omission of conjunctions between related clauses
***@brachylogia: omission of conjunctions between a series of words
***@alliteration: series of words that begin with the same letter or sound alike
***@anaphora: repetition of the same word or group of words at the beginning of successive clauses
***@anadiplosis: repetition of a word from the end of one clause at the beginning of another
***@antanaclasis: repetition of a word in two different senses
***@antimetabole: repetition of words in successive clauses, in reverse order
***@assonance: repetition of vowel sounds, most commonly within a short passage of verse
***@chiasmus: reversal of grammatical structures in successive clauses
***@consonance: repetition of consonant sounds without the repetition of the vowel sounds
***@epanalepsis: repetition of the initial word or words of a clause or sentence at the end of the clause or sentence
***@pleonasm: use of superfluous or redundant words
***@polyptoton: repetition of words derived from the same root
***@polysyndeton: repetition of conjunctions
***@symploce: combination of anaphora and epistrophe
**Tropes
***@anthropomorphism: ascribing human characteristics to something that is not human, such as an animal or a god (see zoomorphism)
***@antiphrasis: word or words used contradictory to their usual meaning, often with irony
***@antonomasia: substitution of a phrase for a proper name or vice versa
***@catachresis: use of an existing word to denote something that has no name in the current language
***@double_negative: grammatical construction in which negative words are repeated
***@dysphemism: substitution of a harsher, more offensive, or more disagreeable term for another (opposite of euphemism)
***@epanorthosis: immediate and emphatic self-correction, often following a slip of the tongue
***@euphemism: substitution of a less offensive or more agreeable term for another
***@hyperbole: use of exaggerated terms for emphasis
***@irony: use of a word in a way that conveys a meaning opposite to its usual meaning
***@metaphor: stating that one entity is another for the purpose of comparing them in quality
***@metonymy: substitution of a word to suggest what is really meant
***@onomatopoeia: words that sound like their meaning
***@oxymoron: using two terms together that normally contradict each other
***@paradox: use of apparently contradictory ideas to point out some underlying truth
***@paronomasia: a form of pun, in which words similar in sound but with different meanings are used
***@periphrasis: using several words instead of few
***@repetition: repeated usage of a word or group of words in the same sentence to create a poetic or rhythmic effect
***@synecdoche: form of metonymy, in which a part stands for the whole
***@synesthesia: description of one kind of sense impression by using words that normally describe another
***@zoomorphism: applying animal characteristics to humans or gods
*[[gender]]
**@male
**@female
*[[lexical category]]
**@adjective
**@adverb
**@noun
**@verb
*[[modality]]
**@ability
**@advice
**@agreement
**@assertion
**@assumption
**@belief
**@command
**@condition
**@conclusion
**@confirmation
**@consequence
**@conviction
**@decision
**@determination
**@deduction
**@desire
**@doubt
**@exclamation
**@exhortation
**@expectation
**@fear
**@hope
**@hypothesis
**@intention
**@interrogation
**@invitation
**@judgement
**@narrative
**@necessity
**@obligation
**@opinion
**@permission
**@possibility
**@probability
**@prediction
**@presumption
**@prohibition
**@promise
**@regret
**@request
**@speculation
**@suggestion
**@threat
**@warning
*[[person]]
**@1 (first person: speaker)
**@2 (second person: addressee)
**@3 (third person)
*[[polarity]]
**@yes (affirmative)
**@not (negative)
**@maybe (dubitative)
*[[quantification]]
**@no (none)
**@any (any)
**@all (all)
**@every (every)
**@singular (default)
**@pl (plural)
***@dual
***@trial
***@quadrual
***@paucal
***@multal
*[[register]]
**@archaic
**@colloquial
**@dialect
**@jargon
**@slang
**@taboo
*[[social deixis]]
**@familiar
**@intimate
**@polite
**@equivalent
**@inferior
**@superior
**@reverential
*[[specification]]
**@def (definite)
***@both (both)
***@distal (far from the speaker)
***@each (each)
***@either (either)
***@medial (near the addressee)
***@other (other)
***@own (own)
***@proximal (near the speaker)
***@same (same)
***@such (such)
**@indef (indefinite)
***@certain (certain)
***@wh
*[[time]]
**absolute time
***@past: at a time before the moment of utterance
***@present: at the moment of utterance
***@future: at a time after the moment of utterance
***@recent: close to the moment of utterance
***@remote: remote from the moment of utterance
**relative time
***@anterior: before some other time other than the time of utterance
***@posterior: after some other time other than the time of utterance
*[[voice]]
**@active: He built this house in 1895
**@passive: This house was built in 1895.
**@reflexive: He killed himself.
**@reciprocal: They killed each other.
}}

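Since attributes always begin with "@", a node carrying attributes can be separated into its UW and its attribute list. The sketch below assumes that attributes are appended to the node with a "." separator, as the ".@entry" and ".@topic" examples above suggest; the function name <code>split_attributes</code>, the sample node and the small category table (an excerpt of the tree above) are illustrative only.

```python
# Excerpt of the attribute tree above, mapping an attribute to its category.
# Assumption: only a handful of entries are included for illustration.
CATEGORY_BY_ATTRIBUTE = {
    "@past": "time",
    "@future": "time",
    "@pl": "quantification",
    "@def": "specification",
    "@indef": "specification",
    "@passive": "voice",
}


def split_attributes(node):
    """Split 'uw.@a.@b' into ('uw', ['@a', '@b'])."""
    head, *names = node.split(".@")
    return head, ["@" + name for name in names]


def categorize(attributes):
    """Group the known attributes by category; unknown ones go under None."""
    grouped = {}
    for attribute in attributes:
        category = CATEGORY_BY_ATTRIBUTE.get(attribute)
        grouped.setdefault(category, []).append(attribute)
    return grouped


uw, attributes = split_attributes("house(icl>building).@def.@pl")
print(uw, attributes)
print(categorize(attributes))
```

Splitting on ".@" rather than on "." alone keeps any dots inside the UW itself intact.
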
=== [[Relations|RELATIONS]] ===
[[Image:Relation.jpg|left]]Relations, formerly known as "links", are labelled arcs connecting one node to another in a UNL graph. They correspond to two-place semantic predicates holding between two [[Universal Words]]. In UNL, relations have normally been used to represent semantic cases or thematic roles (such as agent, object, instrument, etc.) associated with the interpretation of '''syntactic relations (such as specification, complementation and adjunction)'''. These functions are binary and directed (from a source to a target) and are claimed to be universal.

Relations are organized in a hierarchy in which upper nodes subsume lower nodes. The topmost level is the relation "rel", which simply indicates that there is a relation between two UWs. The next level comprises four general relations: '''participant''' (ptp), for the necessary arguments (subject and complements) of verbal predicates; '''attribute''' (aoj), for the necessary arguments (subject and complement) of nominal predicates; '''specifier''' (mod), for general specifiers; and '''adjunct''' (adj), for general adjuncts, including time, location and manner.

{{#tree:id=relations|openlevels=0|root=rel|
*ptp (participant)
**agt (agent, cause or natural force)
**cag (co-agent)
**obj (patient, theme)
**cob (co-object)
**ptn (partner)
**ins (instrument)
**ben (beneficiary)
**gol (recipient, addressee)
**aoj (experiencer)
*aoj (attribute)
**icl (hyponymy)
**iof (instance)
**pof (meronymy)
**nam (name)
**cnt (content, theme)
**equ (synonymy)
*mod (specifier)
**pos (possessor)
**qua (quantity)
**frm (origin)
**to (destination)
*adj (adjunct)
**and (conjunction)
**or (disjunction)
**plc (location)
***plf (initial place)
***plt (final place, direction, goal)
***via (intermediate place, path)
***scn (logical place)
**tim (time)
***tmf (initial time)
***tmt (final time)
***dur (duration)
***coo (co-occurrence)
***seq (sequence)
**man (manner)
***con (condition)
***met (method)
***pur (purpose)
***src (initial state)
***gol (final state)
***bas (basis for comparison)
***per (proportion, rate, distribution)
}}
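The relation hierarchy can be queried once it is stored as a nested structure. The following is a sketch only: <code>RELATION_TREE</code> copies just an excerpt of the tree above, and the helper names <code>paths_to</code> and <code>is_subsumed_by</code> are hypothetical. Because some labels (such as "gol") occur in more than one branch, the lookup returns every path rather than a single parent.

```python
# Excerpt of the relation tree above: each key maps to its sub-relations.
RELATION_TREE = {
    "rel": {
        "ptp": {"agt": {}, "obj": {}, "gol": {}},
        "mod": {"pos": {}, "qua": {}},
        "adj": {
            "plc": {"plf": {}, "plt": {}},
            "man": {"pur": {}, "gol": {}},
        },
    }
}


def paths_to(label, tree=RELATION_TREE, prefix=()):
    """Return every path from the root down to the given relation label."""
    found = []
    for name, children in tree.items():
        path = prefix + (name,)
        if name == label:
            found.append(path)
        found.extend(paths_to(label, children, path))
    return found


def is_subsumed_by(label, ancestor):
    """True if the ancestor appears above the label on at least one path."""
    return any(ancestor in path[:-1] for path in paths_to(label))


print(paths_to("plf"))               # [('rel', 'adj', 'plc', 'plf')]
print(paths_to("gol"))               # two paths: under ptp and under man
print(is_subsumed_by("plf", "adj"))  # True
```

A nested dictionary keeps duplicate labels apart by position, which a flat child-to-parent table could not do.
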

Latest revision as of 19:08, 16 August 2013
