Tagset

  [http://www.unlweb.net/unlarium/dictionary/export_tagset.php List of tags in alphabetical order]
 
{{#tree:id=tagset|openlevels=0|root=Tags|

*[[abstractness]] (ABN)
**abstract (ABT)
**concrete (CCT)
*[[adjacency]] (AJC)
**immediate (AJ0)
**nearest (AJ1)
**near (AJ2)
**distant (AJ3)
**most distant (AJ4)
*[[agreement]] (AGR)
**assigns case (ACAS)
**assigns gender (AGEN)
**assigns number (ANUM)
**assigns person (APER)
**assigns tense (ATNS)
**receives case (RCAS)
**receives gender (RGEN)
**receives number (RNUM)
**receives person (RPER)
**receives tense (RTNS)
*alienability (ALY)
**alienable (ALI)
**unalienable (NALI)
*[[animacy]] (ANI)
**animate (ANM)
**inanimate (NANM)
*[[aspect]] (ASP)
**aorist (AOR)
**causative (CAU)
**perfective (PFV)
**imperfective (NPFV)
***continuative (CTN)
****progressive (PGS)
***habitual (HAB)
***iterative (ITE)
**perfect (PFC)
***experiential perfect aspect (EXP)
***perfect of persistent situation (PSS)
***perfect of recent past (PRP)
***perfect of result (RES)
**prospective (PPT)
**inceptive (ICP)
**terminative (TER)
*cardinality (CAR)
**one single referent (ONE)
**a pair of referents (TWO)
**three referents (TRE)
**countable (CTB)
**uncountable (NCTB)
**collective (COL)
**more than one referent (PLU)
*[[case]] (CAS)
**abessive (ABE)
**ablative (ABL)
**accusative (ACC)
**adessive (ADE)
**allative (ALL)
**absolutive (ABS)
**benefactive (BEN)
**comitative (CMT)
**construct state (CTS)
**dative (DAT)
**delative (DEL)
**elative (ELA)
**equative (EQU)
**ergative (ERG)
**essive (ESS)
**genitive (GNT)
**hortative (HOR)
**illative (ILL)
**inessive (INE)
**instrumental (INS)
**lative (LAT)
**locative (LOC)
**nominative (NOM)
**oblique (OBL)
**prolative (PLT)
**prepositional (PPL)
**partitive (PTT)
**superessive (SPE)
**terminative (TRM)
**translative (TLT)
**vocative (VOC)
*definiteness (DFN)
**definite (DEF)
**generic (GNR)
**indefinite (NDEF)
**nonspecified (NSPC)
**specified (SPC)
*[[degree]] (DEG)
**augmentative (AUG)
**comparative (CMP)
**diminutive (DIM)
**positive (PST)
**superlative (SUP)
***absolute superlative (SUPA)
***comparative superlative (SUPR)
*[[distribution]] (DIS)
**after (AFT)
**before (BEF)
**end (END)
**free (FRE)
**front (FRT)
**immediately after (IAFT)
**immediately before (IBEF)
**middle (MID)
*[[information structure]] (IST)
**focus (FOC)
**rheme (RHE)
**theme (THE)
*[[gender]] (GEN)
**feminine (FEM)
**masculine (MCL)
**neuter (NEU)
**common (COM)
**variable (VAR)
*[[lexical category]] (LEX)
**[[adjective]] (J)
**[[adposition]] (P)
**[[adverb]] (A)
**[[affix]] (F)
**[[conjunction]] (C)
**[[determiner]] (D)
**[[inflection]] (I)
**[[noun]] (N)
**[[numeral]] (U)
**[[noun|proper noun]] (E)
**[[pronoun]] (R)
**[[verb]] (V)
**other (O)
*[[lexical structure]] (LST)
**subword (SBW)
**simple word (WRD)
***abbreviation (ABB) and single-word contraction
***clitic (CLI)
**multiword expression (MTW)
***acronym (ACR) and initialism
***multiple-word contraction (CTT) and blend
*[[modality]] (MOD)
**realis (REA)
**irrealis (NREA)
**alethic (ALE)
**deontic (DEO)
***commissive (CMS)
***directive (DRT)
***volitive (VLT)
**epistemic (EPI)
***evidentiality (EVI)
***judgment (JDG)
*[[mood]] (MOO)
**none (non-finite verb forms) (VBL)
***gerund (GER)
***gerundive (GDV)
***infinitive (INF)
***participle (PTP)
***supine (SPN)
**assumptive (AUM)
**causative (CAU)
**conditional (CON)
**declarative (DEC)
**deductive (DED)
**deliberative (DLB)
**dubitative (DUB)
**hypothetical (HYP)
**imperative (IMP)
**imprecative (IPC)
**indicative (IND)
**inferential (INFR)
**interrogative (INT)
**jussive (JUS)
**obligative (OBM)
**optative (OPT)
**permissive (PMS)
**potential (POT)
**precative (PCT)
**prohibitive (PHB)
**speculative (SPT)
**subjunctive (SUB)
*[[morphology]] (MOR)
**affix (AFF)
***inflectional affix (IAX)
***derivational affix (DAX)
**base form (BF)
***root (ROO)
***stem (STE)
**word form (WFO)
**alternative form (ALT)
***alternative form 1 (ALT1)
***alternative form 2 (ALT2)
***alternative form 3 (ALT3)
***short or weak form (SHO)
***long or strong form (STR)
*[[number]] (NUM)
**singular (SNG)
***singulare tantum (SNGT)
**plural (PLR)
***dual (DUA)
***trial (TRI)
***quadrual (QDR)
***paucal (PAU)
***multal (MUL)
***plurale tantum (PLRT)
**invariant (INV)
*[[part of speech]] (POS)
**[[adjective]]s (J)
***adjective (ADJ)
***participle (PTL)
**[[adposition]] (P)
***circumposition (CIR)
***postposition (PPS)
***preposition (PRE)
**[[adverb]] (A)
***specifier adverb (SAV)
***adjunct adverb (AAV)
***conjunct (CJT)
***disjunct (DJT)
**[[affix]] (F)
***circumfix (CCX)
***infix (IFX)
***prefix (PFX)
***suffix (SFX)
**[[conjunction]] (C)
***coordinating conjunction (COO)
****correlative conjunction (CRC)
***subordinating conjunction (SCJ)
****adverbializer (AVR)
****complementizer (CMR)
****relativizer (RVZ)
**[[determiner]] (D)
***article (ART)
***demonstrative determiner (DEM)
***possessive determiner (POD)
***quantifier (QUA)
**inflection (I)
***auxiliary verb (AUX)
****modal verb (MOV)
**[[noun]] (N)
***common noun (NOU)
**[[noun|proper noun]] (E)
***proper noun (PPN)
**[[numeral]] (U)
***DIGIT (digits)
****DOZEN (used to deal with dozens)
****HUNDRED (used to deal with hundreds)
***cardinal numeral (CDN)
***distributive numeral (DTN)
***partitive numeral (PTN)
***multiplicative numeral (MLN)
***ordinal numeral (ORD)
**[[pronoun]] (R)
***demonstrative pronoun (DEP)
***dummy pronoun (DUM)
***emphatic pronoun (EPR)
***indefinite pronoun (NPR)
***interrogative pronoun (IPR)
***personal pronoun (PPR)
***possessive pronoun (SPR)
***reciprocal pronoun (CPR)
***reflexive pronoun (FPR)
***relative pronoun (RPR)
**[[verb]] (V)
***full verb (VER)
***copula (COP)
**other (O)
***classifier (CLA)
***interjection (ITJ)
***particle (PTC)
***punctuation (PUT)
****blank (BLK)
****<nowiki>' </nowiki>(APOSTROPHE)
****<nowiki>- </nowiki>(HYPHEN)
****<nowiki>! </nowiki>(EMARK)
****<nowiki>" </nowiki>(QUOTE)
****<nowiki># </nowiki>(HASH)
****<nowiki>$ </nowiki>(DOLLAR)
****<nowiki>% </nowiki>(PERCENTAGE)
****<nowiki>& </nowiki>(AMPERSAND)
****<nowiki>( </nowiki>(OPARENTHESIS)
****<nowiki>) </nowiki>(CPARENTHESIS)
****<nowiki>* </nowiki>(ASTERISK)
****<nowiki>, </nowiki>(COMMA)
****<nowiki>. </nowiki>(PERIOD)
****<nowiki>/ </nowiki>(FSLASH)
****<nowiki>: </nowiki>(COLON)
****<nowiki>; </nowiki>(SEMICOLON)
****<nowiki>? </nowiki>(QMARK)
****<nowiki>[ </nowiki>(OSBRACKET)
****<nowiki>\ </nowiki>(BSLASH)
****<nowiki>] </nowiki>(CSBRACKET)
****<nowiki>{ </nowiki>(OCBRACE)
****<nowiki>} </nowiki>(CCBRACE)
****<nowiki>€ </nowiki>(EURO)
****<nowiki>+ </nowiki>(PLUS)
****<nowiki>< </nowiki>(LTHAN)
****<nowiki>= </nowiki>(EQUAL)
****<nowiki>> </nowiki>(GTHAN)
*[[person]] (PER)
**impersonal (NPER)
**first person (1PER)
***first person singular (1PS)
***first person plural (1PP)
****123PP (me, you and others)
****13PP (me and others)
**second person (2PER)
***second person singular (2PS)
***second person plural (2PP)
**third person (3PER)
***third person singular (3PS)
***third person plural (3PP)
*[[polarity]] (POL)
**affirmative (AFM)
**negative (NEG)
*[[register]] (REG)
**archaism (ARC)
**colloquialism (CLQ)
**dialect (DIA)
**jargon (JGN)
**literary (LIT)
**pejorative (PEJ)
**slang (SLG)
**taboo (TAB)
*[[social deixis]] (SOD)
**solidarity (SOL)
***familiar (FAM)
***intimate (ITM)
***polite (PLN)
**status (STS)
***equivalent (EVL)
***inferior (IFS)
***reverential (REV)
***superior (SPS)
*[[syntactic roles]] (SYN)
**adjunct (XA)
***adjunct to the head of an adjective phrase (JA)
***adjunct to the head of an adverbial phrase (AA)
***adjunct to the head of a complementizer phrase (CA)
***adjunct to the head of a determiner phrase (DA)
***adjunct to the head of an inflectional phrase (IA)
***adjunct to the head of a nominal phrase (NA)
***adjunct to the head of a prepositional phrase (PA)
***adjunct to the head of a verbal phrase (VA)
**complement (XC)
***complement of the head of an adjective phrase (JC)
***complement of the head of an adverbial phrase (AC)
***complement of the head of a complementizer phrase (CC)
***complement of the head of a determiner phrase (DC)
***complement of the head of an inflectional phrase (IC)
***complement of the head of a nominal phrase (NC)
***complement of the head of a prepositional phrase (PC)
***complement of the head of a verbal phrase (VC)
**head (XH)
***head of an adverbial phrase (AH)
***head of an adjective phrase (JH)
***head of a complementizer phrase (CH)
***head of a determiner phrase (DH)
***head of an inflectional phrase (IH)
***head of a nominal phrase (NH)
***head of a prepositional phrase (PH)
***head of a verbal phrase (VH)
**specifier (XS)
***specifier of the head of an adjective phrase (JS)
***specifier of the head of an adverbial phrase (AS)
***specifier of the head of a complementizer phrase (CS)
***specifier of the head of a determiner phrase (DS)
***specifier of the head of an inflectional phrase (IS)
***specifier of the head of a nominal phrase (NS)
***specifier of the head of a prepositional phrase (PS)
***specifier of the head of a verbal phrase (VS)
**maximal projection (XP)
***adjective phrase (JP)
***adverbial phrase (AP)
***complementizer phrase (CP)
***determiner phrase (DP)
***inflectional phrase (IP)
***nominal phrase (NP)
***prepositional phrase (PP)
***verbal phrase (VP)
**intermediate projection (XB)
***adverbial phrase (AB)
***adjective phrase (JB)
***complementizer phrase (CB)
***determiner phrase (DB)
***inflectional phrase (IB)
***nominal phrase (NB)
***prepositional phrase (PB)
***verbal phrase (VB)
**trace (TRACE)
*[[tense]] (TNS)
**absolute tense (ATE)
***past (PAS)
***present (PRS)
****preterit (PTR)
****hesternal past tense (HEP)
****prehesternal past tense (PEP)
****hodiernal past tense (HOP)
****prehodiernal past tense (POP)
****immediate past tense (IPT)
****nonrecent past tense (NRCP)
****recent past tense (RCP)
****nonremote past tense (NRMP)
****remote past tense (RMP)
***future (FUT)
****near future (FUN)
****remote future (FUR)
***nonpast (NPAS)
***nonfuture (NFUT)
***still (STL)
***not-yet (NYET)
**relative tense (RTE)
***relative past (RPT)
***relative nonpast (NRPT)
***relative present (RPS)
***relative future (RFT)
***relative nonfuture (NRFT)
*[[transitivity]] (TRA)
**no transitivity (NTRA) (linking verb)
**transitive (TST)
***direct transitive (TSTD)
***indirect transitive (TSTI)
***ditransitive (TST2)
***tritransitive (TST3)
**intransitive (NTST)
***unergative (NERG)
***unaccusative (NACC)
*[[Universal Attribute]]s (att)
**animacy attributes (ANIA)
**aspect attributes (ASPA)
**degree attributes (DEGA)
**emotion attributes (FEEL)
**figure of speech attributes (FIGA)
**gender attributes (GENA)
**information structure attributes (ISTA)
**lexical attributes (LEXA)
**manner attributes (HOW)
**modality attributes (MODA)
**person attributes (PERA)
**polarity attributes (POLA)
**place attributes (WHERE)
**quantification attributes (QUAA)
**register attributes (REGA)
**social deixis attributes (SODA)
**specification attributes (WHICH)
**syntactic structures (SYNA)
**time attributes (WHEN)
**voice attribute (VOIA)
*[[Universal Relations]] (rel)
*[[Universal Words]] (SEM)
**Adjective concepts
***age (AGE)
***colour (COR)
***dimension (DMS)
***human propensity (HPP)
***physical property (PHY)
***speed (SPD)
***value (VLE)
***other adjectives (JJJ)
**Adverbial concepts
***degree (DGR)
***manner (MAN)
***place (PLE)
***time (TME)
***other adverbs (AAA)
**Nominal concepts
***act or action (ACT)
***animal (ANL)
***artifact (ARF) (man-made objects)
***attribute (ATR) (of people and objects)
***body part (BON)
***cognitive processes and contents (CGN)
***communicative processes and contents (CMN)
***feelings and emotions (FEE)
***foods and drinks (FOO)
***groupings of people or objects (GRO)
***location (LCT) (spatial position)
***motive (MTV) (goals)
***natural events (NEV)
***natural objects (NOB) (non man-made objects)
***natural phenomena (PHE)
***plant (PLA)
***possession or transfer of possession (PON)
***natural process (NAT)
***person (HUM)
***quantities and units of measure (QTT)
***relations between people or things or ideas (REL)
***substance (SBS)
***shape (SHA) (two or three-dimensional shapes)
***state (STA) (stable states of affairs)
***time and temporal relations (TIM)
**Verbal concepts
***body action (BOV)
***cognitive verb (CGV)
***change (CHA)
***communication verb (CMV)
***competition (CPT)
***creation (CRE)
***consumption (CSM)
***contact (CTC)
***emotion (EMO)
***motion (MOT)
***perception (PCP)
***possession verb (POV)
***social (SOC)
***stative (STT)
***weather (WEA)
*[[valency]] (VAL)
**avalent (VAL0)
**monovalent (VAL1)
**divalent (VAL2)
**trivalent (VAL3)
**tetravalent (VAL4)
*[[voice]] (VOI)
**active voice (ACV)
**middle voice (MIV)
**passive voice (PSV)
*other
**System-defined values
***CHEAD (beginning of a scope)
***CTAIL (end of a scope)
***DIGIT (digits)
***SCOPE (scope)
***SHEAD (beginning of the sentence)
***STAIL (end of the sentence)
***TEMP (temporary entry - not found in the dictionary)
**Grammar-related attributes
***FLX (inflectional rules)
***FRA (subcategorization frame)
***GOV (subcategorization rules)
***PAR (inflectional paradigm)
***SFR (semantic frame)

}}
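
The tree markup encodes depth by the number of leading asterisks and gives each tag in parentheses after its name. As an illustration only, the following Python sketch reads a few such lines into a tag-to-parent map. The function name `parse_tree` and the regular expression are assumptions made for this example and are not part of the UNLarium software. Lines that put the tag first (such as "DIGIT (digits)") are skipped by this simple pattern, and a tag listed under more than one attribute keeps only its last parent.

```python
import re

# Hypothetical pattern: asterisks, a name, then the first upper-case tag in parentheses.
LINE = re.compile(r"^(\*+)\s*(.*?)\s*\(([A-Z0-9]+)\)")

def parse_tree(lines):
    """Map each tag to its parent tag (None for top-level attributes)."""
    parents = {}
    stack = []  # (depth, tag) pairs for the current path from the root
    for line in lines:
        m = LINE.match(line.strip())
        if not m:
            continue  # line does not follow the "name (TAG)" shape
        depth, tag = len(m.group(1)), m.group(3)
        # Drop entries that are at the same depth or deeper than this line.
        while stack and stack[-1][0] >= depth:
            stack.pop()
        parents[tag] = stack[-1][1] if stack else None
        stack.append((depth, tag))
    return parents

sample = [
    "*[[aspect]] (ASP)",
    "**imperfective (NPFV)",
    "***continuative (CTN)",
    "****progressive (PGS)",
    "***habitual (HAB)",
    "**perfect (PFC)",
]
tree = parse_tree(sample)
```

With the sample above, `tree["PGS"]` is `"CTN"` and `tree["HAB"]` is `"NPFV"`.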

Latest revision as of 13:03, 19 May 2015

The set of features in a UNL-driven dictionary depends on the structure of the natural language and may vary considerably. However, to better standardize lexical resources inside the UNL framework, the UNDL Foundation recommends the adoption of the following tags for some specific and pervasive grammatical phenomena. Several of these linguistic constants have already been proposed to the Data Category Registry (ISO 12620) and represent widely accepted linguistic concepts. Our main intention here is to provide a harmonized system to be shared by the UNL community, so as to make dictionaries as easily understandable and exchangeable as possible.

When to use the UNDLF Tagset

The UNDLF Tagset is required for providing lexical resources (dictionary entries and grammar rules) in the [http://www.undlfoundation.org/unlarium UNLarium] framework. Indeed, the whole environment has already been prepared to accept only the tags presented here. In most cases, the use of tags is unnoticeable and effortless, since users are supposed to make higher-level choices ("adjective", for instance), which are internally represented through the corresponding authorized labels ("ADJ"). However, in several circumstances, such as when creating inflectional paradigms or subcategorization frames, users are expected to address more fine-grained linguistic phenomena that may require a specialized metalanguage. That is exactly the purpose of this tagset: to provide the technical means for describing any linguistic behaviour, and to do so in a strongly standardised way, i.e., so that others can easily understand and exploit the data for their own benefit.

General Guidelines

In order to define the tags to be used in the UNDLF Tagset, the following premises were adopted:

  • Tags should be as comprehensive as possible (i.e., they should cover all widely accepted linguistic concepts)
  • Tags should be as few as possible (i.e., they should avoid redundancy)
  • Tags should be as short as possible (i.e., they should fit in a three-character string)
  • Tags should be as mnemonic as possible (i.e., they should be provided through English acronyms or abbreviations)
  • Tags should constitute a taxonomic hierarchy (so that upper level values could be inferred from the lower ones).

Additionally, the following conventions were adopted:

  • Tags are written in upper case letters;
  • Negation is represented by prefixation with "N-" (past = PAS, nonpast = NPAS).
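
The two conventions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `is_upper_case_tag` and `negation_base` and the small `KNOWN` set are invented for the example, not taken from any UNL tool. Note that the "N-" prefix cannot be recognised from the spelling alone, since tags such as NOM (nominative) or NEU (neuter) begin with "N" without being negations, so the sketch treats a tag as negated only when the remainder is itself a known tag.

```python
from typing import Optional, Set

def is_upper_case_tag(tag: str) -> bool:
    """Check the upper-case convention (digits are allowed, as in 1PS)."""
    return tag.isalnum() and tag == tag.upper()

def negation_base(tag: str, known_tags: Set[str]) -> Optional[str]:
    """Return the tag that `tag` negates, or None if it is not an N- negation."""
    if tag.startswith("N") and tag[1:] in known_tags:
        return tag[1:]
    return None

# A hand-picked excerpt of tags from the tagset, used only for this example.
KNOWN = {"PAS", "NPAS", "DEF", "NDEF", "NOM", "NEU"}
```

Here `negation_base("NPAS", KNOWN)` yields `"PAS"`, while `negation_base("NOM", KNOWN)` yields `None`, because "OM" is not a tag.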

We have tried to stick to the standard abbreviations proposed by the [http://www.eva.mpg.de/lingua/resources/glossing-rules.php Leipzig Glossing Rules] and by David Crystal in A Dictionary of Linguistics and Phonetics (2008), insofar as they comply with the rules above. The resulting set of tags, which is still subject to additions and revisions, is presented in the tree of attributes and values. For the time being, the definitions and examples have been extracted from the Glossary of Linguistic Terms (Loos et al.), available at [http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/ SIL International]. The tags are expected to migrate to an online environment, still under construction, where accredited linguists will have the opportunity to enhance and improve this repertoire.

Tree of attributes and values

The hierarchy of tags is depicted in the tree of attributes and values. The topmost level represents the attributes of which the tags are values. Lower positions imply the upper levels (for instance, progressive is a value of continuative, which is a value of imperfective, which is a value of the attribute aspect), but they are not mandatory, as they can be too specialized ("go" is just a verb, and not any of the subcategories of verb). In any case, natural language phenomena should be classified as deeply as possible in the tagset structure ("un-" should be classified as a prefix rather than as an affix).
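
To make the inference of upper levels concrete, here is a minimal Python sketch. The `PARENT` table is a hand-copied excerpt of the aspect and part-of-speech branches of the tree, and the function name `implied_tags` is an assumption for this example rather than an existing API.

```python
# Excerpt of child -> parent links, copied by hand from the tree.
PARENT = {
    "PGS": "CTN",   # progressive  -> continuative
    "CTN": "NPFV",  # continuative -> imperfective
    "NPFV": "ASP",  # imperfective -> aspect (attribute)
    "PFX": "F",     # prefix       -> affix
    "F": "POS",     # affix        -> part of speech (attribute)
}

def implied_tags(tag):
    """Return the upper-level tags implied by `tag`, nearest first."""
    chain = []
    while tag in PARENT:
        tag = PARENT[tag]
        chain.append(tag)
    return chain
```

Classifying "un-" as PFX therefore also records that it is an affix (F), whereas classifying it only as F would lose the more specific information. This is why the deepest applicable tag is preferred.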

