Tagset

From UNL Wiki
(Difference between revisions)
(Tree of attributes and values)
 
(246 intermediate revisions by 2 users not shown)
Line 1: Line 1:
The set of features in a UNL-driven dictionary depends on the structure of the natural language and may vary considerably. However, in order to better standardize lexical resources inside the UNL framework, the UNDL Foundation recommends the adoption of the following tags for some specific and pervasive grammatical phenomena. Several of those linguistic constants have already been proposed to the '''Data Category Registry''' (ISO 12620), and represent widely accepted linguistic concepts. Our main intention here is to provide a harmonized system to be shared by the UNL community so as to make dictionaries as easily understandable as possible.
+
The set of features in a UNL-driven dictionary depends on the structure of the natural language and may vary considerably. However, in order to better standardize lexical resources inside the UNL framework, the UNDL Foundation recommends the adoption of the following tags for some specific and pervasive grammatical phenomena. Several of those linguistic constants have already been proposed to the '''Data Category Registry''' (ISO 12620), and represent widely accepted linguistic concepts. Our main intention here is to provide a harmonized system to be shared by the UNL community so as to make dictionaries as easily understandable and exchangeable as possible.
 +
 
 +
== When to use the UNDLF Tagset ==
 +
 
 +
The UNDLF Tagset is required for providing lexical resources (dictionary entries and grammar rules) in the [http://www.undlfoundation.org/unlarium UNLarium] framework. Indeed, the whole environment has already been set up to accept only the tags presented here. In most cases, the use of tags is unnoticeable and effortless, since users are supposed to make higher-level choices ("adjective", for instance) which are internally represented through the corresponding authorized labels ("ADJ"). However, in several circumstances, such as when creating inflectional paradigms or subcategorization frames, users are expected to address more fine-grained linguistic phenomena that may require a specialized metalanguage. That is exactly the purpose of this tagset: to provide the technical means for describing any linguistic behaviour. And it should do so in a strongly standardised way, i.e., so that others can easily understand and exploit the data for their own benefit.
  
 
== General Guidelines ==  
 
  
In order to define the tags to be used in the UNL Tagset, the following premises were adopted:
+
In order to define the tags to be used in the UNDLF Tagset, the following premises were adopted:
* Tags should be as few as possible
+
*Tags should be as comprehensive as possible (i.e., they should cover all widely accepted linguistic concepts)
* Tags should be as short as possible  
+
*Tags should be as few as possible (i.e., they should avoid redundancy)
* Tags should be as mnemonic as possible
+
*Tags should be as short as possible (i.e., they should fit in a three-character string)
 +
*Tags should be as mnemonic as possible (i.e., they should be provided through English acronyms or abbreviations)
 +
*Tags should constitute a taxonomic hierarchy (so that upper level values could be inferred from the lower ones).
 +
 
 +
Additionally, the following conventions were adopted:
 +
*Tags are written in upper case letters;
 +
*Negation is represented by prefixation with "N-" (past = PAS, nonpast = NPAS).
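The negation convention above can be sketched in a few lines of Python. This is only an illustration, not part of the UNDLF tooling: the function names `negate` and `positive_counterpart` and the small `KNOWN` set are assumptions made for the example. Because several ordinary tags also begin with "N" (NOM, NEU, NUM), the sketch treats a tag as negative only when the remainder is itself a known tag.

```python
from typing import Optional, Set


def negate(tag: str) -> str:
    """Build the negative counterpart of a tag by prefixing "N" (PAS -> NPAS)."""
    return "N" + tag.upper()


def positive_counterpart(tag: str, known_tags: Set[str]) -> Optional[str]:
    """Return the positive tag that `tag` negates, or None.

    A leading "N" alone is not enough evidence of negation, so the
    remainder must also appear in the set of known tags.
    """
    tag = tag.upper()
    if tag.startswith("N") and tag[1:] in known_tags:
        return tag[1:]
    return None


# Hypothetical sample of tags taken from the tree in this page.
KNOWN = {"PAS", "ANM", "DEF", "PFV", "NOM", "NEU", "NUM"}

print(negate("PAS"))                        # NPAS
print(positive_counterpart("NANM", KNOWN))  # ANM
print(positive_counterpart("NOM", KNOWN))   # None (nominative is not a negation)
```

Checking against a known set rather than the first letter alone keeps tags such as NOM or NEU from being misread as negations.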
 +
 
 +
We have tried to stick to the standard abbreviations proposed by the [http://www.eva.mpg.de/lingua/resources/glossing-rules.php Leipzig Glossing Rules] and by David Crystal in ''A Dictionary of Linguistics and Phonetics'' (2008), insofar as they comply with the rules above. The resulting set of tags, which is still subject to additions and revisions, is presented below. For the time being, the definitions and examples have been extracted from the ''Glossary of Linguistic Terms'' (Loos et al.), available at [http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/ SIL International]. The tags are expected to migrate to an online environment, still under construction, where accredited linguists will have the opportunity to enhance and improve this repertoire.
 +
 
 +
== Tree of attributes and values ==
  
These assumptions led us to the following general guidelines:
+
The hierarchy of tags is depicted in the tree below. The topmost level represents the attributes of which the tags are values. Lower positions imply the upper levels (for instance, progressive is a value of continuative, which is a value of imperfective, which is a value of the attribute aspect), but they are not mandatory, as they can be too specialized ("go" is just a verb, and not any of the subcategories of verb). In any case, natural language phenomena should be classified as deeply as possible in the tagset structure ("un-" should be classified as a prefix, rather than as an affix).
* Tags should be made of a three-character upper-case string (except for negative values, which should be preceded by "N", such as NPFC = non-perfect);
+
* Tags should be labelled out of English words;
+
* Tags should be provided in an attribute-value structure, along with definitions and examples.
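The taxonomic premise (upper-level values can be inferred from lower ones) can be illustrated with a small Python sketch. It assumes the hierarchy is stored as a child-to-parent mapping; the `PARENT` table below copies only a handful of entries from the tree, and the helper names `ancestors` and `is_a` are hypothetical.

```python
from typing import Dict, List

# Child -> parent links for a small excerpt of the tag tree (illustrative only).
PARENT: Dict[str, str] = {
    "PGS": "CTN",   # progressive -> continuative
    "CTN": "NPFV",  # continuative -> imperfective
    "HAB": "NPFV",  # habitual -> imperfective
    "ITE": "NPFV",  # iterative -> imperfective
    "NPFV": "ASP",  # imperfective -> aspect
    "PFV": "ASP",   # perfective -> aspect
}


def ancestors(tag: str) -> List[str]:
    """List every upper-level tag implied by `tag`, nearest first."""
    chain: List[str] = []
    while tag in PARENT:
        tag = PARENT[tag]
        chain.append(tag)
    return chain


def is_a(tag: str, candidate: str) -> bool:
    """True if `candidate` is `tag` itself or one of its ancestors."""
    return tag == candidate or candidate in ancestors(tag)


print(ancestors("PGS"))     # ['CTN', 'NPFV', 'ASP']
print(is_a("PGS", "NPFV"))  # True
print(is_a("PFV", "NPFV"))  # False
```

With such a mapping, an entry tagged only with the deepest applicable value (PGS) still answers queries about any of its upper levels, which is why classifying as deeply as possible loses no information.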
+
  
The resulting set of tags, which is still subject to additions and revisions, is presented below. For the time being, the definitions and examples have been extracted from the ''Glossary of Linguistic Terms'' (Loos et al.), available at [http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/ SIL International], and are expected to migrate to an online environment, still under construction, where accredited linguists will have the opportunity to improve this repertoire.
+
[http://www.unlweb.net/unlarium/dictionary/export_tagset.php List of tags in alphabetical order]
  
== List of tags (in alphabetical order)==
+
{{#tree:id=tagset|openlevels=0|root=Tags|
  
{|
+
*[[abstractness]] (ABN)
|-1PP|First person plural|Deictic reference that refers to both the speaker and referents grouped with the speaker.|we
+
**abstract (ABT)
|-1PS|First person singular|Deictic reference that refers to the speaker.|I
+
**concrete (CCT)
|-2PP|Second person plural|Deictic reference to more than one referent identified as addressee.|you
+
*[[adjacency]] (AJC)
|-2PS|Second person singular|Deictic reference to a single referent identified as addressee.|you
+
**immediate (AJ0)
|-3PP|Third person plural|Deictic reference to more than one referent not identified as the speaker or addressee.|they
+
**nearest (AJ1)
|-3PS|Third person singular|Deictic reference to a single referent not identified as the speaker or addressee.|he
+
**near (AJ2)
|-AA|Adjunct to an adverb|An optional constituent of an adverbial phrase.|
+
**distant (AJ3)
|-AB|Adverbial Phrase (Intermediate Projection)||
+
**most distant (AJ4)
|-ABB|Abbreviation||Dr.
+
*[[agreement]] (AGR)
|-ABE|Abessive|A case that expresses the lack or absence of the referent of the noun it marks|
+
**assigns case (ACAS)
|-ABL|Ablative|A case that indicates movement from something, and/or cause|
+
**assigns gender (AGEN)
|-ABS|Abstract|A noun that denotes something viewed as a nonmaterial referent|
+
**assigns number (ANUM)
|-AC|Complement of an adverb|A phrasal or clausal category which is selected (subcategorized) by the head of a phrase.|
+
**assigns person (APER)
|-ACAS|Assigns Case|Used to indicate case agreement|
+
**assigns tense (ATNS)
|-ACC|Accusative|A case that indicates the direct object of a verb|him (in I saw him)
+
**receives case (RCAS)
|-ACR|Acronym||UNL
+
**receives gender (RGEN)
|-ACT|Acts or actions|Nouns denoting acts or actions|
+
**receives number (RNUM)
|-ACV|Active voice|When the subject is the agent or actor of the verb.|
+
**receives person (RPER)
|-ADJ|Adjective|Modifiers of nouns.|beautiful
+
**receives tense (RTNS)
|-ADV|Adverb|Modifiers of verbs and other constituent classes.|beautifully
+
*alienability (ALY)
|-AGEN|Assigns Gender|Used to indicate gender agreement|
+
**alienable (ALI)
|-ALL|Allative|A case that expresses motion to or toward the referent of the noun it marks.|
+
**inalienable (NALI)
|-ANL|Animal|Nouns denoting animals|
+
*[[animacy]] (ANI)
|-ANM|Animate|Indicates an animate reference|he, she
+
**animate (ANM)
|-ANUM|Assigns Number|Used to indicate number agreement|
+
**inanimate (NANM)
|-AP|Adverbial Phrase (Maximal Projection)||
+
*[[aspect]] (ASP)
|-APER|Assigns Person|Used to indicate person agreement|
+
**aorist (AOR)
|-ARF|Artifact|Nouns denoting man-made objects|
+
**causative (CAU)
|-ART|Article|Determiner that identifies a noun's definite or indefinite reference, and new or given status.|the
+
**perfective (PFV)
|-AS|Specifier of an adverb||
+
**imperfective (NPFV)
|-ASL|Absolutive|Case of nouns in ergative-absolutive languages that would generally be the subjects of intransitive verbs or the objects of transitive verbs in the translational equivalents of nominative-accusative languages such as English.|
+
***continuative (CTN)
|-ASP|ASPECT|The grammatical aspect (sometimes called viewpoint aspect) of a verb defines the temporal flow (or lack thereof) in the described event or state. In English, for example, the past-tense sentences "I swam" and "I was swimming" differ in aspect (the first sentence is in what is called the perfective or completive aspect, and the second in what is called the imperfective or durative aspect).|
+
****progressive (PGS)
|-ATST|Ambitransitive|A verb that can be used both as intransitive or as transitive without requiring a morphological change|read
+
***habitual (HAB)
|-ATT|Attribute|Nouns denoting attributes of people and objects|
+
***iterative (ITE)
|-AUX|Auxiliary verb|A verb which accompanies the lexical verb of a verb phrase, and expresses grammatical distinctions not carried by the lexical verb.|will
+
**perfect (PFC)
|-BEN|Benefactive|A case that expresses that the referent of the noun it marks receives the benefit of the situation expressed by the clause|
+
***experiential perfect aspect (EXP)
|-BON|Body parts|Nouns denoting body parts|
+
***perfect of persistent situation (PSS)
|-BOV|Body actions|Verbs of grooming, dressing and bodily care|
+
***perfect of recent past (PRP)
|-CA|Adjunct to a conjunction|An optional constituent of a complementizer phrase.|
+
***perfect of result (RES)
|-CAS|CASE|The case of a noun or pronoun indicates its grammatical function in a greater phrase or clause such as the role of subject or of direct object.|
+
**prospective (PPT)
|-CAU|Causative|A case which expresses that the referent of the noun it marks is the cause of the situation expressed by the clause.|
+
**inceptive (ICP)
|-CB|Conjunctional Phrase (Intermediate Projection)||
+
**terminative (TER)
|-CC|Complement of a conjunction|A phrasal or clausal category which is selected (subcategorized) by the head of a phrase.|
+
*cardinality (CAR)
|-CDN|Cardinal numeral|A numeral of the class whose members are considered basic in form, are used in counting, and in expressing how many objects are referred to.|two
+
**one single referent (ONE)
|-CGN|Cognition nouns|Nouns denoting cognitive processes and contents|
+
**a pair of referents (TWO)
|-CGV|Cognition verbs|Verbs of thinking, judging, analyzing, doubting|
+
**three referents (TRE)
|-CHA|Change|Verbs of size, temperature change, intensifying, etc.|
+
**countable (CTB)
|-CIR|Circumposition||
+
**uncountable (NCTB)
|-CMN|Communication nouns|Nouns denoting communicative processes and contents|
+
**collective (COL)
|-CMP|Comparative|An adjective that compares the quality with that of another of its kind|better
+
**more than one referent (PLU)
|-CMT|Comitative|A case expressing accompaniment.|
+
*[[case]] (CAS)
|-CMV|Communication verbs|Verbs of telling, asking, ordering, singing|
+
**abessive (ABE)
|-CON|Conditional mood|The form of the verb used in conditional sentences to refer to a hypothetical state of affairs, or an uncertain event that is contingent on another set of circumstances.|
+
**ablative (ABL)
|-COO|Coordinating conjunction|A conjunction that links constituents without syntactically subordinating one to the other.|and
+
**accusative (ACC)
|-COP|Copula|An intransitive verb which links a subject to a noun phrase, adjective, or other constituent which expresses the predicate.|be (to be)
+
**adessive (ADE)
|-CP|Conjunctional Phrase (Maximal Projection)||
+
**allative (ALL)
|-CPR|Reciprocal pronoun|A reciprocal pronoun is a pronoun that expresses a mutual feeling or action among the referents of a plural subject.|They hit [each other].
+
**absolutive (ABS)
|-CPT|Competition|Verbs of fighting, athletic activities|
+
**benefactive (BEN)
|-CRE|Creation|Verbs of sewing, baking, painting, performing|
+
**comitative (CMT)
|-CS|Specifier of a conjunction||
+
**construct state (CTS)
|-CSP|Consumption|Verbs of eating and drinking|
+
**dative (DAT)
|-CTC|Contact|Verbs of touching, hitting, tying, digging|
+
**delative (DEL)
|-CTN|Continuative||I am still eating.
+
**elative (ELA)
|-CTT|Contraction||don't
+
**equative (EQU)
|-DA|Adjunct of a determiner|An optional constituent of a determiner phrase.|
+
**ergative (ERG)
|-DAT|Dative case|A case that indicates the indirect object of a verb|us (in He gave us the book)
+
**essive (ESS)
|-DB|Determiner Phrase (Intermediate Projection)||
+
**genitive (GNT)
|-DC|Complement of a determiner|A phrasal or clausal category which is selected (subcategorized) by the head of a phrase.|
+
**hortative (HOR)
|-DEF|Definite|Specific and identifiable in a given context|the
+
**illative (ILL)
|-DEG|DEGREE|Describes the relational value of one thing with something in another clause of a sentence.|
+
**inessive (INE)
|-DEL|Delative|A case which expresses motion downward from the referent of the noun it marks.|
+
**instrumental (INS)
|-DEM|Demonstrative|A determiner that is used deictically to indicate a referent's spatial, temporal, or discourse location.|this
+
**lative (LAT)
|-DP|Determiner Phrase (Maximal Projection)||
+
**locative (LOC)
|-DS|Specifier of a determiner||
+
**nominative (NOM)
|-DTST|Ditransitive|A verb which takes a subject and two objects.|give
+
**oblique (OBL)
|-DUA|Dual|Number which refers to two members of the class identified by the noun.|
+
**prolative (PLT)
|-ELA|Elative|A case expressing motion out of or away from the referent of the noun it marks.|
+
**prepositional (PPL)
|-EMO|Emotion|Verbs of feeling|
+
**partitive (PTT)
|-EPR|Emphatic pronoun|An emphatic pronoun is a personal pronoun that is used to emphasize its referent.|[Moi], je suis Français.
+
**superessive (SPE)
|-EQU|Equative|A case that expresses likeness or identity to the referent of the noun it marks.|
+
**terminative (TRM)
|-ERG|Ergative|The case of nouns in ergative-absolutive languages that would generally be the subjects of transitive verbs in the translation equivalents of nominative-accusative languages such as English.|
+
**translative (TLT)
|-ESS|Essive|A case that expresses the temporary state of the referent specified by a noun.|
+
**vocative (VOC)
|-ET0|Past event tense|An absolute tense that refers to a time before the moment of utterance.|was (I was here)
+
*definiteness (DFN)
|-ET1|Present event tense|Absolute tense that refers to the moment of utterance|am (I am here)
+
**definite (DEF)
|-ET2|Future event tense|An absolute tense that refers to a time after the moment of utterance.|will be (I will be here)
+
**generic (GNR)
|-EVT|EVENT TENSE|A temporal linguistic quality expressing the time at, during, or over which a state or action denoted by a verb occurs with reference to the speaker.|
+
**indefinite (NDEF)
|-FEE|Feelings|Nouns denoting feelings and emotions|
+
**nonspecified (NSPC)
|-FEM|Feminine|A grammatical gender that marks nouns that have human or animal female referents, and often marks nouns that have referents that do not carry distinctions of sex.|she
+
**specified (SPC)
|-FOO|Food|Nouns denoting foods and drinks|
+
*[[degree]] (DEG)
|-FPR|Reflexive pronoun|A reflexive pronoun is a pronoun that has coreference with the subject.|He prides [himself] on his appearance.
+
**augmentative (AUG)
|-FRA|Fraction numeral||two thirds
+
**comparative (CMP)
|-GEN|GENDER||
+
**diminutive (DIM)
|-GER|Gerund||sleeping
+
**positive (PST)
|-GNT|Genitive|A case in which the referent of the marked noun is the possessor of the referent of another noun.|my
+
**superlative (SUP)
|-GRO|Group|Nouns denoting groupings of people or objects|
+
***absolute superlative (SUPA)
|-HAB|Habitual|An imperfective aspect that expresses the occurrence of an event or state as characteristic of a period of time.|I used to walk.
+
***comparative superlative (SUPR)
|-IA|Adjunct of an inflection|An optional constituent of an inflectional phrase.|
+
*[[distribution]] (DIS)
|-IB|Inflectional Phrase (Intermediate Projection)||
+
**after (AFT)
|-IC|Complement of an inflection|A phrasal or clausal category which is selected (subcategorized) by the head of a phrase.|
+
**before (BEF)
|-ICP|Inceptive||I started eating.
+
**end (END)
|-ILL|Illative|A case that expresses motion into or direction toward the referent of the noun it marks.|
+
**free (FRE)
|-IMP|Imperative|A grammatical mood that expresses direct commands or requests. It is also used to signal a prohibition, permission or any other kind of exhortation.|
+
**front (FRT)
|-IND|Indicative||
+
**immediately after (IAFT)
|-INE|Inessive|A case that expresses a location within the referent of the noun it marks.|
+
**immediately before (IBEF)
|-INF|Infinitive|The base form of a verb generally unmarked for inflectional categories.|be (to be)
+
**middle (MID)
|-INJ|Injunctive||
+
*[[information structure]] (IST)
|-INS|Instrumental|A case indicating that the referent of the noun it marks is the means of the accomplishment of the action expressed by the clause.|
+
**focus (FOC)
|-INX|Infix|Affix that is inserted within a root or stem.|
+
**rheme (RHE)
|-IP|Inflectional Phrase (Maximal Projection)||
+
**theme (THE)
|-IPR|Interrogative pronoun|A pro-form that is used in questions to stand for the item questioned.|who
+
*[[gender]] (GEN)
|-IS|Specifier of an inflection||
+
**feminine (FEM)
|-ITE|Iterative|Aspect that expresses the repetition of an event or state.|I ate it again and again.
+
**masculine (MCL)
|-ITJ|Interjection||hello
+
**neuter (NEU)
|-ITST|Indirect transitive|A verb which takes a subject and a single indirect object|
+
**common (COM)
|-JA|Adjunct of an adjective|An optional constituent of an adjective phrase.|
+
**variable (VAR)
|-JB|Adjective Phrase (Intermediate Projection)||
+
*[[lexical category]] (LEX)
|-JC|Complement of an adjective|A phrasal or clausal category which is selected (subcategorized) by the head of a phrase.|
+
**[[adjective]] (J)
|-JP|Adjective Phrase (Maximal Projection)||
+
**[[adposition]] (P)
|-JS|Specifier of an adjective||
+
**[[adverb]] (A)
|-LAT|Lative|A case that expresses motion up to the location of, or as far as the referent of the noun it marks.|
+
**[[affix]] (F)
|-LCT|Location|Nouns denoting spatial position|
+
**[[conjunction]] (C)
|-LEX|LEXICAL STATUS||
+
**[[determiner]] (D)
|-LOC|Locative|A case that expresses location at the referent of the noun it marks.|
+
**[[inflection]] (I)
|-MAF|Masculine and feminine|Variable gender|un après-midi = une après-midi
+
**[[noun]] (N)
|-MCL|Masculine|Includes most words that refer to males.|he
+
**[[numeral]] (U)
|-MID|Middle voice|A voice that indicates that the subject is the actor and acts upon himself or herself reflexively, or for his or her own benefit.|
+
**[[noun|proper noun]] (E)
|-MOF|Masculine or feminine|Common gender|le pianiste x la pianiste
+
**[[pronoun]] (R)
|-MOO|MOOD|A grammatical category of the verb that signals the speaker's attitude toward the situation described (for example, indicative, imperative, or subjunctive).|
+
**[[verb]] (V)
|-MOT|Motion|Verbs of walking, flying, swimming|
+
**other (O)
|-MOV|Modal verb||can
+
*[[lexical structure]] (LST)
|-MTV|Motive|Nouns denoting goals|
+
**subword (SBW)
|-MTW|Multiword expression|Any string comprising more than a word|United States of America
+
**simple word (WRD)
|-MUL|Multiplicative numeral|A numeral that expresses how many fold or how many times.|
+
***abbreviation (ABB) and single-word contraction
|-NA|Adjunct of a noun|An optional constituent of a noun phrase.|
+
***clitic (CLI)
|-NABS|Non-abstract (concrete)||
+
**multiword expression (MTW)
|-NANM|Inanimate|Indicates an inanimate reference|it
+
***acronym (ACR) and initialism
|-NB|Nominal Phrase (Intermediate Projection)||
+
***multiple-word contraction (CTT) and blend
|-NC|Complement of a noun|A phrasal or clausal category which is selected (subcategorized) by the head of a phrase.|
+
*[[modality]] (MOD)
|-NEU|Neuter|Includes mostly words that do not refer to males or females.|it
+
**realis (REA)
|-NEV|Natural events|Nouns denoting natural events|
+
**irrealis (NREA)
|-NOB|Natural objects|Nouns denoting natural objects (not man-made)|
+
**alethic (ALE)
|-NOM|Nominative|Indicates the subject of a finite verb.|I (in I saw him)
+
**deontic (DEO)
|-NOU|Noun||beauty
+
***commissive (CMS)
|-NP|Nominal Phrase (Maximal Projection)||
+
***directive (DRT)
|-NPFC|Imperfective|An event in the process of unfolding (often a repeated or habitual event)|I was swimming.
+
***volitive (VLT)
|-NPR|Indefinite pronoun|An indefinite pronoun is a pronoun that belongs to a class whose members indicate indefinite reference.|anybody, one, somebody
+
**epistemic (EPI)
|-NS|Specifier of a noun||
+
***evidentiality (EVI)
|-NTST|Intransitive|A verb that does not take an object|fall
+
***judgment (JDG)
|-NUM|NUMBER|A grammatical category of nouns, pronouns, and adjective and verb agreement that expresses count distinctions.|
+
*[[mood]] (MOO)
|-ONB|Ordinal numeral|A numeral belonging to a class whose members designate positions in a sequence.|second
+
**none (non-finite verb forms) (VBL)
|-OPT|Optative|A grammatical mood that indicates a wish or hope.|
+
***gerund (GER)
|-PA|Adjunct of a preposition|An optional constituent of a prepositional phrase.|
+
***gerundive (GDV)
|-PAR|Partitive|A case that expresses the partial nature of the referent of the noun it marks, as opposed to expressing the whole unit or class of which the referent is a part.|
+
***infinitive (INF)
|-PAU|Paucal||
+
***participle (PTP)
|-PAV|Passive voice|When the subject is the patient, target or undergoer of the action.|
+
***supine (SPN)
|-PB|Prepositional Phrase (Intermediate Projection)||
+
**assumptive (AUM)
|-PC|Complement of a preposition|A phrasal or clausal category which is selected (subcategorized) by the head of a phrase.|
+
**causative (CAU)
|-PCP|Perception|Verbs of seeing, hearing, feeling|
+
**conditional (CON)
|-PER|PERSON|A deictic reference to a participant in an event, such as the speaker, the addressee, or others.|
+
**declarative (DEC)
|-PFC|Perfective|A single event conceived as a unit|I swam.
+
**deductive (DED)
|-PFX|Prefix|Affix that is joined before a root or stem.|un
+
**deliberative (DLB)
|-PGS|Progressive|Continuous aspect that expresses processes, not states.|I am eating.
+
**dubitative (DUB)
|-PHE|Natural phenomena|Nouns denoting natural phenomena|
+
**hypothetical (HYP)
|-PLA|Plant|Nouns denoting plants|
+
**imperative (IMP)
|-PLR|Plural|Number that expresses reference to a quantity greater than that expressed by the largest specific number category in a language, such as "more than one" in English, and "more than two" in some other languages.|they
+
**imprecative (IPC)
|-PLT|Prolative|A case that expresses motion along or by the referent of the noun it marks.|
+
**indicative (IND)
|-PON|Possession|Nouns denoting possession and transfer of possession|
+
**inferential (INFR)
|-POS|PART OF SPEECH||
+
**interrogative (INT)
|-POV|Possession|Verbs of buying, selling, owning|
+
**jussive (JUS)
|-PP|Prepositional Phrase (Maximal Projection)||
+
**obligative (OBM)
|-PPL|Prepositional||
+
**optative (OPT)
|-PPN|Proper noun|Noun that is the name of a specific individual, place, or object.|Geneva
+
**permissive (PMS)
|-PPR|Personal pronoun|A personal pronoun is a pronoun that expresses a distinction of person deixis.|I, he, she, it, we
+
**potential (POT)
|-PPS|Postposition|Adposition that occurs after its complement.|
+
**precative (PCT)
|-PPT|Prospective||I am about to eat.
+
**prohibitive (PHB)
|-PRE|Preposition|Adposition that occurs before its complement.|against
+
**speculative (SPT)
|-PRO|Natural processes|Nouns denoting natural processes|
+
**subjunctive (SUB)
|-PRS|Person|Nouns denoting people|
+
*[[morphology]] (MOR)
|-PS|Specifier in Prepositional Phrase||
+
**affix (AFF)
|-PST|Positive||
+
***inflectional affix (IAX)
|-PTC|Particle|A word that does not belong to one of the main classes of words, is invariable in form, and typically has grammatical or pragmatic meaning.|to
+
***derivational affix (DAX)
|-PTP|Participle|A lexical item, derived from a verb, that has some of the characteristics and functions of both verbs and adjectives.|done
+
**base form (BF)
|-QDR|Quadrual||
+
***root (ROO)
|-QTT|Quantity|Nouns denoting quantities and units of measure|
+
***stem (STE)
|-QUA|Quantifier|A determiner that expresses a referent's definite or indefinite number or amount.|every
+
**word form (WFO)
|-RCAS|Receives Case|Used in case agreement.|
+
**alternative form (ALT)
|-REL|Relation|Nouns denoting relations between people or things or ideas|
+
***alternative form 1 (ALT1)
|-RGEN|Receives Gender|Used in gender agreement.|
+
***alternative form 2 (ALT2)
|-RNUM|Receives Number|Used in number agreement.|
+
***alternative form 3 (ALT3)
|-RPER|Receives Person|Used in person agreement.|
+
***short or weak form (SHO)
|-RPR|Relative pronoun|A relative pronoun is a pronoun that marks a relative clause, functions grammatically within the relative clause, and is coreferential to the word modified by the relative clause.|The man [who] comes next
+
***long or strong form (STR)
|-RT0|Past reference tense||had been (I had been here)
+
*[[number]] (NUM)
|-RT1|Present reference tense||
+
**singular (SNG)
|-RT2|Future reference tense|A relative tense that refers to a temporal reference point located in the future.|would have been (I would have been there)
+
***singulare tantum (SNGT)
|-RTE|REFERENCE TENSE|A temporal linguistic quality expressing the time at, during, or over which a state or action denoted by a verb occurs with reference to another state or action.|
+
**plural (PLR)
|-SBS|Substance|Nouns denoting substances|
+
***dual (DUA)
|-SBW|Subword (bound morpheme)|Any string smaller than a word (a root, a stem, etc)|bab (baby)
+
***trial (TRI)
|-SCJ|Subordinating conjunction|A conjunction that links constructions by making one of them a constituent of another.|if
+
***quadrual (QDR)
|-SEM|SEMANTIC FEATURES||
+
***paucal (PAU)
|-SFX|Suffix|Affix that is attached to the end of a root or stem.|s
+
***multal (MUL)
|-SHA|Shape|Nouns denoting two and three dimensional shapes|
+
***plurale tantum (PLRT)
|-SNG|Singular|Number that refers to one member of a designated class.|he
+
**invariant (INV)
|-SOC|Social|Verbs of political and social activities and events|
+
*[[part of speech]] (POS)
|-SP|Sentence Phrase (Maximal Projection)||
+
**[[adjective]]s (J)
|-SPE|Superessive|A case that expresses location on the referent of the noun it marks.|
+
***adjective (ADJ)
|-SPR|Possessive pronoun|A possessive pronoun is a pronoun that expresses ownership and relationships like ownership, such as kinship, and other forms of association.|my, mine
+
***participle (PTL)
|-STA|State|Nouns denoting stable states of affairs|
+
**[[adposition]] (P)
|-STT|Stative|Verbs of being, having, spatial relations|
+
***circumposition (CIR)
|-SUB|Subjunctive|A verb mood typically used in dependent clauses to express wishes, commands, emotion, possibility, judgment, opinion, necessity, or statements that are contrary to fact at present|
+
***postposition (PPS)
|-SUP|Superlative|An adjective that compares the quality with many or all others of its kind|best
+
***preposition (PRE)
|-SYN|SYNTACTIC ROLES||
+
**[[adverb]] (A)
|-TER|Terminative||I finished eating.
+
***specifier adverb (SAV)
|-TIM|Time|Nouns denoting time and temporal relations|
+
***adjunct adverb (AAV)
|-TLT|Translative|A case indicating that the referent of the noun, or the quality of the adjective, that it marks is the result of a process of change.|
+
***conjunct (CJT)
|-TRA|TRANSITIVITY|A property of verbs that relates to whether a verb can take direct objects.|
+
***disjunct (DJT)
|-TRI|Trial|A number that refers to three members of the designated class.|
+
**[[affix]] (F)
|-TST|Direct transitive|A verb which takes a subject and a single direct object|kiss
+
***circumfix (CCX)
|-TTST|Tritransitive|A verb which takes a subject and three objects.|trade
+
***infix (IFX)
|-VA|Adjunct of a verb|An optional constituent of a verbal phrase.|
+
***prefix (PFX)
|-VAL|VALENCY|Verb valency or valence refers to the number of arguments controlled by a verbal predicate.|
+
***suffix (SFX)
|-VAL0|Avalent|An avalent verb takes no arguments|rain
+
**[[conjunction]] (C)
|-VAL1|Monovalent|A monovalent verb takes one argument|sleep
+
***coordinating conjunction (COO)
|-VAL2|Divalent|A verb which takes two arguments|eat
+
****correlative conjunction (CRC)
|-VAL3|Trivalent|A trivalent verb takes three arguments|give
+
***subordinating conjunction (SCJ)
|-VAL4|Tetravalent|A tetravalent verb takes four arguments|
+
****adverbializer (AVR)
|-VB|Verbal Phrase (Intermediate Projection)||
+
****complementizer (CMR)
|-VC|Complement of a verb|A phrasal or clausal category which is selected (subcategorized) by the head of a phrase.|
+
****relativizer (RVZ)
|-VER|Verb||buy
+
**[[determiner]] (D)
|-VOC|Vocative|A case that marks a noun whose referent is being addressed.|
+
***article (ART)
|-VOI|VOICE|The voice (also called diathesis) of a verb describes the relationship between the action (or state) that the verb expresses and the participants identified by its arguments (subject, object, etc.).|
+
***demonstrative determiner (DEM)
|-VP|Verbal phrase||
+
***possessive determiner (POD)
|-VS|Specifier of a verb||
+
***quantifier (QUA)
|-WEA|Weather|Verbs of raining, snowing, thawing, thundering|
+
**inflection (I)
|-WRD|Regular word||
+
***auxiliary verb (AUX)
|}
+
****modal verb (MOV)
 +
**[[noun]] (N)
 +
***common noun (NOU)
 +
**[[noun|proper noun]] (E)
 +
***proper noun (PPN)
 +
**[[numeral]] (U)
 +
***DIGIT (digits)
 +
****DOZEN (used to deal with dozens)
 +
****HUNDRED (used to deal with hundreds)
 +
***cardinal numeral (CDN)
 +
***distributive numeral (DTN)
 +
***partitive numeral (PTN)
 +
***multiplicative numeral (MLN)
 +
***ordinal numeral (ORD)
 +
**[[pronoun]] (R)
 +
***demonstrative pronoun (DEP)
 +
***dummy pronoun (DUM)
 +
***emphatic pronoun (EPR)
 +
***indefinite pronoun (NPR)
 +
***interrogative pronoun (IPR)
 +
***personal pronoun (PPR)
 +
***possessive pronoun (SPR)
 +
***reciprocal pronoun (CPR)
 +
***reflexive pronoun (FPR)
 +
***relative pronoun (RPR)
 +
**[[verb]] (V)
 +
***full verb (VER)
 +
***copula (COP)
 +
**other (O)
 +
***classifier (CLA)
 +
***interjection (ITJ)
 +
***particle (PTC)
 +
***punctuation (PUT)
 +
****blank (BLK)
 +
****<nowiki>' </nowiki>(APOSTROPHE)
 +
****<nowiki>- </nowiki>(HYPHEN)
 +
****<nowiki>! </nowiki>(EMARK)
 +
****<nowiki>" </nowiki>(QUOTE)
 +
****<nowiki># </nowiki>(HASH)
 +
****<nowiki>$ </nowiki>(DOLLAR)
 +
****<nowiki>% </nowiki>(PERCENTAGE)
 +
****<nowiki>& </nowiki>(AMPERSAND)
 +
****<nowiki>( </nowiki>(OPARENTHESIS)
 +
****<nowiki>) </nowiki>(CPARENTHESIS)
 +
****<nowiki>* </nowiki>(ASTERISK)
 +
****<nowiki>, </nowiki>(COMMA)
 +
****<nowiki>. </nowiki>(PERIOD)
 +
****<nowiki>/ </nowiki>(FSLASH)
 +
****<nowiki>: </nowiki>(COLON)
 +
****<nowiki>; </nowiki>(SEMICOLON)
 +
****<nowiki>? </nowiki>(QMARK)
 +
****<nowiki>[ </nowiki>(OSBRACKET)
 +
****<nowiki>\ </nowiki>(BSLASH)
 +
****<nowiki>] </nowiki>(CSBRACKET)
 +
****<nowiki>{ </nowiki>(OCBRACE)
 +
****<nowiki>} </nowiki>(CCBRACE)
 +
****<nowiki>€ </nowiki>(EURO)
 +
****<nowiki>+ </nowiki>(PLUS)
 +
****<nowiki>< </nowiki>(LTHAN)
 +
****<nowiki>= </nowiki>(EQUAL)
 +
****<nowiki>> </nowiki>(GTHAN)
 +
*[[person]] (PER)
 +
**impersonal (NPER)
 +
**first person (1PER)
 +
***first person singular (1PS)
 +
***first person plural (1PP)
 +
****123PP (me, you and others)
 +
****13PP (me and others)
 +
**second person (2PER)
 +
***second person singular (2PS)
 +
***second person plural (2PP)
 +
**third person (3PER)
 +
***third person singular (3PS)
 +
***third person plural (3PP)
 +
*[[polarity]] (POL)
 +
**affirmative (AFM)
 +
**negative (NEG)
 +
*[[register]] (REG)
 +
**archaism (ARC)
 +
**colloquialism (CLQ)
 +
**dialect (DIA)
 +
**jargon (JGN)
 +
**literary (LIT)
 +
**pejorative (PEJ)
 +
**slang (SLG)
 +
**taboo (TAB)
 +
*[[social deixis]] (SOD)
 +
**solidarity (SOL)
 +
***familiar (FAM)
 +
***intimate (ITM)
 +
***polite (PLN)
 +
**status (STS)
 +
***equivalent (EVL)
 +
***inferior (IFS)
 +
***reverential (REV)
 +
***superior (SPS)
 +
*[[syntactic roles]] (SYN)
 +
**adjunct (XA)
 +
***adjunct to the head of an adjective phrase (JA)
 +
***adjunct to the head of an adverbial phrase (AA)
 +
***adjunct to the head of a complementizer phrase (CA)
 +
***adjunct to the head of a determiner phrase (DA)
 +
***adjunct to the head of an inflectional phrase (IA)
 +
***adjunct to the head of a nominal phrase (NA)
 +
***adjunct to the head of a prepositional phrase (PA)
 +
***adjunct to the head of a verbal phrase (VA)
 +
**complement (XC)
 +
***complement of the head of an adjective phrase (JC)
 +
***complement of the head of an adverbial phrase (AC)
 +
***complement of the head of a complementizer phrase (CC)
 +
***complement of the head of a determiner phrase (DC)
 +
***complement of the head of an inflectional phrase (IC)
 +
***complement of the head of a nominal phrase (NC)
 +
***complement of the head of a prepositional phrase (PC)
 +
***complement of the head of a verbal phrase (VC)
 +
**head (XH)
 +
***head of an adverbial phrase (AH)
 +
***head of an adjective phrase (JH)
 +
***head of a complementizer phrase (CH)
 +
***head of a determiner phrase (DH)
 +
***head of an inflectional phrase (IH)
 +
***head of a nominal phrase (NH)
 +
***head of a prepositional phrase (PH)
 +
***head of a verbal phrase (VH)
 +
**specifier (XS)
 +
***specifier of the head of an adjective phrase (JS)
 +
***specifier of the head of an adverbial phrase (AS)
 +
***specifier of the head of a complementizer phrase (CS)
 +
***specifier of the head of a determiner phrase (DS)
 +
***specifier of the head of an inflectional phrase (IS)
 +
***specifier of the head of a nominal phrase (NS)
 +
***specifier of the head of a prepositional phrase (PS)
 +
***specifier of the head of a verbal phrase (VS)
 +
**maximal projection (XP)
 +
***adjective phrase (JP)
 +
***adverbial phrase (AP)
 +
***complementizer phrase (CP)
 +
***determiner phrase (DP)
 +
***inflectional phrase (IP)
 +
***nominal phrase (NP)
 +
***prepositional phrase (PP)
 +
***verbal phrase (VP)
 +
**intermediate projection (XB)
 +
***adverbial phrase (AB)
 +
***adjective phrase (JB)
 +
***complementizer phrase (CB)
 +
***determiner phrase (DB)
 +
***inflectional phrase (IB)
 +
***nominal phrase (NB)
 +
***prepositional phrase (PB)
 +
***verbal phrase (VB)
 +
**trace (TRACE)
 +
*[[tense]] (TNS)
 +
**absolute tense (ATE)
 +
***present (PRS)
 +
***past (PAS)
 +
****preterit (PTR)
 +
****hesternal past tense (HEP)
 +
****prehesternal past tense (PEP)
 +
****hodiernal past tense (HOP)
 +
****prehodiernal past tense (POP)
 +
****immediate past tense (IPT)
 +
****nonrecent past tense (NRCP)
 +
****recent past tense (RCP)
 +
****nonremote past tense (NRMP)
 +
****remote past tense (RMP)
 +
***future (FUT)
 +
****near future (FUN)
 +
****remote future (FUR)
 +
***nonpast (NPAS)
 +
***nonfuture (NFUT)
 +
***still (STL)
 +
***not-yet (NYET)
 +
**relative tense (RTE)
 +
***relative past (RPT)
 +
***relative nonpast (NRPT)
 +
***relative present (RPS)
 +
***relative future (RFT)
 +
***relative nonfuture (NRFT)
 +
*[[transitivity]] (TRA)
 +
**no transitivity (NTRA) (linking verb)
 +
**transitive (TST)
 +
***direct transitive (TSTD)
 +
***indirect transitive (TSTI)
 +
***ditransitive (TST2)
 +
***tritransitive (TST3)
 +
**intransitive (NTST)
 +
***unergative (NERG)
 +
***unaccusative (NACC)
 +
*[[Universal Attribute]]s (att)
 +
**animacy attributes (ANIA)
 +
**aspect attributes (ASPA)
 +
**degree attributes (DEGA)
 +
**emotion attributes (FEEL)
 +
**figure of speech attributes (FIGA)
 +
**gender attributes (GENA)
 +
**information structure attributes (ISTA)
 +
**lexical attributes (LEXA)
 +
**manner attributes (HOW)
 +
**modality attributes (MODA)
 +
**person attributes (PERA)
 +
**polarity attributes (POLA)
 +
**place attributes (WHERE)
 +
**quantification attributes (QUAA)
 +
**register attributes (REGA)
 +
**social deixis attributes (SODA)
 +
**specification attributes (WHICH)
 +
**syntactic structures (SYNA)
 +
**time attributes (WHEN)
 +
**voice attribute (VOIA)
 +
*[[Universal Relations]] (rel)
 +
*[[Universal Words]] (SEM)
 +
**Adjective concepts
 +
***age (AGE)
 +
***colour (COR)
 +
***dimension (DMS)
 +
***human propensity (HPP)
 +
***physical property (PHY)
 +
***speed (SPD)
 +
***value (VLE)
 +
***other adjectives (JJJ)
 +
**Adverbial concepts
 +
***degree (DGR)
 +
***manner (MAN)
 +
***place (PLE)
 +
***time (TME)
 +
***other adverbs (AAA)
 +
**Nominal concepts
 +
***act or action (ACT)
 +
***animal (ANL)
 +
***artifact (ARF) (man-made objects)
 +
***attribute (ATR) (of people and objects)
 +
***body part (BON)
 +
***cognitive processes and contents (CGN)
 +
***communicative processes and contents (CMN)
 +
***feelings and emotions (FEE)
 +
***foods and drinks (FOO)
 +
***groupings of people or objects (GRO)
 +
***location (LCT) (spatial position)
 +
***motive (MTV) (goals)
 +
***natural events (NEV)
 +
***natural objects (NOB) (non man-made objects)
 +
***natural phenomena (PHE)
 +
***plant (PLA)
 +
***possession or transfer of possession (PON)
 +
***natural process (NAT)
 +
***person (HUM)
 +
***quantities and units of measure (QTT)
 +
***relations between people or things or ideas (REL)
 +
***substance (SBS)
 +
***shape (SHA) (two or three-dimensional shapes)
 +
***state (STA) (stable states of affairs)
 +
***time and temporal relations (TIM)
 +
**Verbal concepts
 +
***body action (BOV)
 +
***cognitive verb (CGV)
 +
***change (CHA)
 +
***communication verb (CMV)
 +
***competition (CPT)
 +
***creation (CRE)
 +
***consumption (CSM)
 +
***contact (CTC)
 +
***emotion (EMO)
 +
***motion (MOT)
 +
***perception (PCP)
 +
***possession verb (POV)
 +
***social (SOC)
 +
***stative (STT)
 +
***weather (WEA)
 +
*[[valency]] (VAL)
 +
**avalent (VAL0)
 +
**monovalent (VAL1)
 +
**divalent (VAL2)
 +
**trivalent (VAL3)
 +
**tetravalent (VAL4)
 +
*[[voice]] (VOI)
 +
**active voice (ACV)
 +
**middle voice (MIV)
 +
**passive voice (PSV)
 +
*other
 +
**System-defined values
 +
***CHEAD (beginning of a scope)
 +
***CTAIL (end of a scope)
 +
***DIGIT (digits)
 +
***SCOPE (scope)
 +
***SHEAD (beginning of the sentence)
 +
***STAIL (end of the sentence)
 +
***TEMP (temporary entry - not found in the dictionary)
 +
**Grammar-related attributes
 +
***FLX (inflectional rules)
 +
***FRA (subcategorization frame)
 +
***GOV (subcategorization rules)
 +
***PAR (inflectional paradigm)
 +
***SFR (semantic frame)
 +
}}

Latest revision as of 13:03, 19 May 2015

The set of features in a UNL-driven dictionary depends on the structure of the natural language and may vary considerably. However, in order to better standardize lexical resources inside the UNL framework, the UNDL Foundation recommends the adoption of the following tags for some specific and pervasive grammatical phenomena. Several of those linguistic constants have already been proposed to the Data Category Registry (ISO 12620), and represent widely accepted linguistic concepts. Our main intention here is just to provide a harmonized system to be shared by the UNL community so as to make dictionaries as easily understandable and exchangeable as possible.

When to use the UNDLF Tagset

The UNDLF Tagset is required for providing lexical resources (dictionary entries and grammar rules) in the UNLarium framework. Indeed, the whole environment has already been prepared to accept only the tags presented here. In most cases, the use of tags is unnoticeable and effortless, since users are supposed to make higher-level choices ("adjective", for instance) which will be internally represented through the corresponding authorized labels ("ADJ"). However, in several circumstances, as when creating inflectional paradigms or subcategorization frames, users are expected to address more fine-grained linguistic phenomena that may require a specialized metalanguage. That is exactly the purpose of this tagset: to provide the technical means for describing any linguistic behaviour. And it should do that in a strongly standardised way, i.e., so that others can easily understand and exploit the data for their own benefit.
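
As an illustration only, the mapping from a higher-level choice to its authorized label can be pictured as a simple lookup. The sketch below is not the UNLarium implementation: the names AUTHORIZED_LABELS and label_for are hypothetical, and the table holds only a handful of labels taken from this page.

```python
# Hypothetical lookup table; only a few labels from this page are included.
AUTHORIZED_LABELS = {
    "adjective": "ADJ",
    "common noun": "NOU",
    "full verb": "VER",
    "personal pronoun": "PPR",
}


def label_for(choice: str) -> str:
    """Return the authorized tag for a user-facing choice.

    Raises KeyError for anything outside the table, mirroring the idea
    that only the tags presented here are accepted.
    """
    return AUTHORIZED_LABELS[choice.strip().lower()]


print(label_for("adjective"))
```

A user who picks "adjective" never has to type "ADJ"; the label is only visible in the stored entry.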

General Guidelines

In order to define the tags to be used in the UNDLF Tagset, the following premises were adopted:

  • Tags should be as comprehensive as possible (i.e., they should cover all widely accepted linguistic concepts)
  • Tags should be as few as possible (i.e., they should avoid redundancy)
  • Tags should be as short as possible (i.e., they should fit in a three-character string)
  • Tags should be as mnemonic as possible (i.e., they should be provided through English acronyms or abbreviations)
  • Tags should constitute a taxonomic hierarchy (so that upper level values could be inferred from the lower ones).

Additionally, the following conventions were adopted:

  • Tags are written in upper case letters;
  • Negation is represented by prefixation with "N-" (past = PAS, nonpast = NPAS).
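
The two conventions above can be checked mechanically. The following is a minimal sketch, assuming that a tag is a non-empty run of upper-case letters and digits (digits occur in tags such as 1PS and TST2); the helper names is_well_formed and negate are hypothetical and do not belong to any UNL tool.

```python
import re

# Assumption: a tag consists only of upper-case letters and digits.
TAG_PATTERN = re.compile(r"[A-Z0-9]+")


def is_well_formed(tag: str) -> bool:
    """Check the upper-case convention for a single tag."""
    return TAG_PATTERN.fullmatch(tag) is not None


def negate(tag: str) -> str:
    """Build the negated counterpart of a tag by prefixing "N"."""
    return "N" + tag


print(negate("PAS"))
print(is_well_formed("adj"))
```

The rule only works in one direction: a leading "N" does not always signal negation (NOU is "common noun", NEG is "negative"), so stripping it to recover a positive tag would be unsafe.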

We have tried to stick to the standard abbreviations proposed by the Leipzig Glossing Rules and by David Crystal in A Dictionary of Linguistics and Phonetics (2008), insofar as they comply with the rules above. The resulting set of tags, which is still subject to additions and revisions, is presented below. For the time being, the definitions and examples have been extracted from the Glossary of Linguistic Terms (Loos et al.), available at SIL International. The tags are expected to migrate to an on-line environment, still under construction, where accredited linguists will have the opportunity to enhance and improve this repertoire.

Tree of attributes and values

The hierarchy of tags is depicted in the tree below. The topmost level represents the attributes of which the tags are values. Each lower position implies the levels above it (for instance: progressive is a value of continuative, which is a value of imperfective, which is a value of the attribute aspect). Lower positions are not mandatory, as they can be too specialized ("go" is just a verb, and need not be assigned to any of the subcategories of verb). In any case, natural language phenomena should be classified as deeply as possible in the tagset structure ("un-" should be classified as a prefix, rather than as an affix).
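
The inference of upper-level values from lower ones can be sketched as follows. This is an illustration under stated assumptions, not part of any UNL tool: the excerpt reuses the tense branch of the tree with the wiki link brackets removed, the depth of an entry is taken to be its number of leading asterisks, and parse_outline and ancestors are hypothetical names.

```python
import re

# Excerpt of the tree in its asterisk-depth notation (wiki links removed).
OUTLINE = """\
*tense (TNS)
**absolute tense (ATE)
***future (FUT)
****near future (FUN)
****remote future (FUR)
**relative tense (RTE)
***relative past (RPT)
"""

# Leading asterisks, a name, then the first upper-case tag in parentheses.
LINE = re.compile(r"^(\*+)\s*(.+?)\s*\(([A-Z0-9]+)\)")


def parse_outline(text):
    """Return {tag: parent tag or None} for an asterisk-depth outline."""
    parents = {}
    stack = []  # stack[i] holds the tag most recently seen at depth i + 1
    for raw in text.splitlines():
        match = LINE.match(raw.strip())
        if not match:
            continue  # skip blank lines and anything that is not an entry
        depth, tag = len(match.group(1)), match.group(3)
        del stack[depth - 1:]  # drop entries at this depth or deeper
        parents[tag] = stack[-1] if stack else None
        stack.append(tag)
    return parents


def ancestors(tag, parents):
    """List the upper-level tags implied by a tag, nearest first."""
    chain = []
    while parents.get(tag) is not None:
        tag = parents[tag]
        chain.append(tag)
    return chain


PARENTS = parse_outline(OUTLINE)
print(ancestors("FUN", PARENTS))  # ['FUT', 'ATE', 'TNS']
```

Storing only the deepest applicable tag in an entry is then enough: a word tagged FUN can still be matched by a rule that asks for FUT or for any value of TNS.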

List of tags in alphabetical order

Software