Subcategorization frame
In the UNL framework, a subcategorization frame specifies the number and types of syntactic arguments that co-occur with a lemma in a sentence.
When to use subcategorization frames
Subcategorization frames are mandatory for words that take one or more syntactic arguments, including:
- monovalent verbs ('sleep', 'rain')
- monovalent adverbs ('well', 'very')
- monovalent nouns ('arrival', 'construction')
- divalent verbs ('kill', 'kiss', 'depend')
- divalent adjectives ('loyal', 'interested')
- divalent prepositions and adverbs ('after', 'in', 'near', 'instead')
- trivalent verbs ('give', 'turn')
- trivalent prepositions ('between')
- etc.
When not to use subcategorization frames
Subcategorization frames must not be used for words that take no arguments:
- avalent nouns ('table', 'computer')
- avalent adverbs ('here', 'now')
Arguments and adjuncts
In the UNL framework, the subcategorization frame should be as small as possible and should include only core arguments, as opposed to adjuncts.
Syntax of generation rules
Subcategorization frames should be presented as a list of syntactic roles separated by semicolons. Each syntactic role must have the following format:
<SYNTACTIC ROLE> ":=" "(" <SYNTACTIC FEATURES> ")" [ "," "(" <SYNTACTIC FEATURES> ")" ]* ";"
where
- <SYNTACTIC ROLE> = one of the three pre-defined syntactic roles (see below)
- <SYNTACTIC FEATURES> = the list of features required by the lemma
- [ ] = optional
- " " = constant
- * = to be repeated zero or more times
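As a rough illustration of the format above, the following Python sketch splits a frame string into its role entries. The names `parse_frame` and `split_features` are hypothetical and do not belong to any UNL tool. The sketch assumes that semicolons never occur inside a quoted feature and that parentheses are not nested.

```python
import re

# The three pre-defined syntactic roles.
ROLES = {"SPEC", "COMP", "ADJ"}

# One parenthesized feature list, without nested parentheses.
GROUP = re.compile(r"\(([^()]*)\)")


def split_features(text):
    """Split on commas that are outside {braces}, [brackets] and "quotes"."""
    parts, current = [], []
    depth, in_quote = 0, False
    for ch in text:
        if ch == '"':
            in_quote = not in_quote
        elif not in_quote and ch in "{[":
            depth += 1
        elif not in_quote and ch in "}]":
            depth -= 1
        if ch == "," and depth == 0 and not in_quote:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return [p for p in parts if p]


def parse_frame(frame):
    """Return a list of (role, feature_lists) pairs in source order."""
    entries = []
    for chunk in frame.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue  # skip the empty piece after the final semicolon
        role, sep, body = chunk.partition(":=")
        role = role.strip()
        if not sep or role not in ROLES:
            raise ValueError(f"bad role entry: {chunk!r}")
        groups = GROUP.findall(body)
        if not groups:
            raise ValueError(f"no feature list in: {chunk!r}")
        entries.append((role, [split_features(g) for g in groups]))
    return entries


if __name__ == "__main__":
    for role, feature_lists in parse_frame('SPEC:=(NP,NOM,>NUM,>PER); COMP:=(NP,ACC);'):
        print(role, feature_lists)
```

The result is a list of pairs rather than a dictionary because a role may appear more than once in a frame, as COMP does for ditransitive verbs.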
Syntactic Roles
There are only three different types of syntactic roles:
Tag | Syntactic Role | Description |
---|---|---|
SPEC | specifier (external argument) | subject |
COMP | complement (internal argument) | direct object, indirect object |
ADJ | adjunct | |
Syntactic Features
The syntactic features must indicate:
- the selection for the syntactic category of the arguments (c-selection), if any
  - NP = Noun phrase
  - VP = Verbal phrase
  - JP = Adjective phrase
  - AP = Adverbial phrase
  - PP = Prepositional phrase
  - SP = Sentence
- the syntactic case marking, if any
  - NOM = Nominative
  - ACC = Accusative
  - DAT = Dative
  - ABL = Ablative
  - INS = Instrumental
  - LOC = Locative
- the agreement, if any
  - >NUM = Assigns number
  - <NUM = Receives number
  - >GEN = Assigns gender
  - <GEN = Receives gender
  - >PER = Assigns person
  - <PER = Receives person
- the government, if any
  - the preposition required by the lemma
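The feature inventory above can be sketched as a small lookup that assigns each feature to one of the four groups. This is only an illustration: the function name `classify` and the group labels are hypothetical, and the sketch assumes that a governed preposition is written as a double-quoted string.

```python
# Tags taken from the lists above.
CATEGORIES = {"NP", "VP", "JP", "AP", "PP", "SP"}
CASES = {"NOM", "ACC", "DAT", "ABL", "INS", "LOC"}
AGREEMENT = {"NUM", "GEN", "PER"}


def classify(feature):
    """Return the group a single feature belongs to, or "unknown"."""
    if feature in CATEGORIES:
        return "category"
    if feature in CASES:
        return "case"
    # ">" marks a feature the word assigns, "<" one it receives.
    if feature[:1] in (">", "<") and feature[1:] in AGREEMENT:
        return "agreement"
    # Assumption: the required preposition appears in double quotes.
    if len(feature) >= 2 and feature.startswith('"') and feature.endswith('"'):
        return "government"
    return "unknown"


if __name__ == "__main__":
    for feature in ["PP", "ACC", ">NUM", '"on"']:
        print(feature, "->", classify(feature))
```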
Other symbols
[Square brackets] may be used to indicate optional elements: a[b]c = ac, abc
{braces} may be used to indicate alternative elements: a{b,c}d = abd, acd
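The two abbreviations can be made concrete with a short Python sketch that lists every string a pattern stands for. The function `expand` and its helpers are hypothetical names. The sketch assumes that a comma separates alternatives only directly inside braces and is an ordinary character elsewhere.

```python
def _expect(s, i, ch):
    """Raise if the character at position i is not the expected closer."""
    if i >= len(s) or s[i] != ch:
        raise ValueError(f"expected {ch!r} at position {i}")


def _sequence(s, i, comma_stops):
    """Expand from index i up to a closing bracket (or a comma, if it separates)."""
    stops = "]}," if comma_stops else "]}"
    results = [""]
    while i < len(s) and s[i] not in stops:
        if s[i] == "[":
            # Optional element: either nothing or the bracket content.
            inner, i = _sequence(s, i + 1, False)
            _expect(s, i, "]")
            options = [""] + inner
            i += 1
        elif s[i] == "{":
            # Alternatives: collect each comma-separated branch.
            options = []
            i += 1
            while True:
                inner, i = _sequence(s, i, True)
                options.extend(inner)
                if i < len(s) and s[i] == ",":
                    i += 1
                    continue
                break
            _expect(s, i, "}")
            i += 1
        else:
            options = [s[i]]
            i += 1
        results = [r + o for r in results for o in options]
    return results, i


def expand(pattern):
    """Return every string denoted by the pattern, in left-to-right order."""
    results, pos = _sequence(pattern, 0, False)
    if pos != len(pattern):
        raise ValueError(f"unbalanced bracket at position {pos}")
    return results


if __name__ == "__main__":
    print(expand("a[b]c"))    # ['ac', 'abc']
    print(expand("a{b,c}d"))  # ['abd', 'acd']
```

Applied to a feature list such as `({NP,JP},NOM)`, the same function yields `(NP,NOM)` and `(JP,NOM)`, because the comma outside the braces is kept as a literal.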
Examples
VERBS
- INTRANSITIVE ("sleep")
  - SPEC:=(NP,NOM,>NUM,>PER);
- COPULA ("be")
  - SPEC:=(NP,NOM,>NUM,>PER); COMP:=({NP,JP},NOM,>NUM,>PER);
- DIRECT TRANSITIVE ("kill")
  - SPEC:=(NP,NOM,>NUM,>PER); COMP:=(NP,ACC);
- INDIRECT TRANSITIVE ("depend")
  - SPEC:=(NP,NOM,>NUM,>PER); COMP:=(PP,ACC,"on");
- DITRANSITIVE ("give")
  - SPEC:=(NP,NOM,>NUM,>PER); COMP:=(NP,ACC); COMP:=(PP,DAT,"to");
ADJECTIVES
- LOYAL (TO)
  - COMP:=(PP,"to");
- INTERESTED (IN)
  - COMP:=(PP,"in");
PREPOSITIONS
- NEAR (TO)
  - SPEC:=(NP); COMP:=(PP,"to");
- IN
  - SPEC:=({NP,VP}); COMP:=(NP);