326 lines
13 KiB
Coq
326 lines
13 KiB
Coq
(** * Cochabamba Quechua Plurals
|
||
|
||
Below is a morphological fragment of in Lexical Proof Morphology of an
|
||
overabundant plural pattern found in Cochabamba Quechua.
|
||
|
||
In some dialects of Cochabamba Quechua the borrowed Spanish plural -s
|
||
occurs on all nominal stems that end in a vowel. There is also a
|
||
native plural -kuna, that occurs on nominal stems generally. These
|
||
plural may also co-occur in varying order on the stem. So for a vowel
|
||
stem, like warmi, 'woman', warmi-s, warmi-kuna, warmi-s-kuna,
|
||
warmi-kuna-s are all predicted to occur, though not with equal
|
||
likelihood, necessarily. A stem that ends in a consonant, like sipas,
|
||
'young woman', is predicted to have the plural forms sipaskuna and
|
||
sipaskunas for some speakers. There is also a possible semantic
|
||
constraint on the distribution of the Spanish plural, where the -kuna
|
||
plural is more appropriate for contrastive subjects. Of course, there
|
||
are also other dialectal patterns and speaker variation. The fragment
|
||
below captures one observable pattern. *)
|
||
|
||
Require Import Coq.Lists.List.
|
||
Require Import Coq.Strings.String.
|
||
Import ListNotations.
|
||
|
||
Open Scope type_scope.
|
||
Open Scope string_scope.
|
||
|
||
(** * Initial axioms of the fragment
|
||
|
||
Below I introduce the basic lexemes and categories of the
|
||
morphological fragment.
|
||
|
||
** Lexemes
|
||
|
||
First I define some basic lexemes. Lexemes have no inherent
|
||
properties. They serve a role similar to foriegn keys in a relational
|
||
database. They are a key for retrieving and defining collections of
|
||
information. For instance ( LEXEME_1, x ), ( LEXEME_1, y ) puts both x
|
||
and y in the same collection, where the collection is defined in terms
|
||
of the second elements of tuples where the first element is the same
|
||
lexeme.
|
||
|
||
The lexeme WARMI corresponds to a nominal with a stem ending in a
|
||
vowel. The lexeme SIPAS corresponds to a lexeme ending in a
|
||
consonant. *)
|
||
|
||
Inductive lexeme : Set :=
|
||
| WARMI : lexeme (* woman *)
|
||
| SIPAS : lexeme. (* young woman *)
|
||
|
||
(** ** Morphological categories
|
||
|
||
The type matom is for atoms used in m-category names or mcats,
|
||
which I compose like lists. Mcats are morphological categories, which
|
||
categorize observable analogical patterns in morphological form. For
|
||
instance, though warmis, warmikuna, warmiskuna and warmikunas
|
||
correspond to two syntactic categories, one for basic plural and one
|
||
for a contrastive subject, but there are, at minimum, four
|
||
morphological categories, one for each form. In practice, there are
|
||
more categories to capture the "morphotactics" or the allowable
|
||
patterns of affix coocurence within words of the system and to allow
|
||
for generalizations when mapping between syntactic categories.
|
||
|
||
In fact, what one "sees" in this system are not the categories,
|
||
themselves, which are conceptually atomic but a means of referring to
|
||
them with names. Think of the name John Smith as a category name and
|
||
Smith as analogous to an matom. Just as the last name Smith can belong
|
||
to an entire family or even multiple people in multiple families, each
|
||
person with Smith in their name is an individual and could,
|
||
potentially be named something else. Likewise, having Smith as a last
|
||
name, one of the most frequent last names in the English language,
|
||
does not entail family membership. Despite this, the system of naming
|
||
children using their parents' last names does simplify the naming of
|
||
children. Therefore if it were a rule that every child needed to be
|
||
named as a composition of the last name of a particular parent and
|
||
some individuated name, the rules of naming are greatly simplified,
|
||
even if nothing can be said to be entailed about the properties of
|
||
that individual by having a particular last name. The category naming
|
||
scheme used here, where names are lists of matoms allows for
|
||
flexibility and simplification in naming categories using rules but,
|
||
at least at this time, no rule applies because it has a particular
|
||
sub-element or sub-list, i.e.\ nothing is entailed about the category
|
||
just because a sub-element of its name is one thing or another. This
|
||
is a fundamental difference between this scheme and a feature system,
|
||
for instance. *)
|
||
|
||
Inductive matom : Set :=
|
||
| nbase : matom
|
||
| mplural : matom
|
||
| s : matom
|
||
| kuna : matom
|
||
| kbase : matom
|
||
| sbase : matom
|
||
| ta : matom.
|
||
|
||
(** As stated above, mcat is a list of matoms. *)
|
||
|
||
Definition mcat := list matom.
|
||
|
||
** Mcats for the fragment *)
|
||
|
||
(* [ nbase ] is a basic noun stem. [ kbase ] is a stem that can take
|
||
a -kuna affix. [ sbase ] is a stem that can take a -s affix. [
|
||
kuna ] is any form that contains -kuna. [ mplural ] is any form
|
||
with a plural affix. [ s , nbase ] is a basic stem with only an -s
|
||
affix. [ kuna, nbase ] is a basic stem with only a -kuna affix.
|
||
The other two mcats contain both suffixes in differing order.
|
||
|
||
The kuna forms are hypothesized to have a different distribution than
|
||
the -s only form. They are (possibly) more compatible with focus
|
||
constructions in subject position.
|
||
|
||
A take away for the above is that nearly every form type has its own
|
||
category, at least in this analysis of this phenomenon. Labeling form
|
||
types is not the only role of these categories. Some categories, such
|
||
as [ sbase ] and [ kbase ] are only directly relevant to
|
||
morphotactics. Other categories, such as [ kuna ] are more relevant to
|
||
the interface with the syntactic paradigm. *)
|
||
|
||
(** ** Morphological forms
|
||
|
||
An mform, or morphological form, is a string *)
|
||
|
||
Definition mform := string.
|
||
|
||
(** Below are some string functions to handle suffixation. *)
|
||
|
||
Definition suffix_s (stem:mform) : mform :=
|
||
stem ++ "s".
|
||
|
||
Definition suffix_kuna (stem:mform) : mform :=
|
||
stem ++ "kuna".
|
||
|
||
Definition suffix_ta (stem:mform) : mform :=
|
||
stem ++ "ta".
|
||
|
||
(** ** The structure of morphological paradigm entries
|
||
|
||
An mtrip is a data structure used to define Fentrys, which are
|
||
"form entries" aka morphological paradigm entries. There could
|
||
potentially be many non-Fentrys that have the type of an
|
||
mtrip. Nothing constrains this type to only be reasonable
|
||
entities. One of the purposes of the theory is to specify what
|
||
morphological forms are valid in a particular language. This is
|
||
similar to the formal language notion of defining the characteristic
|
||
function of a subset of all string combinations that correspond to a
|
||
grammar.
|
||
|
||
Here the '*' corresponds to ×, for defining the types of pairs. *)
|
||
|
||
Definition mtrip := mcat * mform * lexeme.
|
||
|
||
(** The check below demonstrates why simply saying that we're working
|
||
with mtrips is not good enough. Non-grammatical triples check. We need
|
||
to be able to specify which are legal triples. *)
|
||
|
||
Check ( [ kuna ] , "foot" , WARMI ).
|
||
|
||
(** Here I define Fentry, which is a predicate that is proveable when
|
||
an mtrip is a valid lexical entity. *)
|
||
|
||
Axiom Fentry : mtrip -> Prop.
|
||
|
||
(** Here I define the mtrip for the "warmi" stem. I am not handling
|
||
the phonological dimension that "sipas" requires, where -s cannot be
|
||
suffixed to a stem that ends in a consonant because I haven't built a
|
||
phonological abstraction into this example, yet. I could specify the
|
||
needed categorical distinction with a predicate. Such a predicate
|
||
would be true for some specified set of mforms and false for others
|
||
but it is more interesting to do it correctly and pattern match on the
|
||
string. *)
|
||
|
||
Definition warmi_nbase := ([ nbase ], "warmi", WARMI) : mtrip.
|
||
|
||
(** This requires an explicit axiom to be treated like an
|
||
Fentry. Other Fentries will be derived from it. *)
|
||
|
||
Axiom f_warmi_nbase : Fentry warmi_nbase.
|
||
|
||
(** * Morpho-Lexical relations
|
||
|
||
lem, for "less than or equal to morphological category" is the
|
||
order over morphological categories, or mcats. What you see below are
|
||
the basic properties of an order. These state the relation is
|
||
reflexive, meaning that x ≤ x; antisymmetric, meaning that if x ≤ y
|
||
and y ≤ x then x = y; transitive, meaning that if x ≤ y and y ≤ z then
|
||
x ≤ z. *)
|
||
|
||
Axiom lem : mcat -> mcat -> Prop.
|
||
Axiom lem_refl : forall x : mcat, lem x x.
|
||
Axiom lem_antisym : forall x y : mcat, lem x y -> lem y x -> x = y.
|
||
Axiom lem_trans : forall x y z : mcat, lem x y -> lem y z -> lem x z.
|
||
|
||
(** The order is not natural. There may be natural orders on the
|
||
underlying list type based on length but this is not how I use
|
||
mcats. As mentioned, though their structure is complex, they are
|
||
*names* of atomic categories.
|
||
|
||
Due to the fact that the order is not natural, each relation must be
|
||
explicitly specified, except those that are predictable from the
|
||
properties of the order. Below I specify those needed for this
|
||
fragment. *)
|
||
|
||
Axiom nbase_lem_kbase : lem [ nbase ] [ kbase ].
|
||
Axiom s_nbase_lem_kbase : lem [ s ; nbase ] [ kbase ].
|
||
Axiom nbase_lem_sbase : lem [ nbase ] [ sbase ].
|
||
Axiom kuna_nbase_lem_sbase : lem [ kuna ; nbase ] [ sbase ].
|
||
Axiom kuna_nbase_lem_kuna : lem [ kuna ; nbase ] [ kuna ].
|
||
Axiom kuna_s_nbase : lem [ kuna ; s ; nbase ] [ kuna ].
|
||
Axiom s_kuna_nbase : lem [ s ; kuna ; nbase ] [ kuna ].
|
||
Axiom kuna_lem_mplural : lem [ kuna ] [ mplural ].
|
||
Axiom s_nbase_lem_mplural : lem [ s ; nbase ] [ mplural ].
|
||
|
||
(** ** Morphotactic rules
|
||
|
||
In using the word "morphotactic", it may imply an imperative form
|
||
building interpretation. Though rules in this system tend to take as
|
||
input simpler forms and output more complex forms, the opposite is
|
||
possible. Defining rules in a form building direction is done for
|
||
practical reasons. Longer more complex forms are often less
|
||
frequent. If one wishes to base the assumed forms of a fragment on
|
||
something that is independently observable and to be consistent about
|
||
which forms are assumed as axioms across lexemes, frequently used
|
||
forms are preferable -- or perhaps something like the principal parts
|
||
of a paradigm. The intended interpretation of these rules is that they
|
||
declare that if it is the case that Fentry x exists, then Fentry y
|
||
exists but x does not underly y in any sense. Nor is it necessarily
|
||
contained as a subpart of y.
|
||
|
||
Below one can see the first morphotactic rule, which I call form-form
|
||
mappings. It specifies that anything within the category of [ sbase ]
|
||
can have its form suffixed with -s. The accessors that look like (fst
|
||
(fst mt)) are because the mtrip triple is actually a pair of a pair
|
||
and a lexeme, the structure is w=((x,y),z) so (fst (fst w)) is x. In
|
||
English the rule says, for any mtrip that is an Fentry, if the mcat is
|
||
less than or equal to [ sbase ] then there is a Fentry such that the
|
||
category is whatever was provided with matom s added to the front, an
|
||
mform with an affixed -s and an unchanged lexeme. *)
|
||
|
||
Axiom add_s : forall
|
||
(mt : mtrip), (Fentry mt) ->
|
||
(lem (fst (fst mt)) [ sbase ]) ->
|
||
Fentry (cons s (fst (fst mt)),
|
||
suffix_s (snd (fst mt)),
|
||
(snd mt)).
|
||
|
||
(** The rule below is very similar except that it is for the -kuna
|
||
containing forms. *)
|
||
|
||
Axiom add_kuna : forall
|
||
(mt : mtrip), (Fentry mt) ->
|
||
(lem (fst (fst mt)) [ kbase ]) ->
|
||
Fentry (cons kuna (fst (fst mt)),
|
||
suffix_kuna (snd (fst mt)),
|
||
(snd mt)).
|
||
|
||
(** The below specifies the morphological category for accusative
|
||
marked plurals. *)
|
||
|
||
Axiom add_plta : forall
|
||
(mt : mtrip), (Fentry mt) ->
|
||
(lem (fst (fst mt)) [ mplural ]) ->
|
||
Fentry ([ s ; ta ],
|
||
suffix_ta (snd (fst mt)),
|
||
(snd mt)).
|
||
|
||
(** The below specifies the morphological category for accusative
|
||
marked singulars. *)
|
||
|
||
Axiom add_ta : forall
|
||
(mt : mtrip), (Fentry mt) ->
|
||
(lem (fst (fst mt)) [ nbase ]) ->
|
||
Fentry ([ ta ],
|
||
suffix_ta (snd (fst mt)),
|
||
(snd mt)).
|
||
|
||
(** * Mock-up of the interface
|
||
|
||
Below I define some types for syntactic categories but I do not supply
|
||
full entries. This is because for the purposes of this simple
|
||
exposition it doesn't make sense to embed the actual types and terms
|
||
needed for Linear Categorial Grammar, which is the syntactic theory
|
||
that I assume.
|
||
|
||
In order to hold somewhat to the spirit of the complete theory, I
|
||
define Sentry which is the predicate for valid lexical entries. *)
|
||
|
||
Inductive syncat : Set :=
|
||
| nom : matom
|
||
| nom_pl : matom
|
||
| nom_foc_pl : matom
|
||
| acc : matom
|
||
| acc_pl : matom.
|
||
|
||
Axiom Sentry : syncat -> Prop.
|
||
|
||
(** ** Interface rules
|
||
|
||
I call these form-sign mappings because the lexical entries are called
|
||
signs in LCG.
|
||
|
||
The first maps uninflected forms to the unmarked nominative. The
|
||
second, assuming that the -kuna forms are not always focal, which is
|
||
consistent with corpora, maps all mplurals to nom_pl. The third maps
|
||
the kuna forms to nom_foc_pl. The last two map -ta marked forms to the
|
||
acc and acc_pl, respectively. *)
|
||
|
||
Axiom nbase_to_nom : forall
|
||
(mt : mtrip), (Fentry mt) ->
|
||
(lem (fst (fst mt)) [ nbase ]) -> Sentry nom.
|
||
|
||
Axiom mplural_to_nom_pl : forall
|
||
(mt : mtrip), (Fentry mt) ->
|
||
(lem (fst (fst mt)) [ mplural ]) -> Sentry nom_pl.
|
||
|
||
Axiom kuna_to_foc : forall
|
||
(mt : mtrip), (Fentry mt) ->
|
||
(lem (fst (fst mt)) [ kuna ]) -> Sentry nom_foc_pl.
|
||
|
||
Axiom ta_to_acc : forall
|
||
(mt : mtrip), (Fentry mt) ->
|
||
(lem (fst (fst mt)) [ ta ]) -> Sentry acc.
|
||
|
||
Axiom s_ta_to_acc_pl : forall
|
||
(mt : mtrip), (Fentry mt) ->
|
||
(lem (fst (fst mt)) [ s ; ta ]) -> Sentry acc_pl.
|