Formal Phonology* David Odden OSU Abstract Two problematic trends have dominated modern phonological theorizing: over-reliance on machinery of ...

It has been thought that evidence for a theory of foot-construction can be found via a forced-choice perceptual experiment to see whether humans have a propensity to organise streams of synthesized tones so that the acoustically prominent member is judged to be last in a binary pairing of beats. As expressed in Hayes (1985), “prominence contrasts based on duration lend themselves to iambic grouping, while prominence contrasts based on intensity lend themselves to trochaic grouping”, a generalization referred to as the “Iambic-Trochaic Law” in Hayes (1987). Under certain experimental conditions, listeners prefer the grouping (X X:) where the underscored beat is judged “strongest” in case one beat is longer, but (X X) in case the beats have equal duration. The prediction of the psychological-test driven theory is that a durationally-asymmetric organization (X: X) is not a possible parsing of beats, since according to the Iambic-Trochaic Law, an initially-prominent beat should not be longer than the following beat.

Suppose that we assume the empirical correctness of the perceptual claim. Suppose furthermore that no language actually had Heavy-Light trochees. Following the inventory-driven nature of certain claims about foot structure (that Universal Grammar provides a list of foot types which excludes the parse (HL)), the linguistic question at stake would be whether (HL) is indeed a computationally-possible representation. If the assumptions embodied in the Iambic-Trochaic Law are correct, putative non-existence of Heavy-Light trochees is logically explicable on two bases. One, the basis that FP insists on, is that Heavy-Light trochees are computationally possible, though functionally unattestable or rare. The other, which typifies the substance-driven approach to grammar, holds that the theory of grammar contains a restatement of the Iambic-Trochaic Law, perhaps in the form of a list of allowed foot types.

If the results of psychological testing are to be relevant for deciding whether (HL) is a computationally-possible representation, then the underlying mechanism causing the behavior of experimental subjects must be a principle of grammar. However, it is patently obvious that the behavioral patterns said to support the law are not the result of a linguistic principle at all, and are most relevant to music, since the stimuli are synthetically manipulated tones with no significant resemblance to speech. The behavior is caused by something external to language, perhaps reflecting a strategy for reacting to the requirement to find a “strong” beat when there is no independently perceivable rhythmic structure to the beats. When a (supposed) fact has an extragrammatical cause, that fact does not constitute evidence for adding a grammatical principle – the evidence may even show that grammatical theory should say nothing about the matter.

There are reasons to doubt the claim. The literature cited by Hayes indicates that sequences with even duration and uneven amplitude tend to get a trochaic parse. Rice (1992) demonstrates the same preference for beats of even duration and amplitude – amplitude turns out to be irrelevant. Rice also demonstrates that higher pitch with equal duration tends to receive an iambic parse. Thus duration-difference is not the trigger for the parsing difference; more generalized “prominence” is. In pilot experiments, I have found that inversely correlating pitch and duration, where sequences of beats have the shape “long-low + short-high”, also yields a trochaic parsing judgment, again suggesting that inequality of prominence (not duration) is the triggering factor for iambic parsing. In sequences of “long-low + short-high”, both beats have some prominence.

Since linguistic stress usually correlates with higher pitch, stressed syllables are, on the surface, prominent. It follows, then, that (X X:) i.e. classical iambic length-distribution but foot-initial stress would also be a “natural” type of trochee – linguistically speaking, it seems to be a non-existent pattern. Finally, pilot experimental evidence indicates that the results depend crucially on a particular experimental setup, where beats are evenly separated and amplitude is tapered so that listeners have no idea how the sequence begins or ends. When the setup is changed so that listeners know whether the sequence starts (short-long) or (long-short) – as is the uniform case in natural language – then English speakers, at least, identify the long beat as being “strong”, and correctly place the strong beat group-initial or group-final, depending on how the sequence begins.

See Rice (1992), Mellander (2003) for languages with (HL) feet.

Experimental findings might in principle constitute evidence for a linguistic theory, if experimental evidence convincingly demonstrates a general fact about the nature of human cognition which directly implies a choice between linguistic theories. If a hypothesized grammatical principle fundamentally contradicts a basic fact about human cognition, the grammatical principle cannot be correct since grammar is an aspect of cognition. If a proposed grammatical principle is simply different from what has been seen in other areas of cognitive science, then we may have discovered something interesting about language.

Here is an imaginable scenario of that type. The previous section has discussed competing theories of the nature of feature variables, Identical-Value where values and features are inextricably linked, and Value-Variable theory where value can be factored out and applied to a different feature. The applicable psychological question is whether the mind actually abstracts value from attribute, and can graft one value onto another attribute. It is obvious that humans can sensibly compare the weight of two objects, or their colors or temperatures – we can compare the value of one given attribute between two entities. It is not sensible to say, except jocularly, that “This stone is as heavy as that book is blue”. Experimental psychological evidence might imaginably establish that the mind does not treat values as a floating abstraction detached from an attribute. If such a result regarding nonlinguistic cognition could be established, then the results of psychological testing could in principle show that a hypothesized linguistic concept contradicts what is known about the mind, giving evidence for Identical-Value over Value-Variable Theory.

Arguments based on properties of the mind have to be treated cautiously, as indicating a potentially fruitful source of extralinguistic evidence about cognitive foundations, if the foundations can be firmly established. Those theoretical foundations are not yet firmly established, so arguments based on properties of cognition may be suggestive, but not probative.

Certain experiments might provide evidence about grammar, namely those which directly call on grammar. The classical example is the wug-test, where subjects are manipulated to create a certain linguistic input, and then an output form is elicited. That output tests some theory about an aspect of the grammar. Thus when an English-speaking subject is presented with an object named [wʌg] and asked (indirectly) for the plural, the form [wʌgz] is the response, and likewise [lʌp] should be found to have the plural [lʌps]. The wug-test indirectly taps into the grammatical system, by giving the subject an opportunity to combine a conjectured form (/wʌg, lʌp/) with a highly probable strategy for forming plurals (affix /-z/). The underlying forms are virtual certainties: in English, [wʌg, lʌp] could only derive from /wʌg, lʌp/ (though in some languages, an output [bunt] might come from /bunt/ or from /bund/, so that producing [bunt] does not provide the subject with enough information to uniquely select the underlying form), and there are only a few lexical alternatives to the plural affix /-z/, exemplified by mice, sheep, children.
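The computation that the wug-test probes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not an analysis drawn from the text: the segment classes are hypothetical and cover just the examples above, the English suffix is taken to be underlying /-z/ that devoices after a voiceless segment, stems ending in sibilants are deliberately ignored, and the [bunt] case is modeled with an assumed word-final devoicing rule.

```python
# Hypothetical segment classes, restricted to what the examples need.
VOICELESS = set("ptkfθsʃ")
DEVOICE = {"b": "p", "d": "t", "g": "k", "z": "s", "v": "f"}
VOICE = {voiceless: voiced for voiced, voiceless in DEVOICE.items()}


def english_plural(stem):
    """Add underlying /-z/, devoiced to [s] after a voiceless segment.

    Assumption: sibilant-final stems are not handled in this sketch.
    """
    suffix = "s" if stem[-1] in VOICELESS else "z"
    return stem + suffix


def final_devoicing(form):
    """Assumed rule for the [bunt] case: devoice a word-final obstruent."""
    return form[:-1] + DEVOICE.get(form[-1], form[-1])


def candidate_underlying_forms(surface):
    """Return every underlying form that final devoicing maps to `surface`."""
    candidates = {surface}
    if surface[-1] in VOICE:
        # A final voiceless obstruent may also reflect a voiced one.
        candidates.add(surface[:-1] + VOICE[surface[-1]])
    return {c for c in candidates if final_devoicing(c) == surface}


if __name__ == "__main__":
    print(english_plural("wʌg"), english_plural("lʌp"))
    print(sorted(candidate_underlying_forms("bunt")))
```

In the English case a single underlying form feeds the suffixation step, whereas in the final-devoicing case the surface form is compatible with two underlying forms, which is the ambiguity that would leave the subject without enough information.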

Results from such tests must be used cautiously. An unpredictable output may reflect a fact about the grammar, or simply a problem with the subject’s ability to cope with a counterfactual research method (for instance, stipulating that there is such a bird-like thing with that English name). Wug tests carry the added burden that the subject must effortlessly adopt a new underlying form, and must actually apply the phonological rules of the language unconsciously (rather than semi-consciously computing a response based on their memory of spelling and grammar rules from elementary school). In normal language use, we rightly assume that speakers are unconsciously calling on their internalized grammar to generate and interpret utterances. In an experimental setup where subjects are being quizzed on their ability to form plurals of non-words, we cannot assume that production is unalloyed by subjects’ strategies for not looking like they don’t know how to spell or talk right; the experimental setup therefore needs to be subtle.

A relevant experiment would have to test whether such an ability varies between humans, and requires more than ordinary inductive reasoning to acquire. The ability to automatically acquire language by observation of one’s surroundings is uniform in humans, whereas the ability to construct mathematical and scientific theories is a special talent possessed by a small fraction of the population.

Interpreting wug-data is similar to interpreting elicited regular-language data, which field workers do all the time. It is well-known to field workers that individuals vary in their ability to generate forms in response to a stimulus, and it may take some practice at performing the task for a speaker to actually understand what the scientist is looking for. In a field-work context which lasts for months or years, these research start-up effects have negligible impact on the resulting data. In the context of half-hour long psychological tests, start-up effects will be quite substantial, and will always cast doubt on the claim that the test data reflect a fact about the language, rather than an effect of the experimental setup. Just as fieldworkers know that speaker productions have to be evaluated critically in terms of the question whether aspects of production result from competence versus performance, “laboratory phonologists” also need to evaluate speech behavioral evidence critically, and not assume that grammar and behavior are the same thing. Grammar underlies behavior, but is not the sole contributing factor.

Another kind of potentially valid grammar-external data comes from language games, a.k.a. ludlings (see Bagemihl (1988, 1995), Vaux (2011) for phonological overviews). Such games have, in the past, revealed a number of interesting facts about phonological structure by validating abstract underlying representations, the existence of certain phonological rules, or supporting a representational claim regarding prosody versus segments. The characteristic operation defining the game is, apparently universally, a transformation of a linguistic form that resembles word-formation processes (movement, infixation) but one which is never employed in that form in ordinary language (infixation after every syllable; random transposition of segments; long-distance segment movement à la Pig Latin).

It is not clear whether the fundamental operation defining the language game is within the domain that phonology is responsible for, in part because it isn’t even clear what the proper analysis of morphological metathesis, infixation and reduplication is. The fact that language games often involve insertion of a CV sequence everywhere does not per se mean that the phenomena are beyond the reach of phonological grammars, since phonological grammars need to account for the insertion of CV sequences in specific locations: the peculiarity of language-game formation seems to reside in the extent to which the operation takes place, not in what kind of operation takes place.

The most uncontroversial valid evidence from language games lies in how a game-transformation interacts with the grammar, thus it is important to distinguish the mechanism of the change from the consequences of such a change. Al-Mozainy (1981) documents a language-game in Bedouin Hijazi Arabic where root consonants are freely transposed (thus /dfʕ/ → [fdʕ], [ʕdf], [fʕd] etc). The transformation itself is not evidence that phonology includes random segment moving as an operation. The relevance of the language-game facts lies in how that transformation interacts with independently-motivated aspects of the grammar. For example, regular-language /ð̣aṛab/ surfaces as [ð̣aṛab] ‘he hit’, where a height-dissimilation rule of the language does not affect initial /a/ because the intervening consonant is /ṛ/, which regularly blocks raising. The language game’s transposition can alter the intervening consonant, and in the language game, the form appears as [ṛibað̣], [bɨð̣aṛ], [ṛɨð̣ab], [ð̣ibaṛ] and [baṛað̣], exactly as predicted by applying the independently motivated rules of the phonology to the output of the language-game transformation. Similar evidence from a Tigrinya ludling presented in Bagemihl (1988) provides confirming evidence for the underlying form of the root-final consonant and for the reality of the postvocalic velar spirantization process (which does not affect geminates). In the regular language, /sanduk’-ka/ becomes [sandukka] ‘your m.s. box’ via a laryngeal-assimilation rule which creates geminates from /k’+k/, which blocks spirantization. The ludling inserts /gV/, resulting in [saganɨgɨdugux’ɨgɨkkaga], independently showing the reality of underlying /k’/ and the spirantization rule. The value of such evidence is that a simple operation feeds into the system of phonological computations in a revealing way.
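The division of labor between the game-transformation and the grammar, as described for the Bedouin Hijazi Arabic game, can be illustrated with a small Python sketch. It rests on assumptions: the transposition is modeled as free reordering of the root consonants, and the `phonology` argument is a hypothetical placeholder for the independently motivated rules of the language, which are not implemented here (the default merely concatenates the consonants).

```python
from itertools import permutations


def transpose_root(root):
    """Return every reordering of the root consonants except the original."""
    root = tuple(root)
    return [p for p in permutations(root) if p != root]


def game_outputs(root, phonology=lambda consonants: "".join(consonants)):
    """Apply the (placeholder) phonology to each transposed root.

    The game supplies only the reordering; whatever is passed in as
    `phonology` stands in for the regular grammar acting on its output.
    """
    return [phonology(p) for p in transpose_root(root)]


if __name__ == "__main__":
    # Root consonants of /dfʕ/, given as separate segments.
    print(game_outputs(("d", "f", "ʕ")))
```

A three-consonant root has five reorderings other than the original, which matches the number of language-game forms cited for /ð̣aṛab/. The design point is that the transposition function knows nothing about raising or its blocking; any such effects would have to come from the function supplied as `phonology`.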

To take reduplication as the best-known example, there are numerous mutually incompatible theories of what object is concatenated with a stem, to trigger phonological copying, and how it is concatenated. The classical templatic approach posits that reduplicants are partially-defined phonological strings such as “σ”, “F”, “CV”, whereas the OT approach posits a single diacritic entity “RED” whose shape is governed by constraints.

Independent of language games, synchronic metathesis demonstrates that segments move.

This is not to imply uncritical acceptance of Bagemihl’s analysis of the Tigrinya ludlings or his theory of language games. The point is solely to indicate potentially useful evidence from language games.

