2.1.1, Excursus: other theories

In the last section, I explained stuttering as caused by invalid error signals in the monitoring system, resulting in interruptions of speech flow. A similar explanation was proposed by Maraist and Hutton (1957): In stuttering, a person “misevaluates his own speech output at some point in the control system and finds error where, in reality, no error exists” (cf. Bloodstein & Bernstein Ratner, 2008, p. 296). However, Maraist and Hutton believed that stuttering was caused by the attempt to repair those nonexistent errors. By contrast, I think the blockage of speech flow because of an invalid error signal is an automatic brain response independent of the speaker’s will. The speaker does not attempt to repair an error; instead, he or she attempts to continue talking, and precisely this causes the observable stuttering symptoms.

The explanation that stuttering occurs because of invalid error signals also has some similarity with the Covert Repair Hypothesis (Postma & Kolk, 1993); however, its authors believed that the error signals were valid: A real error in speech planning is detected in pre-articulatory monitoring and is covertly repaired by the Formulator (the theory is based on Levelt’s model of speech processing). Those covert repairs, of which the speaker is unaware, allegedly cause the stutter. By contrast, I do not assume any error of speech planning in stuttering. The monitoring system reacts in the same way as to a speech error because it only compares two sequence structures, without being able to distinguish between a mismatch due to a speech error and a mismatch due to a feedback disruption.
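The point about the monitor can be illustrated with a minimal Python sketch. It is not taken from any of the cited models; the function name `monitor_mismatch`, the phoneme lists, and the representation of a feedback disruption as missing segments are assumptions made only for illustration.

```python
def monitor_mismatch(expected, perceived):
    """Return True if the perceived sequence differs from the expected one.

    The monitor only compares two sequences; it has no access to the
    reason for a difference.
    """
    return list(expected) != list(perceived)


expected = ["k", "ae", "t"]  # planned sound sequence for 'cat'

# Case 1: a real speech error, the wrong final sound was produced.
speech_error = ["k", "ae", "f"]

# Case 2: speech was correct, but part of the feedback did not reach
# the monitor (hypothetical representation of a feedback disruption).
disrupted_feedback = ["k", "ae"]

signal_error = monitor_mismatch(expected, speech_error)
signal_disruption = monitor_mismatch(expected, disrupted_feedback)
signal_ok = monitor_mismatch(expected, ["k", "ae", "t"])
```

Under these assumptions, both the real error and the disrupted feedback yield the same mismatch signal, and only intact feedback of correct speech yields none.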

Vasic and Wijnen (2001; 2005) proposed a variation of the Covert Repair Hypothesis – in my view, it is also a variation of Maraist and Hutton’s (1957) approach: They assume that normal speech disfluencies are misevaluated as errors by an oversensitive monitoring system, and the speaker’s attempt to avoid those normal disfluencies causes stuttering. I discuss this hypothesis more extensively in Section 2.3.

Hickok, Houde, and Rong (2011) assume that stuttering is caused by invalid error signals. These invalid error signals, the authors propose, result from inaccurate, “noisy” mapping between the internal model of the vocal tract and the sensory system. “This results in a sensory-to-motor ‘error’ correction signal, which itself is noisy and inaccurate. In this way, the system ends up in an inaccurate, iterative predict-correct loop that results in stuttering.” (p. 13). This theory has some similarity with the Covert Repair Hypothesis as it does not assume a mismatch between sensory prediction and external sensory (i.e., auditory) feedback, but between sensory prediction and an internal target representation. That is, they assume an invalid error signal in the internal feedback loop.

Invalid error signals as the cause of stuttering were also proposed by Tian and Poeppel (2012). They assume “that one of the neural mechanisms causing stuttering is a deficit in the motor-to-sensory transformation. That is, the noisy perceptual estimation is mismatched to the external feedback. Such a discrepancy would signal an incorrect error message, and the feedback control system would interpret such an apparent error as the requirement to correct motor action. Hence, unnecessary attempts would be performed to modify the correct articulation, resulting in repetitive/prolonged sound or silent pauses/blocks.” (page 7 in the PDF version). A similar explanation had already been proposed by Max et al. (2004) on the basis of the DIVA model. However, a theory of inaccurate predictions amounts to the claim that stutterers do not know exactly what their words should sound like.

Additionally, we should consider that the theories last mentioned are based on computer modeling, but there is a difference between a computer and a brain: A “noisy” computer process may indeed produce incorrect predictions – although I think that a computer does not miscount; thus a specific program is needed, e.g., for the simulation of an unstable mapping between the internal model of the vocal tract and the sensory system, which enables the computer to randomly produce ‘wrong’ results. But assume, for example, that the correct prediction of a phoneme sequence is /cat/, while the computer model predicts /caf/. In this case, the monitor in the computer model detects a mismatch when the correct sequence /cat/ is fed back, and emits an invalid error signal. But what happens if a human has no clear notion of what the word ‘cat’ should sound like, so that his brain generates only a vague or cloudy expectation of its sound sequence? The person will possibly fail to detect an error if the word is mispronounced. But the person, if pronouncing the word correctly, will hardly produce a specific incorrect prediction like /caf/ and take the correct feedback /cat/ for an error (see the note on computer simulation below).
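The contrast drawn here can be made concrete with a small Python sketch. It is only an illustration under stated assumptions, not a reimplementation of any of the models cited: the names `specific_monitor` and `vague_monitor`, and the representation of a vague expectation as a set of acceptable sounds per position, are hypothetical.

```python
def specific_monitor(prediction, feedback):
    """Signal an error when feedback differs from one specific prediction."""
    return list(prediction) != list(feedback)


def vague_monitor(prediction_sets, feedback):
    """Signal an error only if a sound falls outside the vague expectation.

    prediction_sets holds, for each position, the set of sounds the
    speaker would accept there (an assumed way to model vagueness).
    """
    if len(prediction_sets) != len(feedback):
        return True
    return any(sound not in allowed
               for allowed, sound in zip(prediction_sets, feedback))


correct = ["k", "ae", "t"]
mispronounced = ["k", "ae", "f"]

# Computer model: wrong but specific prediction /caf/.
invalid_error = specific_monitor(["k", "ae", "f"], correct)

# Human with a vague notion of the word: the final sound is uncertain.
vague = [{"k"}, {"ae"}, {"t", "f", "p"}]
error_on_correct = vague_monitor(vague, correct)
error_on_mispronounced = vague_monitor(vague, mispronounced)
```

Under these assumptions, the specific wrong prediction flags correct speech as an error, whereas the vague expectation flags nothing, not even the mispronunciation.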

In brief: Incorrect predictions should impair the self-monitoring of speech in general and hamper the detection of real speech errors, instead of eliciting invalid error signals. Additionally, a theory of incorrect predictions cannot answer the question: Why does stuttering not occur when a young child begins to speak, in the babbling period or in the period of one-word utterances, i.e., at a time at which the mapping between the internal model of the vocal tract and the sensory system may, in fact, still be unstable? As is well known, stuttering typically begins later, namely at the time when children start forming sentences.

A further objection: Every theory that explains stuttering as a brain response to auditory feedback must additionally explain how stuttering is caused at the onset of an utterance, a pattern typical of early childhood stuttering. At the onset of an utterance, auditory feedback cannot yet have influenced speech control. The theories proposed by Hickok, Houde, and Rong (2011), Max et al. (2004), and Tian and Poeppel (2012) do not provide an answer to this question. Finally, it might be difficult to prove the thesis of an unstable or “noisy” mapping between the internal model of the vocal tract and the sensory system.

In order to provide evidence for incorrect auditory predictions in stutterers, Daliri and Max (2015a, 2015b) recorded stuttering and non-stuttering adults’ auditory evoked potentials in response to probe tones that were presented while the participants were anticipating either speaking aloud or hearing their own speech played back and, as control conditions, while they were silently reading or looking at nonlinguistic symbols. The N1 amplitude of the controls, but not of the stutterers, was reduced prior to both speaking and hearing, compared to the control conditions. The authors conclude from their findings that stutterers may have general auditory prediction difficulties. However, this may simply mean that their sensory system does not expect an auditory input when they start speaking, that is, attention is not allocated in the appropriate manner for the processing of auditory feedback (read more in this blog post about pre-speech auditory modulation, concerning the new theory proposed by Max and Daliri, 2019).

A further feedback theory of stuttering was proposed by Chang and Zhu (2013). Referring to findings of structural deficits in white matter tracts interconnecting frontal motor and posterior auditory areas in the left brain hemisphere of stutterers (see Section 4.1), they wrote: “Insufficient white matter integrity between these regions may lead to subtle inefficiencies in one’s ability to match the auditory target associated with one’s own motor execution (articulation) to actual auditory feedback. If a mismatch occurs between the intended (predicted) auditory target of the speech produced and the actual auditory feedback, the auditory cortex sends corrective signals to the motor system to modify the motor programme for subsequent articulations (…).” (page 14 in the PDF version). They do not say it explicitly, but Chang and Zhu seem to mean (as I understand them) that such a mismatch, if it occurs because of insufficient integrity of the fibers (and not because of a real articulatory error), can result in an invalid correction signal to the motor system and in a wrong modification of the motor program for subsequent articulation, leading to disfluencies.

These assumptions are close to my theory. What is the difference? First, Chang and Zhu assume stuttering to result from a disruption of the feedback-based online (‘within the flow’) control of speech – so they refer to Cai et al. (2012), who examined the online control of speech based on auditory feedback (auditory-motor integration). By contrast, I assume that stuttering is a disorder of an ‘offline correction’ mechanism: Stuttering results from an invalid error signal that, if it were valid, would lead to an interruption of speech flow in order to enable an error repair, that is, it would lead to a correction ‘without the flow’. Second, I do not assume that the white matter tracts interconnecting frontal motor and posterior auditory areas are unable to work well (read more in Section 4.1), and Chang, Zhu, Choo, and Angstadt (2015) also write: “It is well known that regardless of stuttering severity, most people who stutter can, from time to time, speak completely fluently, suggesting that fundamental auditory-motor integration for speech production is present and functional.” (page 14 in the PDF). If stuttering were the result of insufficient white matter integrity, it should be a less variable disorder, less influenceable by speech situations and linguistic factors, and symptoms should be distributed more randomly over speech.

Two hypotheses must be mentioned that are similar to my theory insofar as they consider stuttering a disorder of the sequencing of motor programs: Civier, Bullock, Max, and Guenther (2013) tested two hypotheses by means of the GODIVA computer model: They investigated whether atypical white-matter integrity or elevated dopamine levels may lead to speech dysfluencies due to their effects on a syllable-sequencing circuit that consists of basal ganglia, thalamus, and left ventral premotor cortex. In other words, they tested two disfluency mechanisms: “a failure to cancel the activation of the previous syllable, and a failure to bias cortical competition in favor of the next syllable”. Simulation results supported both hypotheses: both scenarios resulted in a delayed start of the next syllable’s motor program.

My points of criticism are: (1) Sensory feedback played no role in this computer simulation – both hypothesized mechanisms impaired only the feedforward control of speech. Therefore, they can hardly account for the impact of altered auditory feedback and auditory masking on stuttering. (2) A delayed activation of a motor program (which was the result of both computer simulations) does not match the stutterer’s experience of being blocked. Furthermore, in the case of repetitions and prolongations, the motor program of the syllable has (!) started, and the initial phoneme or phonemes have been produced, and only then does the program falter, either starting again and again or being stuck at a sound that can be prolonged. (3) Both hypotheses describe disturbances in the left premotor-motor network, including basal ganglia and thalamus – disturbances that should impair not only speaking but also other kinds of well-learned sequential behavior, e.g., writing, typing, or playing a musical instrument – which is usually not the case with people who stutter. (4) In both scenarios, stuttering is directly caused by a physiological deficiency (atypical white-matter integrity or elevated dopamine level), which can hardly account for the variability of the disorder and for the influence of psychological factors (situation, environment, anticipation of stuttering) and linguistic factors (sentence position, length, and information load of words). Overall, the explanatory power of the two hypotheses may be limited – compare the questions a causal theory of stuttering should answer.

The above theories show that several researchers thought or think in the same direction as I do, namely that stuttering is not a simple breakdown of speech control, but a response of the control system to error signals that are in some way related to auditory feedback. Future research and theoretical modeling must reveal which of the theories is the best one: which one agrees with the empirical findings and has the greatest explanatory power with regard to the many well-known features of the disorder.

A different but very influential theory has been proposed by Alm (2004, 2006), namely that “the basal ganglia-thalamocortical motor circuits through the putamen are likely to play a key role in stuttering. The core dysfunction in stuttering is suggested to be impaired ability of the basal ganglia to produce timing cues for the initiation of the next motor segment in speech.” (Alm, 2004, p. 325).

He has further explained stuttering in the framework of the dual premotor model (Goldberg, 1985): As a voluntary, internally cued behavior, speaking should be controlled mainly by the ‘medial premotor system’, i.e., by the SMA-basal ganglia circuit. Because of the impaired basal ganglia function, the ‘lateral premotor system’, i.e., the cerebellar circuit, tries to compensate for the deficit and becomes dominant in speech control. The lateral premotor system, however, controls motor behavior depending on sensory input (i.e., not on the basis of internal cues, as the basal ganglia do), and consequently, speech becomes dependent on (external cues from) auditory feedback (Alm, 2006).

I do not assume that the medial premotor system is impaired in stuttering, but that the lateral premotor system is impaired because it does not get the needed sensory input. The result is interruptions of speech flow, which the speaker automatically tries to overcome by an act of will – and here, the medial premotor circuit with the basal ganglia comes into play (see below). A dysfunction in the lateral premotor system had already been inferred by Watson and Freeman (1997) from the results of brain imaging and behavioral studies (see the note on Watson and Freeman below).

The basal ganglia may play a double role in developmental stuttering. First, they contribute to the predisposition for stuttering: There might be a relationship between a high dopamine level in the basal ganglia and heightened activity of the voluntary motor system. Children in general have a peak in the number of D2 dopamine receptors in the striatum at the time when stuttering typically begins (Alm, 2004, Fig. 2); this is an age at which they also show a strong bias towards motor activity. Further, voluntary motor behavior is always associated with selective (goal-directed top-down) attention; thus a dominance of the medial premotor system might be associated with an imbalance in attention allocation. A role of the basal ganglia in the predisposition for stuttering is suggested by the results of Metzger et al. (2018), who found a correlation between substantia nigra activation and stuttering severity in a non-speech motor task, and by altered functional connectivity between basal ganglia and cortical regions (Lu et al., 2009, 2010; Chang and Zhu, 2013).

Second, basal ganglia activity may determine the severity of stuttering and (in part) the kind of overt symptoms: After a speech motor program has been blocked because of an invalid error signal from the cerebellum, the speaker automatically tries to overcome the blockage and to continue talking. This behavior is driven by the SMA-basal ganglia-motor circuit, as it depends on the speaker’s will. In stuttering modification therapy, the patient learns to suppress his ‘speech drive’ when feeling an internal blockage and, in this way, to reduce or avoid overt stuttering symptoms. Dopamine blockers like haloperidol seem to operate in a similar way: “The drug seems to exert its main effect on the severity of stuttering behavior and not so much on the frequency of stuttering” (Alm, 2004, p. 337). Thus, the basal ganglia may be responsible for overt stuttering behaviors and their severity without being the underlying trigger (see the note on Parkinson’s disease and neurogenic stuttering below).


Computer simulation of brain processes

The crucial point is: humans can actually have imprecise knowledge, imprecise thinking, vague ideas and expectations – computers can only simulate this. The results of such a simulation, however, are not vague, but are always specific, even if they are wrong. Such wrong but specific predictions are the basis of incorrect error signals in the theories mentioned above. By contrast, humans will hardly produce specific wrong predictions of what a word of their native tongue should sound like.

Parkinson’s disease, neurogenic stuttering

Stuttering has often been compared with Parkinson’s disease. I think that is wrong: Parkinson’s results from weak basal ganglia function, mostly in old people, whereas developmental stuttering is rather related to strong or somewhat too strong basal ganglia function in young children and hyperactive people. Haloperidol reduces basal ganglia activity as well as stuttering, but as a side effect it can evoke not only languidness but also Parkinson’s-like symptoms (tremor and bradykinesia) (Kurz et al., 1995). On the other hand, Juste and Andrade (2017), who investigated disfluent speech in patients with Parkinson’s, conclude that the change in their speech cannot be considered a stuttering disorder, because the percentage of stuttering-like disfluencies did not reach 3% on average – a parameter internationally used for the diagnosis of stuttering.

Neurogenic stuttering after a lesion in the basal ganglia seems to be a different case. Here, stuttering seems to result only from incorrect, repetitive internal cues for starts of speech motor programs, but without concurrent inhibition of these programs by the cerebellum (as assumed in developmental stuttering). That is why tension and struggle behavior are typically absent in neurogenic stuttering (Lundgren, Helm-Estabrooks, & Klein, 2010). Interestingly, neurogenic stuttering behavior is constant across speech and speech tasks (Lundgren, Helm-Estabrooks, & Klein, 2010), that is, it lacks all the features that I ascribe to the fact that the speaker’s attention is drawn away from the auditory channel at certain sentence positions, words, or in certain communication situations (see Sections 2.2 and 2.5). An audio example of acquired neurogenic stuttering is available here.

Watson and Freeman (1997)

Watson and Freeman (1997) discussed studies of regional cerebral blood flow (rCBF) and studies of speech motor performance in which acoustic laryngeal reaction time (LRT) was measured as a function of the complexity of the required response, e.g., a word or sentence. They wrote:

“The rationale for the LRT studies arose, in part, from Goldberg’s (1985) discussion of medial and lateral premotor systems. The medial system is hypothetically related to spontaneous, propositional speech and has connections to cingulate cortex. The lateral system is hypothetically related to nonpropositional, repetitive speech or speech guided by auditory self monitoring. This system functions in a responsive mode to external stimuli and has connections to auditory association areas in temporal cortex classically related to speech processing (Kent, 1984; Penfield & Roberts, 1976). The stimulus-dependent nature of the LRT task and manipulation of the linguistic complexity of the response should preferentially involve the lateral premotor system in Goldberg’s model. […] Relations between resting rCBF anomalies and speech motor performance deficits are consistent with predictions of Goldberg’s (1985) model regarding defects in the lateral premotor system” (p. 343f)
