In the last section, I explained stuttering as caused by invalid error signals in the monitoring system, resulting in interruptions of speech flow. A similar explanation was proposed by Maraist and Hutton (1957): In stuttering, a person “misevaluates his own speech output at some point in the control system and finds error where, in reality, no error exists” (cited in Bloodstein & Bernstein Ratner, 2008, p. 296). However, Maraist and Hutton believed that stuttering was caused by the attempt to repair those nonexistent errors. By contrast, I think the blockage of speech flow after an invalid error signal is an automatic response, in accordance with Levelt’s Main Interruption Rule. The speaker does not attempt to repair an error; instead, he or she attempts to continue talking, and it is precisely this attempt that causes the observable stuttering symptoms.
The explanation that stuttering occurs because of an invalid error signal also has some similarity with the Covert Repair Hypothesis (Postma & Kolk, 1993); however, these authors believed that the error signal was valid: A real error in speech planning is detected by pre-articulatory monitoring and is covertly repaired by the Formulator (the theory depends on Levelt’s model of speech processing). Those covert repairs, of which the speaker is unaware, cause the stutter. By contrast, I do not assume any error of speech planning in stuttering. The only reason why the monitoring system reacts in the same way as to a speech error is that it merely compares two sequence structures, without being able to distinguish between a mismatch due to a speech error and a mismatch due to a feedback disruption.
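The point that a pure comparison cannot tell the two kinds of mismatch apart can be illustrated with a minimal sketch. It is only an illustration under my own assumptions, not an implementation of Levelt’s model or of any published simulation: the function name `monitor`, the phoneme labels, and the use of `None` for a missing stretch of feedback are all hypothetical.

```python
from typing import Optional, Sequence

def monitor(expected: Sequence[str], perceived: Sequence[Optional[str]]) -> bool:
    """Return True (error signal) if the two sequences differ anywhere.

    The monitor only compares; it has no information about WHY they differ.
    """
    if len(expected) != len(perceived):
        return True
    return any(e != p for e, p in zip(expected, perceived))

# Hypothetical phoneme sequences for the word 'cat'.
expected = ["k", "ae", "t"]
correct_feedback = ["k", "ae", "t"]      # correct speech, intact feedback
speech_error = ["k", "ae", "f"]          # a real error in production
disrupted_feedback = ["k", None, "t"]    # correct speech, gap in the feedback

print(monitor(expected, correct_feedback))    # False
print(monitor(expected, speech_error))        # True
print(monitor(expected, disrupted_feedback))  # True
```

In this sketch, the real speech error and the disrupted feedback produce the same output, although the speech itself was correct in the second case.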
Vasic and Wijnen (2001, 2005) proposed a variation of the Covert Repair Hypothesis, although in my view it is rather a variation of Maraist and Hutton’s (1957) approach: They assumed that normal speech disfluencies are misevaluated as errors, and that the speaker’s attempt to avoid them causes stuttering. I discuss this hypothesis more extensively in Section 2.3.
Likewise, Hickok, Houde, and Rong (2011) proposed a theory not unlike the hypothesis by Maraist and Hutton (1957): They assume that invalid error signals resulting from a misinterpretation of the feedback cause stuttering. The invalid error signals, that is, mismatches between expectations and perceptions, result from incorrect predictions due to an unstable or “noisy” mapping between the internal model of the vocal tract and the sensory system. “This results in a sensory-to-motor ‘error’ correction signal, which itself is noisy and inaccurate. In this way, the system ends up in an inaccurate, iterative predict-correct loop that results in stuttering.” (p. 13).
The same theory was held by Tian and Poeppel (2012). They assume “that one of the neural mechanisms causing stuttering is a deficit in the motor-to-sensory transformation. That is, the noisy perceptual estimation is mismatched to the external feedback. Such a discrepancy would signal an incorrect error message, and the feedback control system would interpret such an apparent error as the requirement to correct motor action. Hence, unnecessary attempts would be performed to modify the correct articulation, resulting in repetitive/prolonged sound or silent pauses/blocks.” (page 7 in the PDF version). A similar explanation had already been proposed by Max et al. (2004) on the basis of the DIVA model. However, a theory of inaccurate predictions amounts to the claim that stutterers do not know exactly what their words should sound like.
Apparently, the predictions assumed in this approach are a kind of efference copy. Therefore, the crucial (theoretical) question is: Do such copies of motor plans play any role in the self-monitoring of speech or not? I think they do not; the issue is discussed extensively in Section 1.5.
Additionally, we should consider that the theories last mentioned are based on computer modeling, but there is a difference between a computer and a brain. A “noisy” computer process may well produce incorrect predictions, although I think that a computer does not miscount by itself, so that a specific program is needed, e.g., one that simulates an unstable mapping between the internal model of the vocal tract and the sensory system and thus enables the computer to produce ‘wrong’ results at random. Suppose, for example, that the correct prediction of a phoneme sequence is /cat/, but the computer model predicts /caf/. In this case, the monitor in the computer model detects a mismatch when the correct sequence /cat/ is fed back, and emits an invalid error signal. But what happens if a human has no clear notion of what the word ‘cat’ should sound like, so that his or her brain generates only a vague or cloudy expectation of its sound sequence? The person may fail to detect an error if the word is mispronounced. But a person who pronounces the word correctly will hardly produce a specific incorrect prediction like /caf/ and take the correct feedback /cat/ for an error.
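The contrast between a specific wrong prediction and a merely vague expectation can be made concrete with a small sketch. It rests on an assumption of my own that does not come from any of the models cited above: vagueness is represented as a set of tolerated sounds per position, whereas a specific prediction tolerates exactly one sound. All names (`signals_error`, `Expectation`) are hypothetical.

```python
from typing import List, Set

# Hypothetical representation: one set of acceptable sounds per position.
Expectation = List[Set[str]]

def signals_error(expectation: Expectation, feedback: List[str]) -> bool:
    """Return True if any fed-back sound falls outside what was expected."""
    if len(expectation) != len(feedback):
        return True
    return any(sound not in allowed
               for allowed, sound in zip(expectation, feedback))

precise: Expectation = [{"c"}, {"a"}, {"t"}]            # clear notion of 'cat'
specific_wrong: Expectation = [{"c"}, {"a"}, {"f"}]     # model predicts /caf/
vague: Expectation = [{"c"}, {"a"}, {"t", "f", "p"}]    # cloudy final sound

said_cat = ["c", "a", "t"]   # correct pronunciation
said_caf = ["c", "a", "f"]   # mispronunciation

print(signals_error(specific_wrong, said_cat))  # True: invalid error signal
print(signals_error(vague, said_cat))           # False: no invalid signal
print(signals_error(vague, said_caf))           # False: real error is missed
print(signals_error(precise, said_caf))         # True: real error is detected
```

Under this assumption, only the specific wrong prediction yields an invalid error signal for correct speech; the vague expectation instead lets a real mispronunciation pass undetected.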
In brief: Incorrect predictions should impair the self-monitoring of speech in general and hamper the detection of real speech errors, instead of eliciting invalid error signals. Additionally, a theory of incorrect predictions cannot answer the question: Why does stuttering not occur when a young child begins to speak, in the babbling period or in the period of one-word utterances, i.e., at a time at which the mapping between the internal model of the vocal tract and the sensory system may, in fact, still be unstable? As is well known, stuttering typically begins later, namely at the time when children start forming sentences.
A further objection: Every theory that explains stuttering as a brain response to auditory feedback must additionally explain how stuttering is caused at the onset of an utterance, a pattern typical of early childhood stuttering. At the onset of an utterance, auditory feedback cannot yet have influenced speech control. The theories proposed by Hickok, Houde, and Rong (2011), Max et al. (2004), and Tian and Poeppel (2012) do not provide an answer to this question.
Finally, it might be difficult to prove the thesis of an unstable or “noisy” mapping between the internal model of the vocal tract and the sensory system. In order to provide evidence for incorrect auditory predictions in stutterers, Daliri and Max (2015a, 2015b) recorded stuttering and non-stuttering adults’ auditory evoked potentials in response to probe tones that were presented while the participants were anticipating either speaking aloud or hearing their own speech played back and, as control conditions, while they were reading silently or looking at nonlinguistic symbols. The N1 amplitude of the controls, but not of the stutterers, was reduced prior to both speaking and hearing, compared to the control conditions.
The authors conclude from their findings that stutterers may have general auditory prediction difficulties. However, we do not know whether a prediction difficulty actually exists or whether, instead, the system merely does not react with suppression to the anticipation of an auditory input. Interestingly, the amplitudes of the N1 and P2 components of the ERP were generally smaller in the stutterers, as a group, than in the controls, suggesting a difference in central auditory processing.
Daliri and Max’s results are in line with those of Kikuchi et al. (2011), who found reduced suppression of the P50m, another early ERP component, in adults who stutter in response to repeated, i.e., expectable, click sounds. These authors, however, did not assume a prediction difficulty but reduced auditory gating in stutterers. More subtle abnormalities in central auditory processing have been found in both children and adults who stutter (see Section 3.3).
A further feedback theory of stuttering was proposed by Chang and Zhu (2013). Referring to findings of structural deficits in white matter tracts interconnecting frontal motor and posterior auditory areas in the left brain hemisphere of stutterers (see Section 4.1), they wrote: “Insufficient white matter integrity between these regions may lead to subtle inefficiencies in one’s ability to match the auditory target associated with one’s own motor execution (articulation) to actual auditory feedback. If a mismatch occurs between the intended (predicted) auditory target of the speech produced and the actual auditory feedback, the auditory cortex sends corrective signals to the motor system to modify the motor programme for subsequent articulations (…).” (page 14 in the PDF version). They do not say so explicitly, but Chang and Zhu seem to mean (as I understand them) that such a mismatch, if it occurs because of insufficient integrity of the fibers (and not because of a real articulatory error), can result in an invalid correction signal to the motor system and in a wrong modification of the motor program for subsequent articulation, leading to disfluencies.
These assumptions are close to my theory. What is the difference? First, Chang and Zhu assume stuttering to result from a disruption of the feedback-based online (‘within the flow’) control of speech; accordingly, they refer to Cai et al. (2012), who examined the online control of speech based on auditory feedback (auditory-motor integration). By contrast, I assume that stuttering is a disorder of an ‘offline correction’ mechanism: Stuttering results from an invalid error signal that, if it were valid, would lead to an interruption of speech flow in order to enable an error repair, that is, it would lead to a correction ‘without the flow’. Second, I do not assume that the white matter tracts interconnecting frontal motor and posterior auditory areas are unable to work well (read more in Section 4.1), and Chang, Zhu, Choo, and Angstadt (2015) also write: “It is well known that regardless of stuttering severity, most people who stutter can, from time to time, speak completely fluently, suggesting that fundamental auditory-motor integration for speech production is present and functional.” (page 14 in the PDF). If stuttering were the result of insufficient white matter integrity, it should be a less variable disorder, less influenced by speech situations and linguistic factors, and symptoms should be distributed more randomly over speech.
Two hypotheses must be mentioned that are similar to my theory insofar as they consider stuttering a disorder of the sequencing of motor programs: Civier, Bullock, Max, and Guenther (2013) tested two hypotheses by means of the GODIVA computer model: They investigated whether atypical white-matter integrity or elevated dopamine levels may lead to speech dysfluencies due to their effects on a syllable-sequencing circuit that consists of basal ganglia, thalamus, and left ventral premotor cortex. In other words, they tested two disfluency mechanisms: “a failure to cancel the activation of the previous syllable, and a failure to bias cortical competition in favor of the next syllable”. Simulation results supported both hypotheses: both scenarios resulted in a delayed start of the next syllable’s motor program.
The above theories show that several researchers thought or think in the same direction as I do, namely that stuttering is not a simple breakdown of speech control, but a response of the control system to error signals that are in some way related to auditory feedback. Future research and theoretical modeling must reveal which of the theories is the best one, that is, which one agrees with the empirical findings and has the greatest explanatory power with regard to the many well-known features of the disorder.
A different but very influential theory has been proposed by Alm (2004, 2006), namely that “the basal ganglia-thalamocortical motor circuits through the putamen are likely to play a key role in stuttering. The core dysfunction in stuttering is suggested to be impaired ability of the basal ganglia to produce timing cues for the initiation of the next motor segment in speech.” (Alm, 2004, p. 325).
He has further explained stuttering in the framework of the dual premotor model (Goldberg, 1985): As a voluntary, internally cued behavior, speaking should be controlled mainly by the ‘medial premotor system’, i.e., by the SMA-basal ganglia circuit. Because of the impaired basal ganglia function, the ‘lateral premotor system’, i.e., the cerebellar circuit, tries to compensate for the deficit and becomes dominant in speech control. The lateral premotor system, however, controls motor behavior depending on sensory input (i.e., not on the basis of internal cues, as the basal ganglia do), and consequently, speech becomes dependent on (external cues from) auditory feedback (Alm, 2006).
By contrast, I do not assume that the medial premotor system is impaired in stuttering, but that the lateral premotor system is impaired because it does not get the sensory input it needs. The result is interruptions of speech flow, which the speaker automatically tries to overcome by force of will, and here the medial premotor circuit with the basal ganglia comes into play (see below). A dysfunction in the lateral premotor system was already inferred by Watson and Freeman (1997) from the results of brain imaging and behavioral studies.
The basal ganglia may play a double role in developmental stuttering. First, they contribute to the predisposition for stuttering: There might be a relationship between a high dopamine level in the basal ganglia and heightened activity of the voluntary motor system. Children in general have a peak in the number of type D2 dopamine receptors in the striatum at the time when stuttering typically begins (Alm, 2004, Fig. 2); this is an age at which they also show a strong bias towards motor activity. Further, voluntary motor behavior is always associated with selective (goal-directed top-down) attention, thus a dominance of the medial premotor system might be associated with an imbalance in attention allocation. A role of the basal ganglia in the predisposition for stuttering is suggested by the results of Metzger et al. (2018), who found a correlation between substantia nigra activation and stuttering severity in a non-speech motor task, and by altered functional connectivity between basal ganglia and cortical regions (Lu et al., 2009, 2010; Chang and Zhu, 2013).
Second, basal ganglia activity may determine the severity of stuttering and (in part) the kind of overt symptoms: After a speech motor program has been blocked because of an invalid error signal from the cerebellum, the speaker automatically tries to overcome the blockage and to continue talking. This behavior is driven by the SMA-basal ganglia-motor circuit, as it depends on the speaker’s will. In stuttering modification therapy, the patient learns to suppress his ‘speech drive’ when feeling an internal blockage and, in this way, to reduce or avoid overt stuttering symptoms. Dopamine blockers like haloperidol seem to operate in a similar way: “The drug seems to exert its main effect on the severity of stuttering behavior and not so much on the frequency of stuttering” (Section 6.1.1 in Alm, 2004). Thus, the basal ganglia may be responsible for overt stuttering behaviors and their severity without being the underlying trigger.
The crucial point is: humans can actually have imprecise knowledge, imprecise thinking, vague ideas and expectations, whereas computers can only simulate this. The results of such a simulation, however, are not vague, but are always specific, even if they are wrong. Such wrong but specific predictions are the basis of incorrect error signals in the theories mentioned above. By contrast, humans will hardly produce specific wrong predictions of how a word of their native tongue should sound.
The error in Alm’s very fruitful theoretical approach, in my view, is that he regards speaking as a completely voluntary, internally cued behavior controlled by the medial premotor system. He underestimates the role of the lateral premotor system and of sensory feedback in the control of fluent speech. And it is precisely this that is impaired in stuttering.
A dominance of the medial premotor system in stuttered speech is suggested by overactivation of SMA and striatum (Braun et al., 1997; Fox et al., 1996; Ingham et al., 2003) and substantia nigra (Watkins et al., 2008; Wu et al., 1997; but see also Giraud et al., 2008) and by reduced, i.e., normalized striatal activity after treatment of stuttering (Giraud et al., 2008; Ingham et al., 2013).
By contrast, left BA 44, which can be regarded as part of the lateral premotor system in speech control, was found to be less activated during speaking or humming in stutterers (Neef et al., 2016); its functional connectivity with left BA 6 (lateral premotor area) and left pSTG (Wernicke’s area) was reduced in stutterers when speaking or producing non-speech oral motor sounds (Chang et al., 2011), and intrinsic resting-state functional connectivity is reduced in left BA 44 (Lu et al., 2012). Further, the posterior temporal region (Wernicke’s area), which provides the lateral premotor system with sensory input, was often found to be under-activated during stuttered speech (see Table 1).
All these findings suggest that not the medial, but the lateral premotor system is dysfunctional in developmental stuttering. The only part of the lateral premotor system that shows overactivation during stuttered speech is the cerebellum, which may result from error signals due to missing sensory input (as was extensively discussed in the last section).
Stuttering has often been compared with Parkinson’s disease. I think that is wrong: Parkinson’s results from weak basal ganglia function, mostly in old people, whereas developmental stuttering is rather related to strong, or slightly too strong, basal ganglia function in young children and hyperactive people. Haloperidol reduces basal ganglia activity as well as stuttering, but can evoke not only languidness but also Parkinson’s-like symptoms (tremor and bradykinesia) as a side effect (Kurz et al., 1995). On the other hand, Juste and Andrade (2017), who investigated disfluent speech in patients with Parkinson’s, conclude that the change in their speech cannot be considered a stuttering disorder, because the percentage of stuttering-like disfluencies did not reach 3% of stuttered syllables on average in their study.
Neurogenic stuttering after a lesion in the basal ganglia (BG) seems to be a different case. Here, disfluency may result from failing internal cues for syllable starts. In developmental stuttering, by contrast, overt symptoms result precisely from these cues, by which the BG try to start a motor program that has been blocked by the cerebellum. That is why struggle behavior occurs in developmental stuttering but is typically absent in neurogenic stuttering (Lundgren, Helm-Estabrooks, & Klein, 2010). Interestingly, acquired stuttering typically also lacks all the features that I ascribe to the fact that the speaker’s attention is drawn away from the auditory channel at particular sentence positions, at particular words, or in particular communication situations (see Sections 2.2 and 2.5): stuttering behavior is constant across speech and speech tasks (Lundgren, Helm-Estabrooks, & Klein, 2010). An audio example of acquired neurogenic stuttering is available here.
Watson and Freeman (1997) discussed studies of regional cerebral blood flow (rCBF) and studies of speech motor performance in which acoustic laryngeal reaction time (LRT) was measured as a function of the complexity of the required response, e.g., a word or sentence. They wrote:
“The rationale for the LRT studies arose, in part, from Goldberg’s (1985) discussion of medial and lateral premotor systems. The medial system is hypothetically related to spontaneous, propositional speech and has connections to cingulate cortex. The lateral system is hypothetically related to nonpropositional, repetitive speech or speech guided by auditory self monitoring. This system functions in a responsive mode to external stimuli and has connections to auditory association areas in temporal cortex classically related to processing (Kent, 1984; Penfield & Roberts, 1976). The stimulus-dependent nature of the LRT task and manipulation of the linguistic complexity of the response should preferentially involve the lateral premotor system in Goldberg’s model. […] Relations between resting rCBF anomalies and speech motor performance deficits are consistent with predictions of Goldberg’s (1985) model regarding defects in the lateral premotor system” (p. 343f).