Recently, the proceedings of the 2018 congress in Hiroshima were published on the IFA website, and I was delighted to read the report by Kristin M. Pelczarski and Linda Hoag: “Resonant voice as a potential fluency technique: a mixed-methods analysis”. The researchers explored the effectiveness of resonant voice therapy as a technique to reduce stuttering. Over six weeks, three adult stutterers practiced speaking in “an easy, but strong, clear voice that can be heard over a distance as well as in background noise”. After the training, the participants stuttered less frequently and reported a perceived reduction in frequency, tension, and duration of stuttered speech.
Resonant voice technique – speaking in a voice “easy and vibration-filled” – has normally been used to prevent or reverse a vocal fold injury. The idea behind the application of resonant voice technique in the therapy of stuttering was the assumption that reducing laryngeal tension may reduce or eliminate stuttering. However, this, by implication, would mean that laryngeal tension caused stuttering or at least contributed to the causation of stuttering – but is this the case?
A stutter block is often initiated by glottal closure, especially when it occurs at the onset of a word or syllable beginning with a vowel. I know this very well from my own experience: it was my typical pattern of stuttering in childhood. To prevent these blocks, I was taught to speak an /h/ (as unobtrusively as possible) before the vowel. So I said “(h)auto” instead of “auto” (car), and it helped me for some years, until blocks occurred at /h/-onsets. These blocks were initiated by a sudden inhibition of exhalation. I clearly felt that my glottis was open, my vocal folds were relaxed, I was ready to produce the /h/ – but my abdominal muscles seemed to struggle against my diaphragm, which contracted. I then learned to avoid these blocks at /h/-onsets by saying “(a)haus” instead of “haus” (house). I did so as unobtrusively as possible by feigning a little cough, so that I produced a glottal stop at syllable onset.
The example suggests that glottal closure is only one of several ways the brain uses to interrupt speech flow against the speaker’s will. Further ways are contraction of the diaphragm during exhalation, or, also in case of prolongations, pressing the tongue against teeth or palate, or pressing the lips against teeth or against each other, according to the consonant concerned. Note that the glottis is always open during sound prolongation; phonation is not interrupted.
Stuttering is caused in the brain. Laryngeal tension, if it plays a role in stuttering at all, might be a reaction to a block on a vowel, an attempt to overcome the block by pressing out the vowel. Therefore, I think that a resonant voice reduces stuttering not by reducing laryngeal tension, but in another way: it makes stutterers listen to the new, unaccustomed sound of their voice. That is, speaking in a resonant voice is a fluency-enhancing condition that operates in a similar way to other such conditions, e.g., speaking in an altered voice or foreign dialect, or speaking under DAF or FAF (see Section 3.1 in the main text).
I think that all fluency-enhancing conditions cause a re-allocation of the speaker’s attention, drawing attention to the auditory feedback of speech. This improves the processing of auditory feedback and its integration in speech control, which reduces stuttering. The idea (and experience) that both speaking in a resonant voice and listening to one’s own voice during speech can reduce stuttering is not new. I will take this opportunity to mention three men who used these methods in the past, and who may be unknown or largely forgotten in the English-speaking world: Rudolf Denhardt, Oskar Hausdörfer, and Ronald Muirden.
Rudolf Denhardt (1845-1908) studied psychology and neurology. He as well as his father and his two brothers stuttered. Denhardt assumed that stuttering was not physiologically, but psychologically caused. He established a sanatorium (Heilanstalt) for stuttering that became world-famous because of its successful treatment. In 1878, Denhardt reported that, among others, 870 stuttering foreigners from 17 countries visited his sanatorium, among them 335 from Russia, 193 from Sweden, 80 from Switzerland, 57 from Denmark, even 6 from India, and 5 from Australia.
The sanatorium existed in Eisenach (the town in Thuringia where J. S. Bach was born) until 1954; after Rudolf Denhardt’s death, it was carried on by his son and lastly by his son-in-law. Denhardt started out with the question of why stuttering does not occur during singing. His treatment methods were undisclosed, but clients later reported that pronouncing vowels loudly and sonorously was a central element in his therapy (cf. a report by Willi Becker in “Der Kieselstein”, the journal of the German stuttering self-help organization, volume 2, issue 10, 1980. Willi Becker had been a client in Eisenach in 1943).
Oskar Hausdörfer (1864-1951) was a German pharmacist. He reported that he severely stuttered in childhood and as a young man. After several unsuccessful therapies, so he reports in his book, he found he was much more fluent when speaking in a “sounding” voice, and when he actively listened to his voice while speaking. After this successful self-help, Hausdörfer started working as a speech therapist in 1895. His “Sprechlehranstalt” (speech academy) in Breslau existed until 1942.
Hausdörfer wrote a book about his experiences as a stutterer and about his treatment method, titled “Durch Nacht zum Licht. Des Stotterers bester Freund” (Through night to light. Stutterer’s best friend), first published in 1921, last edition 1933. It is not a scientific book and is written in a fairly overblown and mawkish style. It contains a fictional story about a young man suffering severely from stuttering who visits a famous “old scholar” in the hope of help. Here is my attempt to translate a short passage of this story:
[the old scholar:] “Now let us again assess what you have to learn. When you want to speak, that is, when you have thoughts you want to make audible, then you should focus on the sound only, only listen and stay in your throat*; however, the latter happens automatically when we are listening = sounding; even the sound comes automatically when you only listen to your speech.” [the young man:] “That is, I should direct all my attention to listening only?” [the scholar:] “Nothing else, not at all.”
*) Hausdörfer probably means: do not think of letters! He believed that stuttering arose because one tries to speak letters instead of sound, that is, something visible instead of something audible. See the original German text.
Hausdörfer emphasizes two things: “sounding” (speaking in a sonorous voice, focus on vowels) and listening to one’s own voice, and at least in the above text passage, he says that listening is the crucial thing. This position is remarkable as it was (and probably is) contrary to the mainstream. As far as I know, all experts were convinced that attention to auditory feedback is in any case detrimental for stutterers, if not even the cause of the disorder.
Ronald Muirden (1898-1981) wrote thrillers and westerns for a living before he started giving lectures and evening classes about stuttering and its treatment in London around 1960. Like Hausdörfer, Muirden overcame his severe stuttering by means of his voice. He concluded that the ultimate cause of stuttering was an “imperfectly produced voice”. He writes: “The natural voice should be freely forthcoming. It should also be virtually effortless, depending upon resonance for its largeness” (Muirden, 1996, p. 22). “When there is the intention to speak in our natural resonant voice, all the parts concerned adjust themselves automatically […] to produce the intended result...” (ibidem, p. 23).
Muirden discusses some conditions in which stuttering usually disappears, among them declaiming. “It is generally agreed”, so he writes, “that any stammerer can declaim without stammering” (ibidem, p. 32). I don’t know whether that is in fact generally agreed, but it matches my own experience: As a child, I declaimed well-memorized poems in front of an audience without any difficulty; I have always been sure not to stutter in this situation. Muirden explains the fluency in declaiming by the quality of voice production: In declaiming in front of an audience, the speaker’s voice “must be not merely loud, but […] it must also have a ‘carrying’ quality – the quality of vibrating resonance” (ibidem, p. 33).
Muirden emphasizes the importance of resonance: “Voice is not produced by the vocal cords […]. The vocal cords produce only sound waves. The responsibility for producing lies in the second of the two stages of the mechanism, the amplification, where, by a process of resonation, the sound waves are amplified and given any desired vocal quality. The stammerer should fully appreciate this arrangement so that he might see that in the amplification he has the opportunity of taking control of his voice and managing it to the greatest advantage.” (ibidem, p. 35)
However, Muirden’s theory that an imperfectly produced voice is the ultimate cause of stuttering is not convincing as there are many people speaking fluently in a poorly produced voice. I think it is not the production of a resonant voice which reduces stuttering, but the fact that the deliberate production of a resonant voice is monitored via auditory feedback: The speaker listens to the new, unaccustomed sound of his/her voice. This improves the processing of auditory feedback and its integration in the control of speech (see Section 3.3 in the main text).
The passage from “Durch Nacht zum Licht” in the original German:
„Nun wollen wir nochmals feststellen, was Sie zu lernen haben. Wenn Sie sprechen wollen, also Gedanken haben, die Sie hörbar machen wollen, dann sich nur um den Ton bekümmern, nur hören, und in der Kehle bleiben; letzteres geschieht aber von selbst, wenn wir hören = tönen; aber auch das Tönen entsteht von selbst, wenn Sie nur aufs Sprechen hören.“ „Demnach hätte ich all meine Aufmerksamkeit nur aufs Hören zu richten?“ „Nichts weiter – aber auch gar nichts.“ (p. 33)
In Section 2.3 (about other stuttering theories) in the main text, I briefly describe the theory proposed by Gregory Hickok, John Houde, and Feng Rong (2011). Recently I read this paper again. The authors explain stuttering in the framework of their State Feedback Control (SFC) model. This model of speech motor control is worth discussing more extensively since it has been influential in stuttering research. But first let’s have a look at their stuttering theory:
Hickok, Houde, and Rong (2011) propose that stuttering is caused by invalid error signals resulting from inaccurate, “noisy” mapping between internal forward models of the vocal tract and the sensory system. “This results in a sensory-to-motor ‘error’ correction signal, which itself is noisy and inaccurate. In this way, the system ends up in an inaccurate, iterative predict-correct loop that results in stuttering.” (p. 13).
In choral speech, so the authors assume, “the sensory system (which is coding the inaccurate prediction) is bombarded with external acoustic input that matches the sensory target and thus washes out and overrides the inaccurate prediction allowing for fluent speech” (ibidem). In other words, stutterers are fluent during choral speech because they hear the words they are speaking from their co-speaker(s) at the same time.
However, this hypothesis can be true at most for ‘classical’ choral speech, e.g., when the stutterer and a co-speaker read the same text passage in unison. But the choral effect was observed even when the other speaker read different material than that read by the stutterer (Barber, 1939; Bloodstein, 1950; Cherry & Sayers, 1956) and when the co-speaker changed without warning to speaking complete gibberish (Cherry & Sayers, 1956). A fluency-enhancing effect was even present when tape-recorded speech played backward was presented (Cherry & Sayers, 1956; Rami & Diederich, 2005) or a continuous vowel sound like /a/ (Dayalu et al., 2011).
Apparently, a fluency-enhancing effect is present when the stutterer’s sensory system is “bombarded” with any external acoustic input, including noise. Hence, an account that covers only the fluency-enhancing effect of choral speech is not very convincing. Instead, we should search for a unifying explanation applicable to preferably all conditions that immediately, but only transiently, reduce stuttering. When we know why stuttering does not occur (or occurs less frequently) in such conditions, then we perhaps also know why it occurs (or occurs more frequently) in normal conditions.
Thus, Hickok, Houde, and Rong’s account of the fluency-inducing effect of choral speech does not appear convincing to me. But what about the framework wherein this account was developed: the State Feedback Control model? Why is it relevant in the context of stuttering? Because it includes an internal feedback mechanism which is thought to play an important role in speech motor control.
In Section 1.3 of the main text and in the blog post from March 5, 2019, I argue against the idea that a pre-articulatory self-monitoring via an internal feedback mechanism takes place during instantaneous speech under normal conditions in which the speaker hears his/her own voice. Levelt (1995) as well as Hickok, Houde, and Rong (2011) and Hickok (2012) assume an internal feedback provided in the auditory sensory modality such that it can be processed by the speech comprehension system in much the same way as external auditory feedback.
The question of whether or not such an internal feedback mechanism is active during overt speech is crucial for every theory that ascribes stuttering to a deficit in auditory feedback: If internal feedback were active in addition to external auditory feedback, then deficits in the latter could be compensated for by the former, and there would be no reason for stuttering to occur. Then, every theory ascribing stuttering to an auditory feedback deficit would be implausible. It is therefore important to reach maximum clarity in this matter.
So let’s look more closely at the State Feedback Control model: The authors suppose that external sensory feedback is always delayed and thus insufficient for speech motor control, particularly for timely detection and online correction of errors. Therefore, they propose an additional internal mechanism that continuously feeds the control system with information about the state of the vocal tract and about the sensory, i.e., also the auditory, consequences of motor commands. This internal auditory feedback reaches the speech comprehension system earlier than external auditory feedback, such that errors (mismatches between feedback and target representation) can be detected and corrected earlier.
The State Feedback Control model implies that the speaker (or his/her brain) knows what his/her speech sounds like before he or she receives information about that via external hearing. However, experiments (e.g., Cai et al., 2012; Loucks, Chon, & Han, 2012; Tourville, Reilly, & Guenther, 2008) have shown that speakers compensate for manipulations in the pitch of their external auditory feedback. Why should they do so if they (or their brains) already knew what their voice sounds like and that their pitch is in truth quite normal?
This question arises all the more when people participate in an experiment in a speech lab and receive external feedback through headphones, i.e., when they are probably aware of a possible manipulation of their external feedback: Why do they still follow the external feedback instead of following their internal predictions that tell them the true sensory consequences of their voicing?
A similar problem is posed by the slowed and disfluent speaking of healthy adults under delayed auditory feedback (Fairbanks & Guttman, 1958; Lee, 1950, 1951; Venkatagiri, 1980): Although undoubtedly aware of the fact that their external auditory feedback is incorrect, they (or their brains) seem to be unable to ignore it and to follow their internal, non-delayed sensory predictions instead. This is all the more astonishing because this internal mechanism was proposed by the authors in order to solve the “engineering problem” with the normal delay of external feedback (Hickok, 2012).
The responses to manipulated external auditory feedback strongly suggest that speakers (and their brains) have no other information about the sensory consequences of their speech movements than that provided by external auditory feedback. Most likely, internal predictions of the auditory consequences of motor commands are not available and play no role in the control of overt speech as long as external auditory feedback is present.
That doesn’t mean that no internal feedback mechanism exists. It does exist. It is the basis of inner speech, and it allows us to internally ‘hear’ and monitor our overt speech when no external auditory feedback is available, e.g., in mouthing (also referred to as lipped or pantomime speech) and when one’s own voice is completely masked by noise – see Section 3.1 about fluency-enhancing conditions.
The important thing for a theory ascribing stuttering to deficits in (external) auditory feedback, however, is: External auditory feedback and internal feedback (in the auditory sensory modality) do not work concurrently. We can hear our speech internally only when we don’t hear it externally.
Finally, it shouldn’t go unmentioned that internal sensory predictions play an important role in at least one area of motor control, namely in the control of eye movements. As Hickok (2012) states, “the state feedback control approach has been highly influential and widely accepted within the visuomotor domain” (p. 3).
When we move our eyes, the image of the world around us moves in the opposite direction across the retina, but we don’t perceive this as a real movement of the environment. The sensory consequences of eye movements are continuously predicted, probably by means of an internal forward model, and our conscious visual perception is corrected by means of these predictions. So we are able to distinguish between the virtual movement of the environment due to our eye movements and actual movements of people, animals, or things around us.
However, such a correction mechanism is not necessary during speech. The crucial difference from eye movements is: When we move our eyes, we move our most important sensory organ. Eye movements serve to change what is perceived, that is, to select which part of the environment is seen. This is not the case with speech movements. Moreover, distinguishing one’s own speech from that produced by others is easy because of the somatosensory feedback of one’s own articulatory movements.