Today I want to discuss some aspects of a doctoral dissertation I found on the web some time ago: “Inhibition of stuttering from second speech signals: an evaluation of temporal and hierarchical aspects” by Daniel J. Hudock (2012).
The studies reported in the dissertation examined, among other things, the effect of shadowing on the speech of people who stutter. Shadow speech “is historically defined as the person who stutters lagging or shadowing behind a fluent speaker's utterance. Reductions under shadow speech typically range from 80-90%.” (Hudock, 2012, Abstract).
For the first time, this study also investigated the effect of ‘inverse shadowing’: the person who stutters did not take the lag speaker position (as usual in shadowing), but maintained the lead speaker position. This condition is, in a manner, similar to delayed auditory feedback, but produced by a second speaker. Surprisingly, not only normal but also inverse shadow speech reduced stuttering frequency by approximately 80%.
What can we learn from this result? First, it clearly refutes the hypothesis that shadowing induces fluency by providing a correct pattern of speech in terms of rhythm, articulation, prosody, or whatever (this hypothesis was derived from the theory that stuttering is a learned incorrect manner of speaking). Inverse shadowing cannot provide any pattern of speaking, but it still reduces stuttering as well as normal shadowing does.
It is further clear that shadow speech does not provide cues for syllable starts, as has been assumed for choral speech and for speech paced by the beat of a metronome. Incidentally, this assumption is wrong anyway: if you wait each time for an external cue and then respond to it, you will never be in sync with the metronome or the co-speakers, but always too late because of the brain’s reaction time. Instead, you must capture the given pace or speech rate so that you can anticipate it. Then you must adjust your own rhythm or rate and continuously monitor whether it is still in sync. Daniel Hudock points to the fact that, in choral reading, “two speakers may frequently alter speaker positions by speeding up, slowing down or emphasizing different word and sentence components in a constant dynamic fluctuation” (p. 89), which refutes the ‘cue hypothesis’ at least for choral speech.
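The timing argument above can be made concrete with a small numerical sketch (my own illustration, not from the dissertation; the reaction time and beat interval are assumed values): a speaker who merely reacts to each beat always trails it by the full reaction time, whereas a speaker who has captured the pace and anticipates the next beat can land on it exactly.

```python
# Toy sketch: reacting to beats vs. anticipating them.
# Assumed values, for illustration only:
REACTION_TIME = 0.15   # auditory reaction time in seconds
BEAT_INTERVAL = 0.5    # metronome at 120 beats per minute

beats = [i * BEAT_INTERVAL for i in range(8)]

# Strategy 1: wait for each beat, then respond.
# Every onset trails its beat by the full reaction time.
reactive_onsets = [b + REACTION_TIME for b in beats]

# Strategy 2: hear the first two beats, infer the interval,
# then schedule onsets at the predicted beat times.
inferred_interval = beats[1] - beats[0]
anticipatory_onsets = [beats[1] + i * inferred_interval for i in range(1, 7)]

reactive_lags = [round(o - b, 3) for o, b in zip(reactive_onsets, beats)]
anticipatory_lags = [round(o - b, 3)
                     for o, b in zip(anticipatory_onsets, beats[2:])]

print(reactive_lags)      # constant lag equal to the reaction time
print(anticipatory_lags)  # zero lag once the pace has been captured
```

The reactive strategy can never close the gap, no matter how regular the beat; only prediction of the pace allows synchrony, which is the point made above.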
Back to shadow speech: How does it reduce stuttering? Daniel Hudock assumes that the activity of mirror neurons accounts for the similar reduction of stuttering during lead and lag speaker conditions in shadowing: “by perceiving second speech signals people who stutter immediately engage their mirror neuron systems, therefore inhibiting stuttering.” (p. 92). However, are mirror neurons not usually understood as a system that responds to an external sensory input, e.g., with a spontaneous imitation of a perceived behavior?
For example, Kalinowski and Saltuklaroglu (2003a) proposed “that the choral speech effect is a form direct imitation, a primitive and innate human capacity that is possibly mediated at the neuronal level by ‘mirror neurons’. [...] The engagement of these systems allows gestural sequences, including speech, to be fluently replicated. Choral speech and its permutations use the capacity for fluent imitation in people who stutter via a 'loose' gestural matching system in which gestures in the external signal possessing cues found in the intended utterance can serve as stuttering inhibitors.” (p. 339)
Imitation, however, presupposes that the stutterer maintains the lag speaker position, which is the case in normal shadowing, but not (at least not always) in choral speech, and not in inverse shadowing. The activity of mirror neurons may therefore provide an explanation for the effect of normal shadowing, but not for the effect of choral speech and inverse shadowing. Furthermore, the hypothesis does not account for fluency-inducing conditions in which no second speech signal is provided: speaking paced by a metronome, mouthing (pantomimed speech), and auditory masking.
By the way, Kalinowski and Saltuklaroglu (2003b) equate ‘choral speech’ and ‘unison speech’ with ‘imitation speech’, and this (from my point of view) incorrect equation seems to be the basis for their theory that the engagement of mirror neurons induces fluency in choral speech and in its derivatives such as delayed or frequency-altered auditory feedback. The fact that inverse shadowing has approximately the same effect as normal shadowing suggests that even normal shadowing does not reduce stuttering by engaging mirror neurons.
In the main text, Section 3.5, I have explained the effect of normal shadowing in the framework of the AAT: the lag speaker is required to listen not only to the lead speaker, but also to his own speech in order to monitor whether he is following exactly. The lag speaker’s attention is thus drawn to the auditory channel and to auditory feedback, which improves the processing of auditory feedback (the core idea of the AAT is that stuttering results from invalid error signals due to insufficient processing of sensory, mainly auditory, feedback, caused by a misallocation of attention during speech).
But what if the stutterer is not the lag, but the lead speaker in shadowing? Daniel Hudock points out that this condition mimics delayed auditory feedback (DAF). In the framework of the AAT, I explain the DAF effect (and the effect of some other kinds of altered auditory feedback) as follows: altered auditory feedback sounds unfamiliar and odd; therefore it draws the speaker’s attention to the auditory channel, which improves the processing of auditory feedback.
This explanation is, particularly for DAF, supported by the results obtained by Foundas et al. (2013) and by Unger, Glück, and Cholewa (2012), who examined the effect of electronic devices that reduce stuttering via altered auditory feedback. In both studies, speech fluency was significantly improved by the devices even in a control condition without DAF and without frequency-altered feedback (FAF): it may simply have been unfamiliar for the participants to hear their own voice through the device rather than in the natural way, and this drew their attention to the auditory channel. DAF and FAF seem only to increase this effect by making the feedback even more unfamiliar.
Can we therefore interpret the second speech signal in inverse shadowing as similar to an unfamiliar kind of auditory feedback – not only delayed, but also frequency-altered, as the lag speaker’s voice differs from the lead speaker’s voice – which draws the lead speaker’s attention to the auditory channel? I think we can. Interestingly, Daniel Hudock writes that “many participants in the current study self-reported how difficult it was to maintain their own speech productions while not being influenced by the second speaker.” (p. 90). What did they do to avoid being influenced? They perhaps focused on their own voice.
Therefore, I think the AAT is consistent with the finding that inverse shadowing reduces stuttering to approximately the same extent as normal shadowing does. In both normal and inverse shadowing, the stutterer’s attention is drawn to the auditory channel, and he is required to listen to his own voice and speech. This improves the processing of auditory feedback and thereby prevents invalid error signals in the monitoring system and the resulting interruptions of the flow of speech.
Finally, I want to briefly discuss another paper by Daniel Hudock and colleagues from 2011: “Stuttering inhibition via visual feedback at normal and fast speech rates.” The main finding of this study is that visual feedback of speech movements – participants viewing the lower portion of their face on a monitor – produced reductions in stuttering frequency ranging from 27% (without feedback delay) to 62% (400 ms feedback delay). Importantly, there was no significant main effect of speech rate and no significant interaction between speech rate and visual speech feedback; thus the reduction of stuttering cannot be explained as a result of slowed speech due to delayed feedback.
The AAT claims that stuttering results from insufficient processing of auditory feedback and/or of the sensory feedback of breathing – is this consistent with the finding that visual speech feedback reduces stuttering to such a degree?
First, we should consider that visual feedback normally cannot play any role in the control and self-monitoring of speech, simply because we usually get no visual speech feedback. It is therefore unlikely that visual feedback directly influenced the control of speech in the experiment. The effect may rather be indirect: the speaker’s attention is drawn to external perception in general and away from internal processes like speech planning, somatosensory feedback of articulation, or voluntary control of speech. This change in the allocation of attention may also improve the processing of auditory feedback.
The background of this hypothesis is the study by Chang et al. (2018), who found aberrant functional connectivity between and within intrinsic connectivity networks in the brains of children who stutter, among them between the default mode network and the dorsal and ventral attention networks. They even found reduced functional connectivity in the visual network, suggesting that external sensory information in general is not involved in the control of behavior in children who stutter to the same extent as in normally fluent children. The cause may be an imbalance in the attention system (which is also suggested by the results of behavioral studies). Such an imbalance can be temporarily corrected by powerful, e.g., unfamiliar, external stimuli like visual speech feedback.