Temporal contingency and prosodic modulation of feedback in human-computer interaction: Effects on brain activation and performance in cognitive tasks
Susann Wolff, Christin Kohrs, Henning Scheich, André Brechmann
Companion-Systeme und Mensch-Companion-Interaktion at INFORMATIK 2011 - Informatik schafft Communities
Berlin 2011
Abstract
1 Influence of motivational-prosodic feedback
1.1 Introduction
In an environment in which a human user receives assistance from a technical system, feedback given in response to the user's input plays an important role. In such a learning context, we need to differentiate between positive feedback for correct responses and negative feedback for incorrect responses given by the user. If the feedback is presented verbally, one further needs to differentiate between naturally spoken utterances, which are pre-recorded from a human speaker and then played back by the system, and computer-synthesized speech. Finally, it is important to consider the prosody (i.e. intonation) of the utterance, for example whether positive feedback is spoken neutrally or with a praising prosody. The experiment described below focused on spoken feedback and the differential effects of different prosodies on the users’ performance in a learning task. To this end, we compared computer-synthesized feedback with neutrally as well as motivationally spoken natural feedback.
1.2 Methods
The experiment was conducted with 64 participants (32 female; aged 19-35), whose task was to learn to discriminate between different types of frequency-modulated tones (total: 240 tonal stimuli) via button-press. After each button-press, spoken feedback about the correctness of the response was presented. Since participants were not informed about the relevant stimulus properties prior to the experiment, they had to use the received feedback to learn the discrimination correctly. The feedback stimuli consisted of four positive and four negative feedback utterances as well as one time-out feedback utterance, all spoken in German (e.g. positive: richtig, "right"; negative: falsch, "wrong"). Each participant received this feedback in one of four different types of prosody, depending on experimental group. Group NEUT received neutrally spoken feedback, Group MOTI received motivationally spoken feedback (with a praising prosody for positive feedback and a blaming prosody for negative feedback), and Groups SYNTH-F and SYNTH-M received computer-synthesized feedback.
The neutral and the motivational feedback were spoken by a female German professional speaker. The recordings were evaluated by 24 naïve German subjects, resulting in recognition rates of 78%, 97%, and 94% for neutral, praising, and blaming prosody, respectively. The synthetic feedback was computer-generated with the MARY Text-to-Speech Synthesizer 3.6.0, employing a female and a male neutral voice profile, respectively. Participants' performance in the task was analyzed by means of a mixed-design analysis of variance, crossing the 4-level between-subjects factor PROSODY (NEUT vs. MOTI vs. SYNTH-F vs. SYNTH-M) and the 6-level within-subjects factor TIME (Block 1 to 6, each block containing 40 experimental trials). In case of significant effects of PROSODY, three orthogonal contrasts were calculated: First, to examine a possible advantage of naturally spoken feedback over computer-generated feedback, the two natural prosodies were compared to the two synthesized voice profiles (NEUT/MOTI vs. SYNTH-F/M). Second, to elucidate whether, in naturally spoken feedback, motivational prosody offers an additional benefit in comparison to neutral prosody, these two prosodies were compared with each other (NEUT vs. MOTI). Third, to check for possible differences between male and female voice profiles in computer-generated speech, these two voice profiles were compared as well (SYNTH-F vs. SYNTH-M).
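The three contrasts described above can be written as weight vectors over the four group means. The following is a minimal sketch, not the analysis code used in the study: the weights, the group order, and the helper names are illustrative assumptions, and the orthogonality check presumes equal group sizes, which the text does not state explicitly.

```python
# Assumed group order: NEUT, MOTI, SYNTH-F, SYNTH-M.
# The weights are one conventional coding of the three contrasts (an assumption).
CONTRASTS = {
    "natural_vs_synth": [1, 1, -1, -1],   # NEUT/MOTI vs. SYNTH-F/M
    "neut_vs_moti": [1, -1, 0, 0],        # NEUT vs. MOTI
    "synthf_vs_synthm": [0, 0, 1, -1],    # SYNTH-F vs. SYNTH-M
}


def dot(a, b):
    """Dot product of two equally long weight vectors."""
    return sum(x * y for x, y in zip(a, b))


def is_orthogonal_set(contrasts):
    """True if every contrast sums to zero and all pairs have a zero dot product.

    This criterion applies to equal group sizes (assumed here).
    """
    vectors = list(contrasts.values())
    sums_to_zero = all(sum(v) == 0 for v in vectors)
    pairwise_zero = all(
        dot(vectors[i], vectors[j]) == 0
        for i in range(len(vectors))
        for j in range(i + 1, len(vectors))
    )
    return sums_to_zero and pairwise_zero


def contrast_value(weights, group_means):
    """Weighted sum of group means; zero under the contrast's null hypothesis."""
    return dot(weights, group_means)


print(is_orthogonal_set(CONTRASTS))
```

With four groups there are three degrees of freedom between groups, so a set of three mutually orthogonal contrasts partitions the PROSODY effect into non-overlapping comparisons.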
1.3 Results
The ANOVA revealed a significant main effect of TIME (p < .001), stemming from a substantial increase in performance due to the participants' learning in the course of the experiment. It furthermore showed a significant main effect of PROSODY (p < .03) as well as an interaction PROSODY * TIME (p < .005), based on a significant effect of prosody in Blocks 1 to 3 (p < .04). In the follow-up contrasts, the naturally spoken feedback (NEUT/MOTI) yielded higher performance than the computer-synthesized feedback (SYNTH-M/F) in all three of these blocks. In Block 2, there was an additional performance benefit for participants receiving motivational compared with neutral feedback (p < .03). The two different types of synthesized feedback did not differ significantly in any of the blocks.
1.4 Discussion
The reported results demonstrate clearly that prosody is an important factor influencing user performance in a learning task when verbal feedback is presented auditorily. More specifically, pre-recorded human speech leads to a steeper learning curve than computer-synthesized speech. Furthermore, there seems to be an additional beneficial effect if the recorded feedback is not spoken neutrally but includes a motivational prosody (i.e. sounds praising or blaming). The fact that the performance differences between the experimental groups only appeared in the first three blocks results from a ceiling effect in the second half of the experiment, by which point participants receiving synthesized feedback had also learned the discrimination successfully. Thus, the prosodic manipulation of feedback given by a technical system does not seem to affect the eventual performance level achieved by a user, but natural and motivational prosodic feedback can improve the speed with which the learning goal is accomplished. Future fMRI experiments will serve to examine which brain areas are specifically involved in this beneficial effect.
2 Timing of feedback
2.1 Introduction
Feedback, as the basis for communication, serves to fulfill the need for closure, the subjective sense of completion. When communicating with a machine, it is important to receive immediate feedback about the registration of an action (e.g. a button press). If this is not the case, such a human-computer dialog may fail or at least annoy the user, because temporal expectancies might be disturbed [1]. During a simple repetitive task, users expect a response within 200 ms [2]. However, there is large inter-individual variation in acceptable waiting time, depending on many factors, such as personality, cost, age, mood, cultural context, time of day, or noise [1].
In an fMRI experiment, we investigated the effects of delayed and omitted feedback on brain activation in comparison to immediate feedback.
Based on the results of the fMRI study, we conducted a behavioral experiment to determine the threshold for noticing delayed feedback in both young and older participants.
2.2 Methods
In the fMRI study, 13 subjects (mean age: 27) had to perform an auditory categorization task. Linearly frequency-modulated (FM) tones with a duration of 600 ms served as acoustic stimuli. By pressing a button, participants had to categorize these FM tones according to the direction of modulation and received immediate visual feedback in the form of a green checkmark in 76% of the trials. In 12% of all trials the feedback was delayed by 500 ms, and in another 12% the feedback was omitted. The feedback indicated only that the participant had answered fast enough. If they answered too slowly, they received a red cross. Only after the experiment were participants informed that the temporally unreliable feedback was part of the experiment. Because we expected stronger activation during the omission of feedback than during the moderately delayed feedback in brain areas processing prediction errors, and the reverse effect in the reward system, we computed the direct contrast of delayed and omitted feedback. Furthermore, we presumed that both unexpected conditions show differential activity compared to immediate feedback. Therefore, the balanced contrast of both unexpected conditions and immediate feedback was computed. From all resulting clusters, volumes-of-interest (VOIs) were defined.
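The distribution of feedback conditions described above can be illustrated with a small schedule generator. This is a sketch under stated assumptions: the total number of trials (100 in the usage line), the random shuffling, the fixed seed, and the function names are hypothetical, since the text gives only the proportions and the 500 ms delay.

```python
import random


def make_feedback_schedule(n_trials, seed=0):
    """Return a shuffled list of feedback conditions for n_trials trials.

    12% delayed, 12% omitted, and the remainder (about 76%) immediate.
    Shuffling with a seeded generator is an assumption, not the study's method.
    """
    n_delayed = round(0.12 * n_trials)
    n_omitted = round(0.12 * n_trials)
    n_immediate = n_trials - n_delayed - n_omitted
    schedule = (
        ["immediate"] * n_immediate
        + ["delayed"] * n_delayed
        + ["omitted"] * n_omitted
    )
    random.Random(seed).shuffle(schedule)
    return schedule


def feedback_delay_ms(condition):
    """Delay before the checkmark appears; None means no feedback is shown."""
    return {"immediate": 0, "delayed": 500, "omitted": None}[condition]


schedule = make_feedback_schedule(100)  # 100 trials is a hypothetical total
print({c: schedule.count(c) for c in ("immediate", "delayed", "omitted")})
```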
In a subsequent behavioral study, 10 elderly (mean age: 66) and 14 young subjects (mean age: 28) had to indicate whether they noticed a delayed auditory feedback ("okay") in a block of ten trials, while categorizing FM tones as in the fMRI experiment. Starting with a delay of 500 ms, the delay was adjusted in steps of 25 ms to determine the threshold of a just noticeable delay for each participant.
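One way to picture such a threshold search is a simple up-down staircase. The sketch below is an assumption-laden illustration rather than the procedure actually used: the text specifies only the 500 ms starting delay and the 25 ms step size, so the one-up/one-down rule, the stopping criterion (six reversals), the threshold estimate (mean of the reversal delays), and the simulated observer are all hypothetical.

```python
def run_staircase(notices_delay, start_ms=500, step_ms=25,
                  max_reversals=6, max_blocks=200):
    """Estimate a just noticeable delay with a one-up/one-down staircase.

    notices_delay(delay_ms) -> bool stands in for the participant's report
    after one block of trials. The delay is shortened after a "noticed"
    report and lengthened otherwise. The estimate is the mean delay at the
    reversal points (an assumed convention).
    """
    delay = start_ms
    last_direction = None
    reversals = []
    for _ in range(max_blocks):
        direction = -1 if notices_delay(delay) else 1
        if last_direction is not None and direction != last_direction:
            reversals.append(delay)
            if len(reversals) == max_reversals:
                break
        last_direction = direction
        delay = max(0, delay + direction * step_ms)
    if not reversals:
        return None
    return sum(reversals) / len(reversals)


# Hypothetical noise-free observer who notices any delay of 200 ms or more.
estimate = run_staircase(lambda delay_ms: delay_ms >= 200)
print(estimate)  # 187.5, between the last unnoticed (175) and first noticed (200) delay
```

With a step size of 25 ms, the estimate from such a procedure cannot be resolved more finely than the interval between two adjacent delays, which is worth keeping in mind when comparing individual thresholds.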
2.3 Results
Contrary to our first hypothesis, the direct contrast between delayed and omitted feedback revealed only regions that were more strongly activated during delayed compared with omitted feedback. In accordance with our second hypothesis, the direct contrast revealed significant effects in the reward system during delayed feedback. Since we found no areas more strongly activated during the omission of feedback compared to delayed feedback, we computed the balanced contrast between both unexpected conditions and immediate feedback. We found several regions more strongly activated during delayed and omitted feedback. The largest differences were found in the posterior medial frontal cortex (pMFC), right dorsolateral prefrontal cortex (dlPFC), bilateral anterior insula/inferior frontal gyrus (aI/GFi) and inferior parietal lobe (Lpi).
In the behavioral study, we found a significantly (p < 0.01) higher threshold for the just noticeable delay in the older participants (299 ± 106 ms) compared to younger participants (184 ± 65.5 ms). Overall, the threshold of a noticed delay was positively correlated with the number of mistakes in the auditory categorization task (r = .47, p < 0.025) but also seems to be related to age (r = .36, p < 0.08).
2.4 Discussion
The findings of the current study reveal a network of regions more strongly activated during omitted and delayed feedback compared to immediate feedback. A decrease in activation was observed in the anterior and posterior cingulate cortex. The effects on activation in all these regions seem to be evoked by higher attentional demands and adjustments in action control [3, 4]. Unexpectedly, a short delay in feedback produced essentially the same pattern and degree of activation as the omission of feedback. These findings seem to be in line with the assumption in [1] that humans are able to habituate to a fixed delay, whereas an unexpected delay will always be disruptive. This emphasizes the importance of immediate feedback in human-computer interaction.
The behavioral experiment revealed a significant difference between younger and older participants in the threshold for noticing a delay. This threshold seems to depend on the difficulty of the task but also on the age of the participant. It will be of interest to find out whether different delays have different effects on brain activity when comparing young and elderly subjects.
References
1. Shneiderman, B., Plaisant, C.: Quality of services. Designing the user interface - Strategies for effective human-computer interaction. Pearson Addison Wesley, Boston, San Francisco, New York, London (2005) 453-475
2. Miller, R.B.: Response time in man-computer conversational transactions. AFIPS Spring joint computer conference, Vol. 33, Montvale, NJ (1968) 267-277
3. Dosenbach, N.U., Visscher, K.M., Palmer, E.D., Miezin, F.M., Wenger, K.K., Kang, H.C., Burgund, E.D., Grimes, A.L., Schlaggar, B.L., Petersen, S.E.: A core system for the implementation of task sets. Neuron 50 (2006) 799-812
4. McKiernan, K.A., D'Angelo, B.R., Kaufman, J.N., Binder, J.R.: Interrupting the "stream of consciousness": an fMRI investigation. NeuroImage 29 (2006) 1185-1191