Mouth-in-the-door: The effect of a sound image of an avatar intruding on personal space that deviates in position from the visual image

・We examine the audiovisual experience in a virtual reality (VR) service context that enables more effective interaction between a user immersed in a virtual environment (VE) and an avatar acting as store staff.
・The phenomenon shown in this study is similar to the “foot-in-the-door” phenomenon, in which a small unconscious consent (i.e., allowing a sound image to intrude on one’s personal space) leads to an improved evaluation of the other person (i.e., greater rapport with that person).
・The techniques proposed in this paper, such as the positional difference between the sound and visual images, significantly improve the value of the service experiences obtained through interaction with others in VE.


We conducted an experiment to investigate how much positional deviation between the sound and visual images can be tolerated in a VE, the effect of that deviation on the interpersonal distance to the avatar, and the possibility of manipulating the impression of the avatar by deviating the sound image from the visual image. For the experiment, we prepared a VE resembling a store and conducted proximity experiments with 16 gender-balanced participants and six types of avatars. By utilizing the superiority of visual information over auditory information revealed in the experiments, we constructed an interpersonal situation with an avatar playing the role of store staff in which only the sound image intruded into the user’s personal space, and we investigated users’ impressions of the avatar. We found the following two phenomena in the experimental conditions where the positional difference was allowed: 1) Even when the positional difference was allowed, it caused an “uncanny valley”-like phenomenon that led to a decrease in rapport; and 2) In the conditions where the positional difference was allowed and the sound image was closer to the participant than the visual image, rapport with the avatar playing the role of the store staff was greater.


1) When the face-to-face avatar existed both sonically and visually, the information obtained from the image was the dominant factor that determined the interpersonal distance.
2) Although there is individual variation, even if the position where the voice originates deviates slightly from the VR avatar, the deviation is ignored due to the ventriloquism effect.
3) By bringing the position where the voice originates closer to the user, within the range where the ventriloquism effect works, the voice can intrude into personal space, and impressions of the VR avatar also improve. (This phenomenon is termed “mouth-in-the-door”.)
4) Contradicting the favorable impressions seen above, the deviation between the position where the voice originates and the visible position of the VR avatar also unconsciously causes lowered impressions, and conditions exist (position of approximately 75% of the distance to the avatar in the frontal direction) where the adverse effects outweigh the beneficial effects.
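
As an illustration of what placing the sound image at a fraction of the distance to the avatar could mean geometrically, the following is a minimal sketch, not the procedure used in the study. It assumes the fraction is measured from the listener along the straight line to the avatar; the function name `sound_image_position` and its parameters are hypothetical.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def sound_image_position(listener: Vec3, avatar: Vec3, fraction: float) -> Vec3:
    """Return a point on the straight line from the listener to the avatar.

    Assumption (not stated in the source): `fraction` is measured from the
    listener, so 1.0 places the sound image on the avatar itself and
    smaller values move it toward the listener.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    # Linear interpolation, applied to each coordinate separately.
    return tuple(l + fraction * (a - l) for l, a in zip(listener, avatar))


# Listener at the origin, avatar 2 m straight ahead: a fraction of 0.75
# places the sound image 1.5 m in front of the listener.
print(sound_image_position((0.0, 0.0, 0.0), (0.0, 0.0, 2.0), 0.75))
```

Under this assumption, a fraction of 1.0 reproduces the ordinary case in which the sound and visual images coincide.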

Market Application

By incorporating the presentation of personalized sound image locations proposed in this study into existing VR stores and on VR store platforms that are expected to increase in number in the future, the comfort level of the service experience will be improved for customers. As a result, these stores can expect to increase their brand value and sales.

