Abstract

How infants acquire their native language so effortlessly and quickly remains one of the most intriguing questions of human development. Our study extends this question into the audiovisual domain, taking into consideration visual speech cues, which were recently shown to be more important for young infants than previously anticipated [Weikum WM, Vouloumanos A, Navarra J, Soto-Faraco S, Sebastian-Galles N, Werker JF (2007) Science 316:1159]. A particularly interesting phenomenon of audiovisual speech perception is the McGurk effect [McGurk H, MacDonald J (1976) Nature 264:746-748], an illusory speech percept resulting from the integration of incongruent auditory and visual speech cues. For some phonemes, the human brain does not detect the mismatch between conflicting auditory and visual cues but automatically assimilates them into the closest legal phoneme, which sometimes differs from both the auditory and the visual one. Measuring event-related brain potentials in 5-month-old infants, we demonstrate that brain responses differ depending on whether conflicting auditory and visual speech cues can be integrated or cannot be fused into a single percept. This finding reveals a surprisingly early ability to perceive speech cross-modally and highlights the role of visual speech experience during early postnatal development in learning the phonemes and phonotactics of the native language.