|Year : 2017 | Volume : 31 | Issue : 2 | Page : 66-71
Categorical perception of pitch: Influence of language tone, linguistic meaning, and pitch contour
Saransh Jain1, Ananya Ajay2, Sharmada Kumaraswamy3
1 Department of Audiology, Jagadguru Sri Shivarathreeswara Institute of Speech and Hearing, University of Mysore, Mysore, Karnataka, India
2 Department of Speech and Hearing, Jagadguru Sri Shivarathreeswara Institute of Speech and Hearing, University of Mysore, Mysore, Karnataka, India
3 Department of Speech and Hearing, Faculty of Engineering and Environment, University of Southampton, Southampton SO17 1BJ, United Kingdom
|Date of Web Publication||18-Dec-2017|
Department of Audiology, Jagadguru Sri Shivarathreeswara Institute of Speech and Hearing, University of Mysore, M.G. Road, Mysore - 570 004, Karnataka
Source of Support: None, Conflict of Interest: None
Introduction: Pitch is important for the perception of speech. It is an important acoustic cue for differentiating gender, age, emotion, and culture. In certain languages, pitch also changes the linguistic meaning of a word. Mandarin, Cantonese, and Thai are a few such tonal languages in which the pitch contour changes the meaning of a word. Researchers have reported that language tone and pitch contour influence pitch perception, but the results were inconclusive. The role of linguistic meaning has also been sparsely investigated in the context of pitch perception. Thus, the present study was designed to assess the influence of language tone, linguistic meaning, and pitch contour on the perception of pitch. Methods: Fifty adult Mandarin- and Kannada-speaking individuals were selected, and their pitch perception abilities were measured using a 15-step categorical perception paradigm. The stimuli were Mandarin meaningful and nonmeaningful syllables whose pitch contour varied from rising to falling fundamental frequency in one continuum and from falling to rising fundamental frequency in the other. Results: Univariate ANOVA was used to compare the effects of language background, linguistic meaning, and pitch contour on the perception of pitch. Results indicated no significant effect of linguistic background (P > 0.05) or linguistic meaning (P > 0.05), but the mean values were significantly different across pitch contours (P = 0.001). Conclusion: Language tone and linguistic meaning have no significant influence on pitch perception, but the categorical boundary was wider for Kannada language participants and for nonmeaningful stimuli. Pitch contour significantly affects the perception of pitch. There are differential perceptual processes that depend on the native language.
Keywords: Acoustics, linguistic diversity, psychoacoustics, speech perception, tonal language
|How to cite this article:|
Jain S, Ajay A, Kumaraswamy S. Categorical perception of pitch: Influence of language tone, linguistic meaning, and pitch contour. J Indian Speech Language Hearing Assoc 2017;31:66-71
|How to cite this URL:|
Jain S, Ajay A, Kumaraswamy S. Categorical perception of pitch: Influence of language tone, linguistic meaning, and pitch contour. J Indian Speech Language Hearing Assoc [serial online] 2017 [cited 2022 May 19];31:66-71. Available from: https://www.jisha.org/text.asp?2017/31/2/66/220996
Introduction
Pitch is the psychological correlate of the fundamental frequency and is important for speech perception. The quality of a sound depends on the highness or lowness of its pitch along with other acoustic cues. Pitch is a major cue for the identification of suprasegmental aspects of speech and contributes to gender identification, age identification, emotional arousal, cultural variation, and sociocultural aspects of speech. A change in pitch does not alter the meaning of spoken words or sentences in most Indo-European languages such as English, Hindi, and Sanskrit, but in Sino-Tibetan languages such as Mandarin and Cantonese, and in Thai, a change in the pitch contour changes the meaning of a word. Languages in which the meaning of a word depends on the pitch contour within a syllable are known as tonal languages.
Perception of pitch is a topic of scientific enquiry. Contour tones are perceived categorically; that is, a gradually morphed stimulus is perceived as discrete identities in the auditory system. Thus, pitch perception is categorical when the continuum involves a change in the direction of the pitch contour. In the last decade, there has been a surge of interest in assessing the pitch contour contrasts of tonal languages. Mandarin Chinese is one such tonal language, with contour tones that are perceived categorically. Researchers have reported strong evidence of categorical perception for pitch contrasts in Mandarin, Thai, and Cantonese listeners. Xi et al. found that lexical tones in Mandarin Chinese are perceived categorically.
In cross-linguistic studies comparing the tone perception abilities of native Chinese and American listeners, pitch variation was perceived categorically by tonal language listeners but not by English language listeners. A similar effect was observed between listeners of Taiwanese and English. In contrast, some studies reported no such influence of linguistic experience on the categorical perception of pitch among listeners of Mandarin, Cantonese, and German. Other researchers also investigated the perception of pitch in listeners of tonal and nontonal languages, and the overall results were inconclusive. Most studies used pitch contours on sounds that conveyed linguistic meaning. Pitch perception for continua carrying nonmeaningful sounds was sparsely investigated; hence, it was difficult to rule out the role of linguistic interference in categorical perception, especially for tone language speakers. Researchers have clearly indicated different perceptual processes for meaningful and nonmeaningful speech. Many researchers have investigated pitch perception using nonspeech stimuli, but assessment using speech stimuli has been limited. Thus, there was a need to investigate the categorical perception of pitch in speakers of a tonal language for meaningful and nonmeaningful stimuli. Considering this need, the present study assessed the effect of linguistic background (language tone), linguistic meaning, and pitch contour on the perception of pitch.
Methods
In a standard group comparison, participants from different linguistic backgrounds were compared for their pitch perception abilities. All participants had normal hearing sensitivity (pure-tone average [PTA] <15 dB HL; speech recognition threshold [SRT] within 10 dB of PTA; speech identification score [SIS] >90%) with no present or past complaint of psychological, neurological, or other associated pathology at the time of testing. They were divided into two groups on the basis of their linguistic background. Group 1 comprised native Mandarin (tonal language) listeners from the Beijing region of China. These listeners were available in Mysuru as part of a student exchange program of the university. Group 2 comprised native Kannada (nontonal language) listeners from the Mysuru region of South Karnataka (India). Participants of similar origin were chosen within each group to rule out dialectal variation in perception. [Table 1] shows the detailed participant distribution. Prior approval to test human subjects was obtained from the institutional ethical board, and informed written consent was obtained from all the participants.
Six monosyllables were selected as stimuli for the present study. The stimuli were selected such that they are present in the syllabary of both Mandarin and Kannada. Since Mandarin is a syllabic language, i.e., the minimal meaningful unit of the language is a monosyllable, three of the selected stimuli were meaningful for the listeners of Mandarin. These three stimuli were /ta/, /la/, and /da/. The pitch contour of these stimuli changes the lexical meaning of the syllable in Mandarin. For example, /tā/ (flat contour) means “he/she;” /tă/ (falling-rising contour) means “tower;” /tà/ (falling contour) means “to investigate;” and /Tà/ (stressed “T”) is “a surname.” Complementarily, the syllables /to/, /lo/, and /do/ were selected because they convey no meaning in Mandarin. These syllables were labeled as nonmeaningful syllables. The acoustic difference between the meaningful and nonmeaningful syllables was only with respect to the vowel (/a/ versus /o/). On the other hand, Kannada is a word-level language, i.e., the minimum meaningful unit of the language is a bisyllabic word. Thus, the same syllables, i.e., /ta/, /la/, /da/, /to/, /lo/, and /do/, were used as test stimuli for the Kannada group. All of these syllables were nonmeaningful for the listeners of Kannada.
Recording and manipulating the stimuli
A native female Mandarin speaker was asked to record each syllable in a normal tone at a sampling frequency of 44,100 Hz. The fundamental frequency (F0) for each syllable was maintained at approximately 300 Hz. The duration was kept constant at 200 ms. Each syllable token was then resynthesized, and two sets of continua with varying F0 slope were constructed. In the first continuum, the onset F0 was varied from 150 Hz to 300 Hz while the offset was kept constant; in the second continuum, the offset F0 was varied from 300 Hz to 150 Hz while the onset was kept constant. In both cases, F0 was varied in 15 equal steps of 10 Hz each. The onset and offset F0 values were obtained by acoustically analyzing the speech of the Mandarin participants while they uttered the selected speech stimuli with a falling and a rising contour (frequency normalized to the nearest multiple of 10). The result was two 15-step continua for each syllable: a rising to falling tone contour (rising F0 at one end point and falling F0 at the other end point) in one continuum and a falling to rising tone contour in the other. The slope of the pitch tier was varied systematically between +0.75 and −0.75 (with 0 being a flat contour, i.e., 0° slope, and 1 being the maximum possible slope, i.e., 90° slope). The slope values were also derived by acoustically analyzing the speech samples of the Mandarin participants. The detailed F0 distribution along each step is plotted in [Figure 1]. Other spectral and temporal features were kept constant throughout the continuum. All stimuli were constructed using Praat software (University of Amsterdam), where the pitch tier of the original sound was estimated, points along the pitch contour were varied systematically in a precalculated manner, and the output was normalized to 70 dB SPL. In Praat, the pitch tier of each syllable was extracted using the “to pitch (autocorrelation)” method available in the periodicity option of the analysis toolbox. The pitch tier was then edited manually to vary the slope of the pitch.
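The construction of the two continua can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' Praat procedure: the helper names `f0_grid` and `linear_contour`, the straight-line interpolation between onset and offset, and the constant end value `FIXED_HZ = 225` are hypothetical, since the text does not report the F0 that was held constant. Note also that an inclusive 10 Hz grid from 150 to 300 Hz contains 16 values, whereas the text describes 15 steps; how the grid was mapped onto steps is not specified.

```python
def f0_grid(low_hz=150, high_hz=300, step_hz=10):
    """F0 values from low_hz to high_hz inclusive, spaced step_hz apart."""
    return list(range(low_hz, high_hz + 1, step_hz))


def linear_contour(onset_hz, offset_hz, duration_ms=200, n_points=5):
    """(time_ms, f0_hz) points on a straight line from onset to offset.

    Linear interpolation is an assumption; in the study the pitch tier
    was edited manually in Praat.
    """
    points = []
    for i in range(n_points):
        frac = i / (n_points - 1)
        time_ms = duration_ms * frac
        f0_hz = onset_hz + (offset_hz - onset_hz) * frac
        points.append((time_ms, f0_hz))
    return points


FIXED_HZ = 225  # hypothetical constant end value; not reported in the text

# Continuum 1: onset F0 goes from 150 to 300 Hz, offset held constant,
# so the contour moves from rising to falling.
continuum_1 = [linear_contour(onset, FIXED_HZ) for onset in f0_grid()]

# Continuum 2: offset F0 goes from 300 to 150 Hz, onset held constant.
continuum_2 = [linear_contour(FIXED_HZ, offset) for offset in reversed(f0_grid())]
```

Each element of `continuum_1` or `continuum_2` is one stimulus contour, expressed as time and F0 pairs that could be entered as pitch-tier points.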
|Figure 1: The onset and offset F0 values for each step along rising to falling continuum (continuum 1) and falling to rising continuum (continuum 2)|
Each stimulus along the continuum was presented binaurally to the participants in random order using a personal computer in a quiet environment. Each stimulus was presented five times to ensure reliable responses. The output of the headphones (Sennheiser HD 205II) was monitored at 70 dB SPL (average most comfortable level) using a sound level meter (B and K 2238, Mediator). The participants were instructed to label each sound in terms of one of the two end points of the continuum (two-alternative forced choice [2AFC] paradigm), i.e., as having either a rising or a falling contour. For the meaningful stimuli, the meaning of the syllable changed from one end of the continuum to the other. For example, /la/ with a rising contour means “to slash” and with a falling contour means “solder.” Thus, in the rising to falling continuum, the meaning of the syllable varied from “to slash” to “solder,” and in the 2AFC paradigm, the listener had to indicate whether he/she heard “to slash” or “solder.” The reverse holds true for the falling to rising continuum. This applies to meaningful stimuli only. For nonmeaningful stimuli, changing the pitch tier did not change the meaning of the syllable. A practice trial was given before the commencement of the actual test. In the practice trial, the syllables at both end points were presented, and participants were asked to identify the stimulus. The percentage correct response was measured for each step along both continua. The categorical boundary was measured with the help of logistic regression using Prism software (GraphPad Software Inc., version 5.03). The point where the participants identified the stimulus as belonging to one end point at least 50% of the time and the point where the stimulus was identified as belonging to the other end point at least 50% of the time were calculated. The midpoint between these locations was defined as the categorical boundary, and the difference between these locations was defined as the width of the categorical boundary.
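The idea of locating a categorical boundary from identification data can be sketched as follows. This is a minimal illustration, not the Prism procedure used in the study: it fits a two-parameter logistic curve by Newton-Raphson and reports the step at which the fitted curve crosses 50%. The function names and the response proportions are invented for illustration only (the proportions are symmetric about step 8, so the fitted boundary should fall at about step 8).

```python
import math


def fit_logistic(steps, proportions, n_trials=5, n_iter=50, tol=1e-10):
    """Fit p = 1 / (1 + exp(-(a + b * (x - mean_x)))) by Newton-Raphson.

    Returns (a, b, mean_x). Assumes the responses are not perfectly
    separable, otherwise the slope estimate would diverge.
    """
    mean_x = sum(steps) / len(steps)
    xs = [x - mean_x for x in steps]  # centring helps numerical stability
    a, b = 0.0, 0.0
    for _ in range(n_iter):
        ga = gb = haa = hab = hbb = 0.0
        for x, y in zip(xs, proportions):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = n_trials * p * (1.0 - p)
            ga += n_trials * (y - p)        # gradient w.r.t. intercept
            gb += n_trials * (y - p) * x    # gradient w.r.t. slope
            haa += w
            hab += w * x
            hbb += w * x * x
        det = haa * hbb - hab * hab
        da = (hbb * ga - hab * gb) / det    # solve the 2x2 Newton system
        db = (haa * gb - hab * ga) / det
        a, b = a + da, b + db
        if abs(da) < tol and abs(db) < tol:
            break
    return a, b, mean_x


def boundary_step(a, b, mean_x):
    """Step at which the fitted curve crosses 50%."""
    return mean_x - a / b


# Invented proportions of "end point B" responses at steps 1..15.
steps = list(range(1, 16))
props = [0, 0, 0, 0, 0, 0.2, 0.4, 0.5, 0.6, 0.8, 1, 1, 1, 1, 1]
a, b, mean_x = fit_logistic(steps, props)
boundary = boundary_step(a, b, mean_x)
```

A steeper fitted slope `b` corresponds to a sharper category transition, which is one way to think about the narrower region of ambiguity discussed later in the paper.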
The F0 values at the categorical boundary were averaged for five trials across each participant, for each syllable. The mean F0 was categorized with respect to linguistic background (Mandarin and Kannada), linguistic meaning (meaningful and nonmeaningful), and pitch contour (rising to falling and falling to rising). The significance of differences was measured using univariate analysis of variance, where the responses were considered as dependent variables, and linguistic background, linguistic meaning, and pitch contour were fixed factors. Comparisons within pitch contours were made using Bonferroni's post hoc test.
Results
The mean F0 at the categorical boundary is shown as a function of linguistic background, linguistic meaning, and pitch contour in [Figure 2]. As is evident, the mean F0 at the categorical boundary was slightly different for the Mandarin and Kannada groups. Results indicated no significant effect of linguistic background (F(1, 592) = 2.051; P > 0.05) or linguistic meaning (F(1, 592) = 0.317; P > 0.05), but the mean values were significantly different across pitch contours (F(1, 592) = 11.889; P = 0.001; partial η² = 0.22). Bonferroni's post hoc test revealed that the categorical boundary was shifted more toward the perception of the stimulus with rising pitch, for both the rising to falling and falling to rising pitch contours. The effect size was small, indicating that only 22% of the total variance in perception may be attributed to pitch contour. On a superficial view, the results indicated that the categorical perception of pitch is independent of the linguistic background and the meaning of the stimulus.
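The reported degrees of freedom and effect size can be related to the design with a short calculation. The sketch below rests on assumptions that are not stated in the text: that each of the 50 participants contributed one mean boundary value per syllable and continuum (600 observations), and that the model was a full factorial fixed-effects ANOVA with all interactions. It uses the standard relation between an F statistic and partial eta squared; the function names are hypothetical.

```python
def error_df(n_observations, levels_per_factor):
    """Error degrees of freedom for a full factorial fixed-effects model."""
    n_cells = 1
    for levels in levels_per_factor:
        n_cells *= levels
    return n_observations - n_cells


def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its df."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Assumption: 50 participants x 6 syllables x 2 continua = 600 observations.
n_obs = 50 * 6 * 2
df_err = error_df(n_obs, [2, 2, 2])  # background x meaning x contour
eta_p2 = partial_eta_squared(11.889, 1, df_err)
```

Under these assumptions the error degrees of freedom come out at 592, matching the reported value. For F = 11.889 the relation yields a partial eta squared of roughly 0.02, so the reported 0.22 may reflect a different computation or a typesetting difference; the sketch is a consistency check only.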
|Figure 2: Mean fundamental frequency scores measured for pitch perception abilities across linguistic background, linguistic meaning, and pitch contours|
Detailed analysis of the results revealed that the width of the categorical boundary was larger for the Kannada than for the Mandarin language group, as evident from [Figure 3]. [Figure 3]a represents the categorical boundary for /ta/ (meaningful for Mandarin listeners and nonmeaningful for Kannada listeners) and /to/ (nonmeaningful for both Mandarin and Kannada listeners); [Figure 3]b represents the same for /la/ and /lo/; and [Figure 3]c represents the same for /da/ and /do/. For Mandarin participants, the categorical boundary ranged approximately from step 7 to step 11 for the meaningful stimuli and from step 6 to step 13 for the nonmeaningful stimuli in the rising to falling contour. In the falling to rising contour, the boundary ranged approximately from step 5 to step 7 for meaningful stimuli and from step 4 to step 9 for nonmeaningful stimuli. On the other hand, the categorical boundary range for the Kannada group was approximately from step 5 to step 15 in the rising to falling contour and approximately from step 3 to step 13 in the falling to rising contour, irrespective of the linguistic meaning of the stimuli. The detailed description of the categorical boundary and the boundary width is shown in [Figure 3]. The standard deviation was also significantly larger for Kannada than for Mandarin participants (P < 0.001). Among the Mandarin participants, the standard deviation of F0 was larger for nonmeaningful stimuli than for meaningful stimuli, but no such difference was observed among Kannada participants.
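The boundary and width definitions from the Methods can be applied to the approximate step ranges quoted above for the rising to falling contour. The sketch below is illustrative only: it assumes that the two reported steps of each range correspond to the two 50% identification points, and the function name `boundary_summary` is hypothetical.

```python
def boundary_summary(first_step, last_step):
    """Midpoint (categorical boundary) and width of the ambiguous region."""
    midpoint = (first_step + last_step) / 2
    width = last_step - first_step
    return midpoint, width


# Approximate step ranges reported for the rising to falling contour.
reported = {
    "Mandarin, meaningful": (7, 11),
    "Mandarin, nonmeaningful": (6, 13),
    "Kannada, all stimuli": (5, 15),
}

summary = {label: boundary_summary(*rng) for label, rng in reported.items()}
for label, (midpoint, width) in summary.items():
    print(f"{label}: boundary at step {midpoint}, width {width} steps")
```

On these figures the width grows from 4 steps (Mandarin, meaningful) to 7 steps (Mandarin, nonmeaningful) to 10 steps (Kannada), which is the ordering the paragraph describes.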
|Figure 3: Percentage correct responses for the identification of each point along the continuum as a function of linguistic meaning and pitch contour for the participants of both the Mandarin and Kannada groups for (a) /ta/-/to/ syllables; (b) /la/-/lo/ syllables; (c) /da/-/do/ syllables. The two line drawings (I and II) for each group and syllable are complementary to each other, where I represents 0%–100% correct responses across steps and reciprocally II represents 100%–0% correct responses across steps|
Discussion
The present study highlighted the role of linguistic background (language tone), linguistic meaning, and pitch contour in the categorical perception of pitch. Pitch continua for six monosyllables were constructed, of which three syllables were meaningful and the remaining three were nonmeaningful. The meaningful syllables were those for which the pitch variation changed the linguistic meaning of the syllable in Mandarin. The nonmeaningful syllables differed acoustically from the meaningful syllables only in their vowel, while the consonantal portion of the syllable was kept the same (for example, /ta/ and /to/ as meaningful and nonmeaningful syllables, respectively). This selection ruled out any inherent difference within the speech sounds and its effect on perception. The pitch contour was systematically varied in 15 steps along each continuum, and two complementary continua were constructed: in one, the stimuli ranged from a rising pitch at one end point to a falling pitch at the other, with the onset F0 varied while the offset was kept constant; in the other, the offset F0 was varied while the onset was kept constant.
The results indicated that neither linguistic background nor meaning significantly affected the perception of pitch. However, the standard deviation at the categorical boundary was larger and the categorical boundary was wider for the Kannada language group. Boundary width is regarded as the region of ambiguity, i.e., the region along the continuum where the listeners were unsure about the exact categorization of the stimulus. The increased width indicated poorer performance of the Kannada participants and was related to the effect of linguistic background. Since pitch variation does not change the meaning of a speech sound in Kannada, exact categorization was difficult for its listeners. The finding was further strengthened by observing the individual responses at the categorical boundary across the five trials. The categorical boundary was relatively stable for Mandarin participants but highly variable for Kannada participants, indicating that the latter had more difficulty labeling the sounds on the basis of pitch contrast. The meaning of the stimuli also influenced pitch perception. The region of ambiguity was significantly wider for nonmeaningful than for meaningful stimuli in Mandarin listeners. The difference was not observed for Kannada participants, as all the syllables were nonmeaningful in Kannada.
Multiple interpretations can be drawn from these results. Pitch is perceived categorically, at least when there is a sharp change in the contour slope. The categorical perception of pitch was accurate for Mandarin listeners, whereas Kannada listeners were poorer at categorizing pitch. Mandarin listeners perceived pitch changes as phonemic, whereas Kannada listeners relied more on psychoacoustic factors. Thus, pitch is perceived categorically by Mandarin listeners but “quasi categorically” by Kannada listeners. These findings are consistent with other observations.
It was noted that despite the lack of lexical tone contrasts in Kannada, the listeners were able to differentiate the tonal variation, although not into well-defined linguistic categories. These differences point toward the listeners' ability to rely on different acoustic cues for pitch perception, depending on their linguistic background. The tonal language listeners used pitch contour as the acoustic cue, whereas the nontonal language listeners gave more importance to pitch height. The finding was strengthened by the observation of a wide categorical boundary for Kannada listeners and a narrower boundary for Mandarin listeners. Such reports are also available for English and Mandarin/Cantonese listeners.
Conclusion
The results of the present study indicated that language tone and linguistic meaning have no significant influence on pitch perception. However, the categorical boundary was wider for Kannada language participants and for nonmeaningful stimuli. On the other hand, pitch contour significantly affects the perception of pitch. Thus, it may be concluded that there are differential perceptual processes that depend on the native language.
The authors extend their sincere gratitude to Dr. N. P. Nataraja, Director, JSS Institute of Speech and Hearing, Mysore, for granting permission to carry out this study. We also thank all the participants for their kind cooperation throughout the testing procedure.
Financial support and sponsorship
Conflicts of interest
There are no conflicts of interest.
References
Seikel JA, King DW, Drumright DG. Anatomy & Physiology for Speech, Language, and Hearing. 4th ed. Clifton Park, NY: Delmar Cengage Learning; 2010.
Cooper WE, Sorensen JM. Fundamental Frequency in Sentence Production. New York: Springer New York; 1981.
Hu Y, Wu D, Nucci A. Pitch-based gender identification with two-stage classification. Secur Commun Netw 2012;5:211-25.
Kumar P, Jakhanwal N, Bhowmick A, Chandra M. Gender Classification Using Pitch and Formants. New York, USA: ACM Press; 2011. p. 319.
Liu H, Russo NM, Larson CR. Age-related differences in vocal responses to pitch feedback perturbations: A preliminary study. J Acoust Soc Am 2010;127:1042-6.
Liu P, Chen Z, Jones JA, Huang D, Liu H. Auditory feedback control of vocal pitch during sustained vocalization: A cross-sectional study of adult aging. PLoS One 2011;6:e22791.
Loui P, Bachorik JP, Li HC, Schlaug G. Effects of voice on emotional arousal. Front Psychol 2013;4:675.
Wong PC, Ciocca V, Chan AH, Ha LY, Tan LH, Peretz I, et al. Effects of culture on musical pitch perception. PLoS One 2012;7:e33424.
van Bezooijen R. Sociocultural aspects of pitch differences between Japanese and Dutch women. Lang Speech 1995;38(Pt 3):253-65.
Wang WS. Language Change. Ann N Y Acad Sci 1976;280:61-72.
Xu Y, Gandour JT, Francis AL. Effects of language experience and stimulus complexity on the categorical perception of pitch direction. J Acoust Soc Am 2006;120:1063-74.
Looi V, Teo ER, Loo J. Pitch and lexical tone perception of bilingual English-mandarin-speaking cochlear implant recipients, hearing aid users, and normally hearing listeners. Cochlear Implants Int 2015;16 Suppl 3:S91-104.
Xi J, Zhang L, Shu H, Zhang Y, Li P. Categorical perception of lexical tones in Chinese revealed by mismatch negativity. Neuroscience 2010;170:223-31.
Yang B. Perception and Production of Mandarin Tones by Native Speakers and l2 Learners. New York: Springer; 2014.
Hallé PA, Chang YC, Best CT. Identification and discrimination of Mandarin Chinese tones by Mandarin Chinese vs. French listeners. J Phon 2004;32:395-421.
Francis AL, Ciocca V, Ng BK. On the (non) categorical perception of lexical tones. Percept Psychophys 2003;65:1029-44.
Chan SW. Cross-linguistic study of categorical perception for lexical tone. J Acoust Soc Am 1975;58:S119.
Sun KC, Huang T. A cross-linguistic study of Taiwanese tone perception by Taiwanese and English listeners. J East Asian Linguist 2012;21:305-27.
Peng G, Zheng HY, Gong T, Yang RX, Kong JP, Wang WS. The influence of language experience on categorical perception of pitch contours. J Phon 2010;38:616-24.
Baker RE, Baese-Berk M, Bonnasse-Gahot L, Kim M, Van Engen KJ, Bradlow AR, et al. Word durations in non-native English. J Phon 2011;39:1-7.
Kaan E, Barkley CM, Bao M, Wayland R. Thai lexical tone perception in native speakers of Thai, English and Mandarin Chinese: An event-related potentials training study. BMC Neurosci 2008;9:53.
Klein D, Zatorre RJ, Milner B, Zhao V. A cross-linguistic PET study of tone perception in mandarin Chinese and English speakers. Neuroimage 2001;13:646-53.
Liss JM, Utianski R, Lansford K. Crosslinguistic application of English-centric rhythm descriptors in motor speech disorders. Folia Phoniatr Logop 2013;65:3-19.
Zhang Y, Nissen SL, Francis AL. Acoustic characteristics of English lexical stress produced by native mandarin speakers. J Acoust Soc Am 2008;123:4498-513.
Kuhl PK, Williams KA, Meltzoff AN. Cross-modal speech perception in adults and infants using nonspeech auditory stimuli. J Exp Psychol Hum Percept Perform 1991;17:829-40.
Yoo S, Chung JY, Jeon HA, Lee KM, Kim YB, Cho ZH, et al. Dual routes for verbal repetition: Articulation-based and acoustic-phonetic codes for pseudoword and word repetition, respectively. Brain Lang 2012;122:1-0.
Wong PC, Warrier CM, Penhune VB, Roy AK, Sadehh A, Parrish TB, et al. Volume of left Heschl's gyrus and linguistic pitch learning. Cereb Cortex 2008;18:828-36.
Warrier C, Wong P, Penhune V, Zatorre R, Parrish T, Abrams D, et al. Relating structure to function: Heschl's gyrus and acoustic processing. J Neurosci 2009;29:61-9.
Working Group on Manual Pure-Tone Threshold Audiometry. Guidelines for Manual Pure-Tone Threshold Audiometry. Rockville, MD: American Speech-Language-Hearing Association; 2005.
Boersma P, Weenink D. Praat: Doing Phonetics by Computer; 2016. Available from: http://www.praat.org/. [Last accessed on 2016 Jan 10].
Pisoni DB, Tash J. Reaction times to comparisons within and across phonetic categories. Percept Psychophys 1974;15:285-90.
Huang T, Johnson K. Language specificity in speech perception: Perception of mandarin tones by native and nonnative listeners. Phonetica 2010;67:243-67.
Chandrasekaran B, Gandour JT, Krishnan A. Neuroplasticity in the processing of pitch dimensions: A multidimensional scaling analysis of the mismatch negativity. Restor Neurol Neurosci 2007;25:195-210.
Chandrasekaran B, Krishnan A, Gandour JT. Relative influence of musical and linguistic experience on early cortical processing of pitch contours. Brain Lang 2009;108:1-9.
Chandrasekaran B, Krishnan A, Gandour JT. Experience-dependent neural plasticity is sensitive to shape of pitch contours. Neuroreport 2007;18:1963-7.
Chandrasekaran B, Krishnan A, Gandour JT. Mismatch negativity to pitch contours is influenced by language experience. Brain Res 2007;1128:148-56.