Year: 2015 | Volume: 29 | Issue: 2 | Page: 21-27
Development of phrase recognition test in Kannada language
Hemanth Narayan Shetty, Akshay Mendhakar
Department of Audiology, All India Institute of Speech and Hearing, Mysore, Karnataka, India
Date of Web Publication: 11-Jul-2016
Dr. Hemanth Narayan Shetty
Lecturer in Audiology, All India Institute of Speech and Hearing, Mysore, Karnataka
Source of Support: None, Conflict of Interest: None
Context: Sentences are rich in redundancy, and therefore, their identification is often facilitated by the context. Phrases introduce only limited contextual cues into the process of identification while still facilitating the evocation of words. Thus, there is a need to develop a phrase recognition test to assess identification abilities.
Aims: To develop and validate a phrase recognition test in the Kannada language for assessing speech recognition in noise.
Settings and Design: A normative research design was utilized.
Subjects and Methods: A total of 70 phrases in the Kannada language were constructed, and 67 of them were selected based on familiarity ratings. Two groups of ten participants each were involved, one for list equivalency and one for validation.
Statistical Analysis Used: Repeated measures analysis of variance was utilized for list equivalency and standardization.
Results: Sixty-seven phrases were shortlisted from 70 phrases through familiarity rating. These phrases were embedded in noise at five signal-to-noise ratios (SNRs), from −9 dB SNR to −1 dB SNR in steps of 2 dB. Analysis of the results showed a 50% recognition score at approximately −5 dB SNR. In addition, the phrases that were too easy or too difficult were eliminated. From the remaining phrases, five lists of 10 phrases each were constructed and compared for equal intelligibility in noise. The results revealed no significant differences across the phrase lists.
Conclusions: The five homogeneous lists of the Kannada phrase recognition test will be useful for assessing the identification ability of listeners and hearing aid benefit.
Keywords: Kannada, phrase, recognition, speech in noise
How to cite this article:
Shetty HN, Mendhakar A. Development of phrase recognition test in Kannada language. J Indian Speech Language Hearing Assoc 2015;29:21-7
How to cite this URL:
Shetty HN, Mendhakar A. Development of phrase recognition test in Kannada language. J Indian Speech Language Hearing Assoc [serial online] 2015 [cited 2020 Jul 8];29:21-7. Available from: http://www.jisha.org/text.asp?2015/29/2/21/185976
Introduction
In clinical audiology, hearing ability is assessed using pure tones of different frequencies. However, pure tone audiometry does not give a complete picture of a person's ability to recognize speech. Therefore, additional speech identification tests are carried out using speech materials such as monosyllables, bisyllables, and spondees. Although scores on these materials give an idea of a person's ability to understand speech, they have limitations: they convey minimal semantic and syntactic content and are far from natural day-to-day conversation. When monosyllables are used in rehabilitation, the identification score relates only weakly to real-world hearing aid benefit. Furthermore, stimuli such as spondees seldom occur in conversation, and their intonation, stress, and pause patterns are far from representative of natural communication. Two-syllable words are also not considered well suited to evaluating hearing aid parameters, because isolated words end before algorithms such as longer compression release times and noise reduction strategies take full effect. These limitations underline the need for longer speech materials.
Geetha et al. developed and standardized a Kannada sentence test, in which a 50% identification score was obtained at a signal-to-noise ratio (SNR) of −5 dB. In a similar study, Chandini et al. developed and standardized a Hindi sentence test, in which an SNR of −4 dB was required to obtain a 50% sentence recognition score. In a clinical setting, however, using sentences as speech material overestimates recognition performance because of their redundancy.
Thus, in a clinical setting, assessing a client's speech recognition requires speech material that has less redundancy than sentences, demands less linguistic competence, is close to natural listening situations, and estimates the genuine advantage of a particular hearing aid parameter (compression ratio, release time, directional array, and noise reduction algorithms) in the rehabilitative process. Phrases have these qualities and therefore qualify as an appropriate speech material. They avoid the drawbacks of other speech materials in assessing a listener's ability to recognize speech. A phrase consists of one or more words forming a grammatical constituent of a sentence and conveys incomplete information.
Most often, clients complain of difficulty following speech in the presence of background noise. In addition, speech seldom occurs in isolation in natural situations. Thus, in the present study, the phrases were embedded in noise at different SNRs to reflect the everyday situations in which a listener must understand speech in varying degrees of background noise. Assessing speech perception with phrases in noise will also help in accounting for the benefit of amplification and, in turn, in counseling patients about their expectations of hearing devices, especially when listening in background noise. Further, in recent decades, there has been marked improvement in hearing aid technology such as noise reduction, directionality, compression, and expansion. Hence, a larger number of lists is required to compare hearing aid parameters without compromising the familiarity of the materials. Thus, the aim of the present study was to develop and validate a test of phrase recognition in noise in the Kannada language. The objectives of the study were: (i) to develop a test of phrase recognition in noise and (ii) to validate the developed Kannada phrase test in noise on a normal hearing group.
Subjects and Methods
The study was conducted in two phases. In the first phase, the Kannada phrases were selected and embedded in noise at different SNRs to obtain the 50% recognition threshold and to construct phrase lists that were equally intelligible in noise. In the second phase, the developed phrase test material was validated on a normal hearing group.
Phase-1: Identification of phrases in Kannada for assessing speech recognition threshold
Phase-1 had two experiments. In the first experiment, the phrases in the Kannada language were constructed, recorded, and mixed with noise at different SNRs. In the second experiment, the phrases that were equally intelligible in noise were selected and the lists were composed.
Seventy phrases were collected from upper primary school children's textbooks (5th–7th grade), magazines, day-to-day conversation, and the internet. The selected phrases had to fulfill the criteria specified by Versfeld et al. Each phrase consisted of two words, and each word had 3–4 syllables. The phrases were checked for correct syntactic and semantic structure, and their accuracy was further verified with the help of a linguist.
The selected phrases comprised noun phrases, verb phrases, adverb phrases, adjective phrases, and preposition phrases (PP). The average duration of the phrases was 1.5 s. These 70 phrases were given to ten native speakers of the Kannada language for a familiarity check on a three-point rating scale (3 = highly familiar; 2 = familiar; and 1 = not familiar). Only those phrases that were rated highly familiar were selected.
Three of the 70 phrases were not selected as they failed to meet the familiarity criterion. The remaining 67 phrases were recorded from three trained female singers who were native speakers of Kannada. A recording microphone (Ahuja, AUD-101XLR) with a frequency response from 20 Hz to 20 kHz was placed 10 cm away from the speaker's mouth. Each speaker was instructed to articulate the phrases clearly with normal vocal effort and to maintain a natural intonation pattern. Adobe Audition (version 3; Syntrillium Software company, Phoenix, Arizona, United States) was used to record the phrases at a sampling rate of 44.1 kHz with 24-bit resolution. All 67 phrases were saved as .wav files. These phrases were adjusted to an average root mean square (RMS) level of −20 dB (re maximum digital output) with maximum peak levels of approximately −5 dB. The resulting 201 amplitude-normalized phrases (67 phrases × 3 speakers) were presented to ten normal hearing adults at their comfortable level for a goodness test. Only those phrases that were rated “4” or “5” on a five-point rating scale of naturalness were selected. The five-point rating scale of naturalness was as follows.
1. Totally unnatural and not encountered at all
2. Somewhat unnatural, it is unlikely that one such sentence is encountered
3. Sentence is unusual, but you may have heard
4. Natural, but less frequently encountered in everyday conversation
5. Natural and frequently encountered in everyday conversation.
Generation of noise
A speech-shaped noise was generated to match the long-term average spectrum of the phrase material. This was done to obtain an accurate slope of the psychometric function and a highly reliable speech recognition score. Phrases were randomly selected and concatenated. A fast Fourier transform (FFT) was applied to the concatenated phrases. The phase of the FFT was randomized, and the result was converted back to a wave file by the inverse FFT. The noise generated had only little amplitude variation and a frequency spectrum that corresponded with the long-term average spectrum of the phrases. The RMS level of the noise was matched to that of the phrases. [Figure 1] depicts the long-term average speech spectrum (LTASS) of the phrases and of the phrase spectrum-shaped noise.
Figure 1: Long-term average speech spectrum of the phrases and of the phrase spectrum-shaped noise
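The noise generation steps described above (FFT of the concatenated phrases, phase randomization, inverse FFT, RMS matching) can be sketched as follows. This is a minimal illustration in Python with NumPy, not the code used in the study; the function names `rms` and `speech_shaped_noise` and the use of a real-valued FFT are assumptions made for the example.

```python
import numpy as np


def rms(x):
    """Root mean square level of a signal."""
    return float(np.sqrt(np.mean(np.square(x))))


def speech_shaped_noise(speech, rng=None):
    """Noise with the magnitude spectrum of `speech` and randomized phase.

    `speech` stands for the concatenated phrases as a 1-D array
    (hypothetical input; the study's own files are not reproduced here).
    """
    rng = np.random.default_rng() if rng is None else rng
    speech = np.asarray(speech, dtype=float)

    spectrum = np.fft.rfft(speech)
    magnitude = np.abs(spectrum)

    # Replace every phase with a value drawn uniformly from [0, 2*pi).
    phase = rng.uniform(0.0, 2.0 * np.pi, size=magnitude.shape)
    randomized = magnitude * np.exp(1j * phase)

    # The DC bin (and the Nyquist bin for even lengths) must be real
    # for the inverse transform to describe a real-valued signal.
    randomized[0] = magnitude[0]
    if speech.size % 2 == 0:
        randomized[-1] = magnitude[-1]

    noise = np.fft.irfft(randomized, n=speech.size)

    # Match the RMS level of the noise to that of the speech.
    return noise * (rms(speech) / rms(noise))
```

Because only the phase is changed, the magnitude spectrum of the noise equals that of the input, which is what gives the noise the same long-term average spectrum as the phrases.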
Mixing noise at various signal to noise ratios to the phrases
The 67 phrases were embedded in noise at each of 5 SNRs, i.e., from −9 to −1 dB in 2 dB steps. The following steps were used to mix the noise with the phrases at each SNR. Initially, the RMS level of each phrase was computed using MATLAB code, and the speech spectrum-shaped noise was digitally mixed with the phrase at −9 dB SNR. The noise onset preceded the onset of the phrase by 300 ms, and the noise continued for 300 ms after the end of the phrase. The noise was ramped using a cosine-squared function with a ramp duration of 100 ms. Starting the noise before the phrase is believed to guard against unintended onset effects. A similar procedure was used to embed the 67 phrases at each of the remaining target SNRs (−7 dB, −5 dB, −3 dB, and −1 dB). The SNR levels were chosen based on the findings of previous investigators.
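The mixing procedure described above can be illustrated with a short sketch. It is written in Python with NumPy rather than the MATLAB code used in the study, and it makes two assumptions that the text does not specify: the noise (not the phrase) is scaled to reach the target SNR, and the SNR is defined on the RMS of the noise before the ramps are applied. The function name `mix_at_snr` and its parameters are hypothetical.

```python
import numpy as np


def rms(x):
    """Root mean square level of a signal."""
    return float(np.sqrt(np.mean(np.square(x))))


def mix_at_snr(phrase, noise, snr_db, fs, pad_s=0.3, ramp_s=0.1):
    """Embed `phrase` in `noise` at `snr_db`, with noise lead/lag and ramps."""
    phrase = np.asarray(phrase, dtype=float)
    pad = int(round(pad_s * fs))    # 300 ms of noise before and after
    ramp = int(round(ramp_s * fs))  # 100 ms onset and offset ramps
    total = phrase.size + 2 * pad
    if len(noise) < total:
        raise ValueError("noise is shorter than phrase plus padding")

    segment = np.array(noise[:total], dtype=float)

    # Scale the noise so that 20*log10(rms(phrase) / rms(noise)) == snr_db.
    target_noise_rms = rms(phrase) / (10.0 ** (snr_db / 20.0))
    segment *= target_noise_rms / rms(segment)

    # Cosine-squared ramps: sin^2 rises from 0 to 1 (a time-reversed cos^2).
    if ramp > 0:
        fade_in = np.sin(np.linspace(0.0, np.pi / 2.0, ramp)) ** 2
        segment[:ramp] *= fade_in
        segment[-ramp:] *= fade_in[::-1]

    mixed = segment
    mixed[pad:pad + phrase.size] += phrase
    return mixed
```

Calling `mix_at_snr(phrase, noise, snr_db, fs)` once for each of −9, −7, −5, −3, and −1 dB would produce the five versions of a phrase; at negative SNRs the noise level exceeds the phrase level.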
A pilot study was conducted to obtain the SNR at which 50% of the phrases were recognized. This was done to minimize ceiling and floor effects. Ten listeners in the age range of 19 to 26 years (mean age of 22 years) were selected. All participants had normal hearing sensitivity, with air conduction thresholds <15 dB at octave frequencies from 0.25 kHz to 8 kHz, and normal middle ear status on immittance evaluation as indicated by a type “A” tympanogram. A total of 335 prerecorded stimuli (67 phrases × 5 SNRs) were randomized and presented binaurally through TDH-39 headphones at the participant's most comfortable level. A score of “one” was assigned for correct recognition of the whole phrase. A score of “zero” was given if the participant failed to recognize the phrase or identified only one of the words in the phrase correctly. The total number of correctly recognized phrases at each SNR was calculated and converted into a percentage. The results revealed that 50% recognition was obtained at approximately −5 dB SNR.
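The scoring rule and the estimation of the 50% point can be sketched as below. The all-or-nothing phrase score follows the description above; reading the 50% point off the psychometric function by linear interpolation between adjacent SNRs is an assumption made for this example, since the text does not state how the value was derived. All function names are hypothetical.

```python
def phrase_score(target_words, response_words):
    """1 if the whole phrase is repeated correctly, otherwise 0."""
    return 1 if list(response_words) == list(target_words) else 0


def percent_correct(scores):
    """Percentage of phrases scored 1 at a given SNR."""
    return 100.0 * sum(scores) / len(scores)


def snr_at_target(snrs, percents, target=50.0):
    """SNR at which the psychometric function crosses `target` percent.

    Uses linear interpolation between neighbouring SNRs (an assumption).
    Returns None if the measured points never bracket the target.
    """
    points = sorted(zip(snrs, percents))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 != y1 and (y0 - target) * (y1 - target) <= 0:
            return x0 + (target - y0) * (x1 - x0) / (y1 - y0)
    return None
```

A response that contains only one of the two words scores 0 under `phrase_score`, matching the rule that partial recognition earns no credit.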
Composition of lists
At each SNR, the phrases whose recognition scores lay more than 1 standard deviation (SD) above or below the mean recognition score were eliminated. This was done to obtain uniform lists free of phrases that were too easy or too difficult. From the remaining 55 phrases, five lists of 10 phrases each were made [Appendix]. The remaining five phrases were used as familiarity items. In addition, it was confirmed that the phrase lists were phonemically balanced. An expert in speech analysis was asked to phonetically transcribe the selected 50 phrases. The frequency of occurrence of each phoneme was noted and then divided by the total number of lists to be compiled; that is, the frequency distribution of each phoneme was averaged across the test lists. This was done for all the lists to document the minimum and maximum values for each phoneme. It was ensured that the total number of occurrences of each phoneme was the same across the lists. Further, the average frequency of occurrence of each phoneme was made approximately equal to the average phoneme frequency distribution of the Kannada language. Another group of ten normal hearing adults in the age range of 21–30 years (mean age 22 years) took part in the study to test the lists for equivalency in terms of intelligibility. The five phrase lists were presented to each participant at their most comfortable level. The adaptive up-down procedure was used to obtain the 50% speech recognition threshold in noise (SNR 50) for the phrases in each list. [Additional file 1]
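The ±1 SD elimination step described above can be sketched as follows. The example assumes one recognition score per phrase and uses the sample standard deviation; whether the study pooled scores across SNRs or used the sample or population SD is not stated, so both choices are assumptions. The function name and the phrase labels in the usage are hypothetical.

```python
from statistics import mean, stdev


def keep_mid_difficulty(scores_by_phrase, n_sd=1.0):
    """Split phrases into kept, too easy, and too difficult.

    `scores_by_phrase` maps a phrase label to its recognition score (%).
    Phrases scoring more than `n_sd` SDs from the mean are eliminated.
    """
    values = list(scores_by_phrase.values())
    centre, spread = mean(values), stdev(values)
    lower, upper = centre - n_sd * spread, centre + n_sd * spread

    kept = {p: s for p, s in scores_by_phrase.items() if lower <= s <= upper}
    too_easy = [p for p, s in scores_by_phrase.items() if s > upper]
    too_difficult = [p for p, s in scores_by_phrase.items() if s < lower]
    return kept, too_easy, too_difficult
```

With illustrative scores such as `{"p1": 10, "p2": 50, "p3": 50, "p4": 50, "p5": 90}`, the function keeps `p2` to `p4` and flags `p5` as too easy and `p1` as too difficult.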
APEX presentation software (ExpORL, Department Neurosciences, KU Leuven, Belgium), loaded on a personal computer, was used to present the phrases. The output of the computer was connected to the auxiliary input of the audiometer, and the output of the audiometer was delivered binaurally through TDH-39 headphones at the participant's most comfortable loudness level. An adaptive procedure (1 down 1 up) was used to obtain SNR 50. The first phrase was presented at −7 dB SNR (below 50% correct identification); if the participant repeated it correctly, the next phrase was presented at the same SNR. Thereafter, if a phrase was repeated correctly, the following phrase was presented at a lower SNR (phrase level reduced by 2 dB with the noise level held constant), and if it was repeated incorrectly, the next phrase was presented at a higher SNR (phrase level increased by 2 dB). After 10 phrases had been presented, the software calculated the SNR 50 as the average of the last three reversals. This was done for each phrase list.
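The adaptive track can be illustrated with a simplified sketch. It implements a plain 1-down 1-up rule with a 2 dB step and averages the last three reversals; it does not reproduce the APEX software, and it does not model the detail that the second phrase is presented at the starting SNR after a correct first response. Treating a reversal as the SNR of the trial on which the response direction changed is also an assumption. The function name is hypothetical.

```python
def run_staircase(responses, start_snr=-7.0, step=2.0, n_reversals=3):
    """Simple 1-down 1-up track over a sequence of phrase responses.

    `responses` is an iterable of booleans (True = phrase repeated correctly).
    Returns (snr50, track, reversals); snr50 is None if fewer than
    `n_reversals` reversals occurred.
    """
    snr = start_snr
    track, reversals = [], []
    last_direction = 0

    for correct in responses:
        track.append(snr)
        # Correct response: make the task harder (lower SNR); else easier.
        direction = -1 if correct else 1
        if last_direction and direction != last_direction:
            reversals.append(snr)
        last_direction = direction
        snr += direction * step

    if len(reversals) < n_reversals:
        return None, track, reversals
    return sum(reversals[-n_reversals:]) / n_reversals, track, reversals
```

A 1-down 1-up rule converges on the 50% point of the psychometric function, which is why it suits the estimation of SNR 50.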
Phase-2: To validate the developed Kannada phrase in noise test on normal hearing group
Phase-2 aimed at validating the phrase test developed in Phase-1 on a group of 10 normal hearing participants. Phrase recognition scores for each list at −9 dB and −1 dB SNR were obtained. Further, the SNR required for 50% correct phrase recognition in noise was estimated using the adaptive procedure. A new set of ten participants with normal hearing, in the age range of 19 to 30 years (mean age 22 years), took part in Phase-2. The five phrase lists at each SNR, i.e., −9 dB (floor effect) and −1 dB (ceiling effect), were presented in random order binaurally through TDH-39 headphones at the individual's most comfortable level (MCL). Each participant was asked to repeat the phrase heard. A score of “one” was assigned for each correctly recognized phrase and a score of “zero” for each incorrectly recognized phrase. In addition, the 50% speech recognition threshold in noise (SNR 50) for the phrases in each list was measured for the same participants, using the procedure described earlier.
Results
Phase-1: Equally intelligible Kannada phrase lists at different signal to noise ratios
The recognition scores obtained for the 67 phrases were plotted against SNR. In the psychometric function, the 50% recognition score was obtained at approximately −5 dB SNR. In addition, those phrases whose scores lay more than 1 SD above or below the mean recognition score were eliminated to avoid ceiling and floor effects. Seven phrases lay above the upper limit and five phrases lay below the lower limit. These 12 phrases were eliminated from the total of 67 phrases. From the remaining 55 phrases, five lists of 10 phrases each were made. The leftover five phrases were used as familiarity items.
Further, to check whether the lists were equally intelligible in noise, the SNR 50 was obtained for each list. The mean SNR 50 and SD for each list, and the average SNR 50, are provided in [Table 1]. From [Figure 2], it was found that for each list, the mean recognition score was lower at lower SNRs than at higher SNRs. The SNR 50 across lists varied from −5.36 dB to −5.64 dB, with an SD ranging from 0.37 dB to 0.50 dB. The average SNR 50 for the five lists was −5.45 dB, with an SD of 0.45 dB. In addition, the mean SNR 50 of each list was subtracted from the average SNR 50 to show how far each list deviated from the average (fourth column of [Table 1]). Overall, this deviation varied across lists from −0.013 dB to 0.19 dB, with an SD of 0.45 dB. To assess whether the mean difference across the lists reached significance, a repeated measures analysis of variance (r-ANOVA) was performed. The result revealed that there was no significant difference (F(4, 36) = 0.208, P = 0.509) across lists.
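For readers who wish to see how the F ratio of a one-way repeated measures ANOVA is formed, the sketch below computes it from a participants × lists table of SNR 50 values. It is a generic textbook computation, not the statistical software used in the study, and it does not compute the P value (which requires the F distribution). The function name and any data passed to it are hypothetical.

```python
def rm_anova_f(data):
    """F ratio for a one-way repeated measures ANOVA.

    `data` is a list of rows, one per participant, each holding one value
    per condition (here, one SNR 50 per phrase list).
    Returns (F, df_conditions, df_error).
    """
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)

    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]

    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    # Error term: variation left after removing list and participant effects.
    ss_error = ss_total - ss_cond - ss_subj

    df_cond, df_error = k - 1, (n - 1) * (k - 1)
    f_ratio = (ss_cond / df_cond) / (ss_error / df_error)
    return f_ratio, df_cond, df_error
```

With ten participants and five lists, the degrees of freedom are (4, 36), which matches the values reported above.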
Phase-2: Validation of phrase recognition test in noise in Kannada language on normal hearing participants
Phase-2 was carried out to validate the Kannada phrase lists at −9 dB SNR (floor effect in the psychometric function) and −1 dB SNR (ceiling effect in the psychometric function) on ten normal hearing participants. In addition, SNR 50 was obtained from the same participants. The mean recognition scores at −9 dB SNR and −1 dB SNR, and the SNR 50, for each list are given in [Table 2].
Table 2: Recognition scores across phrase lists at −9 dB SNR, −1 dB SNR, and SNR 50
To assess whether the mean differences across the lists at each SNR (i.e., −9 dB SNR and −1 dB SNR) and in SNR 50 reached significance, separate repeated measures ANOVAs were performed. The results revealed that there was no significant difference across the lists at −9 dB SNR (F(4, 36) = 0.62, P = 0.651), at −1 dB SNR (F(4, 36) = 6.92, P = 0.120), or in SNR 50 (F(4, 36) = 0.261, P = 0.489).
Discussion
The aim of the present study was to develop and validate a phrase recognition test in noise in the Kannada language using an adaptive procedure. The phrases selected in the first phase were tested for recognition at five different SNRs. The phrase recognition scores at each SNR were used to construct a psychometric function, from which the SNR 50 was derived. To obtain equal intelligibility among the selected phrases, the psychometric function and SNR 50 were used to exclude the phrases that lay more than 1 SD above or below the mean. Further, optimization measures were applied in selecting the phrases: the phrases were framed to be syntactically and semantically correct so as to preserve naturalness, and their grammatical aspects were verified by a linguist. In addition, the most familiar phrases were selected, and the phrase materials were designed with simple grammatical features to minimize the involvement of listeners' cognitive and linguistic abilities in phrase recognition. All the phrases had two words of 3–4 syllables each, so that the phrases were nearly equal in length.
In adaptive speech intelligibility tests such as the hearing in noise test, the threshold of intelligibility is expressed in terms of SNR, and the masking effect of the noise is determined by comparing the RMS level of the noise with the RMS level of the speech. Unfortunately, the masking noise is likely to differ from the speech in its spectral and temporal properties, so there is a chance of error in the specified SNR. That is, as the SNR varies, the masking effect of a noise depends on the relationship of its spectrum to that of the speaker's voice. A solution is to prepare the noise by matching the spectrum of the masker to the long-term average spectrum of the speaker's voice in the particular language. Thus, in the present study, the noise was derived from the concatenated phrases using the inverse FFT. The LTASS of the derived noise was matched with that of the phrase spectrum [Figure 1]. This was done so that the spectral and temporal energetic properties of the phrase spectrum-shaped noise would support equal intelligibility across lists. When assessing intelligibility across lists, it is easier to obtain equal intelligibility for the complete phrase than for the individual words in it. Thus, scoring was carried out for the complete phrase rather than for each word. Using this method, the 50% phrase recognition score was obtained at −5 dB SNR. As expected, the phrase recognition score was low at low SNRs and high at high SNRs [Figure 2].
Further, to establish equal intelligibility of the phrase lists in noise, the SNR 50 was obtained for each phrase list and then compared across lists. The results revealed no significant difference (P = 0.509) between lists in the SNR 50, which indicates that the phrase lists are equally intelligible. Overall, the deviation of each list's SNR 50 from the average varied from −0.013 dB to 0.19 dB, with an SD of 0.45 dB. From these data, it can be concluded that the SD across lists, measured in this way, amounted to a difference of <0.5 dB. This low variability is partly due to the exclusion of the phrases that lay more than 1 SD above or below the mean. It implies that the equalization procedure used in the present study made the phrases homogeneous. In addition, the phonemic transcription of the phrases was analyzed, and the frequency distribution of each phoneme in each list was matched with the overall frequency of occurrence of that phoneme in the Kannada language. A trial-and-error method was used to exchange phrases between lists in an effort to match the distribution of each phoneme to the overall distribution as closely as possible. In this way, the phrases in each list were phonemically balanced. Because variability decreases with the number of phrase lists used, the use of at least two lists is recommended to obtain a highly accurate SNR 50.
Phase-2 of the present study aimed at validating the developed phrase lists at −9 dB SNR and −1 dB SNR and in terms of SNR 50. The average SNR 50 across the five lists was −5.29 dB, which is in agreement with the SNR 50 obtained from the psychometric function in Phase-1. In the present study, the average SD across all the lists was ±0.26 dB, which suggests that the phrase lists are stable. The SD of SNR 50 found in the present study shows that the phrase lists are homogeneous and equally intelligible in the presence of noise. Further research on the validation of the developed Kannada phrase lists in clinical populations is warranted.
Conclusions
The SNR 50 was obtained at −5 dB SNR using the adaptive method. The material consists of five phrase lists, each containing ten phrases. These phrase lists are highly homogeneous. The results indicated that the developed phrase lists are valid and satisfy all the parameters of a good speech identification test. The phrase lists developed are useful for assessing a listener's recognition ability and for hearing aid fitting in clinics.
The authors would like to thank the Director and the HOD Audiology, All India Institute of Speech and Hearing, for permitting us to use the instruments to conduct this study. The authors would also like to thank all the participants of the study for their cooperation. Our sincere thanks go to the associate and assistant editors for scrutinizing the manuscript and shaping it into an easily readable format.
Financial support and sponsorship
Conflicts of interest
There are no conflicts of interest.
References
McArdle R, Hnath-Chisolm T. Speech audiometry. In: Katz J, Medwetsky LR, Burkard R, Hood L, editors. Handbook of Clinical Audiology. 6th ed. Baltimore, Maryland: Lippincott Williams and Wilkins; 2009. p. 64-79.
Gatehouse S, Robinson K. Speech tests as measures of auditory processing. In: Martin M, editor. Speech and Audiometry. London: Whurr; 1997. p. 74-88.
Cox RM, Alexander GC, Gilmore C. Development of the connected speech test (CST). Ear Hear 1987;8 5 Suppl: 119S-26S.
Nilsson M, Soli SD, Sullivan JA. Development of the Hearing in Noise Test for the measurement of speech reception thresholds in quiet and in noise. J Acoust Soc Am 1994;95:1085-99.
Geetha C, Sharath KS, Manjula P, Pawan P. Development and standardisation of the sentence identification test in the Kannada language. J Hear Sci 2014;4:18-26.
Chandini J, Narne VK, Singh NK, Kumar P. Development of sentence test for speech recognition in Hindi. AIISH Research Fund Project. 2014.
Osborne T, Putnam M, Gross T. Bare phrase structure, label-less structures, and specifier-less syntax: Is minimalism becoming a dependency grammar? Linguist Rev 2011;28:315-64.
Fitzgibbons PJ, Gordon-Salant S. Age effects in discrimination of intervals within rhythmic tone sequences. J Acoust Soc Am 2015;137:388-96.
Versfeld NJ, Daalder L, Festen JM, Houtgast T. Method for the selection of sentence materials for efficient measurement of the speech reception threshold. J Acoust Soc Am 2000;107:1671-84.
Winholtz WS, Titze IR. Conversion of a head-mounted microphone signal into calibrated SPL units. J Voice 1997;11:417-21.
Plomp R, Mimpen AM. Improving the reliability of testing the speech reception threshold for sentences. Audiology 1979;18:43-52.