Automatic speech processing devices have become popular for quantifying amounts of ambient language input to children in their home environments. We assessed error rates for language input estimates for the Language ENvironment Analysis (LENA) audio processing system, asking whether error rates differed as a function of adult talkers' gender and whether they were speaking to children or adults. Audio was sampled from within LENA recordings from 23 families with children aged 4–34 months. Human coders identified vocalizations by adults and children, counted intelligible words, and determined whether adults' speech was addressed to children or adults. LENA's classification accuracy was assessed by parceling audio into 100-ms frames and comparing, for each frame, human and LENA classifications. LENA correctly classified adult speech 67% of the time across families (average false negative rate: 33%). LENA's adult word count showed a mean +47% error relative to human counts. Classification and Adult Word Count error rates were significantly affected by talkers' gender and whether speech was addressed to a child or an adult. The largest systematic errors occurred when adult females addressed children. Results show LENA's classifications and Adult Word Count entailed random, and sometimes large, errors across recordings, as well as systematic errors as a function of talker gender and addressee. Due to systematic and sometimes high error in estimates of amount of adult language input, relying on this metric alone may lead to invalid clinical and/or research conclusions. Further validation studies and circumspect usage of LENA are warranted.
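The frame-based comparison described above can be illustrated with a minimal Python sketch. It is not the study's actual pipeline: the function names, the rule that a frame counts as adult speech if any labeled segment overlaps it, and the example segments and word counts are all assumptions made up for illustration.

```python
def to_frames(segments, total_ms, frame_ms=100):
    """Mark each frame True if any labeled segment overlaps it (assumed rule)."""
    n_frames = -(-total_ms // frame_ms)  # ceiling division
    frames = [False] * n_frames
    for start_ms, end_ms in segments:
        first = start_ms // frame_ms
        last = (end_ms - 1) // frame_ms
        for i in range(first, min(last, n_frames - 1) + 1):
            frames[i] = True
    return frames


def false_negative_rate(human, machine):
    """Share of human-labeled adult-speech frames that the machine missed."""
    machine_on_positives = [m for h, m in zip(human, machine) if h]
    if not machine_on_positives:
        raise ValueError("no adult-speech frames in the human coding")
    missed = sum(1 for m in machine_on_positives if not m)
    return missed / len(machine_on_positives)


def percent_error(machine_count, human_count):
    """Signed word-count error of the machine relative to the human count."""
    return 100.0 * (machine_count - human_count) / human_count


# Hypothetical adult-speech segments (start_ms, end_ms) in a 1-second clip.
human = to_frames([(0, 300), (500, 800)], total_ms=1000)
machine = to_frames([(0, 200), (500, 800)], total_ms=1000)

fnr = false_negative_rate(human, machine)
awc_error = percent_error(machine_count=147, human_count=100)
```

In this made-up clip the machine misses one of six human-labeled frames, and a machine count of 147 words against a human count of 100 corresponds to a +47% error.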
Purpose
Differences across language environments of prelingually deaf children who receive cochlear implants (CIs) may affect language acquisition; yet, whether mothers show individual differences in how they modify infant-directed (ID) compared with adult-directed (AD) speech has seldom been studied. This study assessed individual differences in how mothers realized speech modifications in ID register and whether these predicted differences in language outcomes for children with CIs.
Method
Participants were 36 dyads of mothers and their children aged 0;8–2;5 (years;months) at the time of CI implantation. Mothers' spontaneous speech was recorded in a lab setting in ID or AD conditions before ~15 months postimplantation. Mothers' speech samples were characterized for acoustic–phonetic and lexical properties established as canonical indices of ID speech to typically hearing infants, such as vowel space area differences, fundamental frequency variability, and speech rate. Children with CIs completed longitudinal administrations of one or more standardized language assessment instruments at variable intervals from 6 months to 9.5 years postimplantation. Standardized scores on assessments administered longitudinally were used to calculate linear regressions, which gave rise to predicted language scores for children at 2 years postimplantation and language growth over 2-year intervals.
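As a rough illustration of the regression step, the sketch below fits an ordinary least-squares line to one child's standardized scores over time and reads off a predicted score at 2 years postimplantation and the growth over a 2-year interval. Fitting one line per child, the function name `fit_line`, and the score values are assumptions for illustration only; the study's exact regression setup is not specified here.

```python
def fit_line(times, scores):
    """Ordinary least-squares fit of score = intercept + slope * time."""
    n = len(times)
    if n < 2:
        raise ValueError("need at least two assessments to fit a line")
    mean_t = sum(times) / n
    mean_s = sum(scores) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("assessments must occur at different times")
    sxy = sum((t - mean_t) * (s - mean_s) for t, s in zip(times, scores))
    slope = sxy / sxx
    intercept = mean_s - slope * mean_t
    return slope, intercept


# Made-up assessments for one child: years postimplantation, standard score.
years_post_ci = [0.5, 1.5, 3.0, 4.0]
standard_scores = [70.0, 74.0, 80.0, 84.0]

slope, intercept = fit_line(years_post_ci, standard_scores)
predicted_at_2_years = intercept + slope * 2.0
growth_over_2_years = slope * 2.0
```

With a single straight line, growth over any 2-year interval is simply twice the slope, which is why the two outcome measures come from the same fit in this sketch.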
Results
Mothers showed individual differences in how they modified speech in ID versus AD registers. Crucially, these individual differences significantly predicted differences in estimated language outcomes at 2 years postimplantation in children with CIs. Maternal speech variation in lexical quantity and vowel space area differences across ID and AD registers most frequently predicted estimates of language attainment in children with CIs, whereas prosodic differences played a minor role.
Conclusion
Results support that caregiver language behaviors play a substantial role in explaining variability in language attainment in children receiving CIs.
Supplemental Material
https://doi.org/10.23641/asha.12560147
Calibration and higher order statistics (HOS) are standard components of image steganalysis. However, these techniques have not yet received adequate attention in audio steganalysis. Specifically, most current studies are either non-calibrated or based only on noise removal. The goal of this paper is to fill these gaps and to show that calibrated features based on a re-embedding technique improve the performance of audio steganalysis. Furthermore, we show that the least significant bit (LSB) plane is the bit-plane most sensitive to data hiding algorithms and can therefore be employed as a universal embedding method. The proposed features also benefit from an efficient model that is tailored to the needs of audio steganalysis and represents the maximum deviation from the human auditory system (HAS). Performance of the proposed method is evaluated on a wide range of data hiding algorithms in both targeted and universal paradigms. The results show the effectiveness of the proposed method in detecting the finest traces of data hiding algorithms at very low embedding rates. The system detects steghide at a capacity of 0.06 bits per symbol (BPS) with a sensitivity of 98.6% (music) and 78.5% (speech). These figures are respectively 7.1% and 27.5% higher than the state-of-the-art results based on RMFCC features.
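The general idea of calibration by re-embedding can be sketched as follows: compute a feature on the signal under inspection, embed a random message into its LSBs, compute the feature again, and use the difference. A clean cover should change much more than a signal that already carries LSB-embedded data. The toy feature below (imbalance between counts of sample values 2k and 2k+1) and the artificial all-even "cover" are assumptions chosen to keep the example short; they are not the paper's HAS-based features.

```python
import random


def lsb_embed(samples, rate, rng):
    """LSB replacement: overwrite the LSB of a random share of samples."""
    out = list(samples)
    for i in range(len(out)):
        if rng.random() < rate:
            out[i] = (out[i] & ~1) | rng.getrandbits(1)
    return out


def pair_imbalance(samples):
    """Toy feature: normalized imbalance between counts of 2k and 2k+1."""
    counts = {}
    for s in samples:
        counts[s] = counts.get(s, 0) + 1
    pairs = {s >> 1 for s in counts}
    diff = sum(abs(counts.get(2 * k, 0) - counts.get(2 * k + 1, 0)) for k in pairs)
    return diff / len(samples)


def calibrated_feature(samples, rng, rate=1.0):
    """Feature before re-embedding minus feature after re-embedding."""
    return pair_imbalance(samples) - pair_imbalance(lsb_embed(samples, rate, rng))


rng = random.Random(0)
# Artificial cover with only even sample values (an extreme, made-up case).
cover = [2 * rng.randint(-50, 50) for _ in range(5000)]
stego = lsb_embed(cover, 1.0, rng)

cover_score = calibrated_feature(cover, rng)
stego_score = calibrated_feature(stego, rng)
```

In this toy setting `cover_score` comes out large while `stego_score` stays near zero, which is the separation a calibrated feature is meant to provide; real audio covers are far less extreme, so a practical detector would need stronger features and a trained classifier.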
... the intended message (∈ ℳ) inside a host signal, called the cover. Steganography methods can be classified into categories of text, audio, image, video, and network traffic, depending on the type of cover signal.
Steganalysis is the countermeasure to steganography and aims to detect the presence of hidden messages. Likewise, steganalysis methods may be classified according to the type of cover into categories of text, audio, image, video, and network traffic. Steganalysis in each of these categories can be further divided into targeted and universal methods. In the former, the embedding algorithm is known, whereas the latter makes no prior assumption about the embedding algorithm [6].
One of the first audio steganalysis methods was proposed in [7], where the cover signal was estimated by de-noising the signal under inspection. Audio quality metrics (AQMs) were used to quantify the discrepancies between the original signal and its estimated cover [7]. The Hausdorff distance was proposed as a solution to the inefficiency of AQMs in detecting traces of hidden data [8]. In [9], the negative effect of high correlation between the features extracted by these denoising methods and their signals was addressed.
All of these previous works are similar in that they use indirect methods for comparing stegos with their estimated covers. However, conducting this comparison on the distributions of stegos and covers is more appropriate. This approach was pursued in [10], where it was shown that the degree of histogram flatness derived from wavelet coefficients of stegos and their cover counterparts is a discriminative criterion. Gaussian mixture model (GMM) and generalized Gaussian distribution (GGD) were used to capture this...