Tomoko Matsui scite author profile

A Kernel for Time Series Based on Global Alignments

We propose in this paper a new family of kernels to handle time series, notably speech data, within the framework of kernel methods, which includes popular algorithms such as the Support Vector Machine. These kernels elaborate on the well-known Dynamic Time Warping (DTW) family of distances by considering the same set of elementary operations, namely substitutions and repetitions of tokens, to map one sequence onto another. Associating a given score with each of these operations, DTW algorithms use dynamic programming techniques to compute an optimal sequence of operations with a high overall score. In this paper we consider instead the scores spanned by all possible alignments, take a smoothed version of their maximum, and derive a kernel from this formulation. We prove that this kernel is positive definite under favorable conditions and show how it can be tuned effectively for practical applications, reporting encouraging results on a speech recognition task.
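
The abstract above describes replacing the single best DTW alignment with a smoothed maximum over all alignments. A minimal sketch of that idea, not the authors' implementation, sums the product of local similarities over every alignment with a dynamic program. The Gaussian local similarity, the parameter `sigma`, and the function names are assumptions made for illustration.

```python
import math


def local_kernel(a, b, sigma=1.0):
    """Gaussian similarity between two frames (an assumed local score)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def global_alignment_kernel(x, y, sigma=1.0):
    """Sum, over all alignments of x and y, of the product of local similarities.

    x and y are sequences of equal-dimension frames (lists of floats).
    An alignment moves through the two sequences by advancing in x, in y,
    or in both, which mirrors the substitution and repetition operations of DTW.
    """
    n, m = len(x), len(y)
    # table[i][j] holds the sum over all alignments of x[:i] with y[:j].
    table = [[0.0] * (m + 1) for _ in range(n + 1)]
    table[0][0] = 1.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            k = local_kernel(x[i - 1], y[j - 1], sigma)
            table[i][j] = k * (
                table[i - 1][j - 1] + table[i - 1][j] + table[i][j - 1]
            )
    return table[n][m]


if __name__ == "__main__":
    a = [[0.0], [1.0], [2.0]]
    b = [[0.0], [2.0]]
    print(global_alignment_kernel(a, b))
```

When the local similarity is the exponential of a local score, the logarithm of the returned value is a log-sum-exp of the alignment scores, which is one way to read "a smoothed version of their maximum". For long sequences the products can underflow, so a practical version would likely work in the log domain.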

Absence of spontaneous action anticipation by false belief attribution in children with autism spectrum disorder

Senju

et al. 2010

Contact: lib-eprints@bbk.ac.uk

Senju, A.; Southgate, V.; Miura, Y.; Matsui, T.; Hasegawa, T.; Tojo, Y.; Osanai, H.; Csibra, G. (2010). Absence of spontaneous action anticipation by false belief attribution in children with autism spectrum disorder. Development and Psychopathology 22 (2), pp. 353-360.

Running title: Children with ASD fail in a non-verbal false belief task

Abstract: Recently, a series of studies demonstrated false belief understanding in young children through completely non-verbal measures. These studies have revealed that children younger than 3 years of age, who consistently fail the standard verbal false belief test, can anticipate others' actions based on their attributed false beliefs. The current study examined whether children with autism spectrum disorder (ASD), who are known to have difficulties in the verbal false belief test, may also show such action anticipation in a non-verbal false belief test. We presented video stimuli of an actor watching an object being hidden in a box. The object was then displaced while the actor was looking away. We recorded children's eye movements and coded whether they spontaneously anticipated the actor's subsequent behaviour, which could only have been predicted if they had attributed a false belief to her. Although typically developing children correctly anticipated the action, children with ASD failed to show such action anticipation. The results suggest that children with ASD have an impairment in false belief attribution, which is independent of their verbal ability.

Keywords: Autism Spectrum Disorder, Theory of Mind, Anticipation, Eye Tracking

Acknowledgments: We would like to thank all the participants and parents who supported our study. We acknowledge Coralie Chevallier and John Swettenham for their inputs.

On the role of language in children's early understanding of others as epistemic beings

Matsui

Yamamoto

McCagg

2006

Cognitive Development

Comparison of text-independent speaker recognition methods using VQ-distortion and discrete/continuous HMMs

Matsui

Furui

1992

118

Concatenated phoneme models for text-variable speaker recognition

Matsui

Furui

1993

121

Music Genre and Emotion Recognition Using Gaussian Processes

Markov

Matsui

2014

IEEE Access

Gaussian Processes (GPs) are Bayesian nonparametric models that are becoming more and more popular for their superior capabilities to capture highly nonlinear data relationships in various tasks, such as dimensionality reduction, time series analysis, novelty detection, as well as classical regression and classification tasks. In this paper, we investigate the feasibility and applicability of GP models for music genre classification and music emotion estimation. These are two of the main tasks in the music information retrieval (MIR) field. So far, the support vector machine (SVM) has been the dominant model used in MIR systems. Like SVM, GP models are based on kernel functions and Gram matrices; but, in contrast, they produce truly probabilistic outputs with an explicit degree of prediction uncertainty. In addition, there exist algorithms for GP hyperparameter learning, something the SVM framework lacks. In this paper, we built two systems, one for music genre classification and another for music emotion estimation using both SVM and GP models, and compared their performances on two databases of similar size. In all cases, the music audio signal was processed in the same way, and the effects of different feature extraction methods and their various combinations were also investigated. The evaluation experiments clearly showed that in both music genre classification and music emotion estimation tasks the GP performed consistently better than the SVM. The GP achieved a 13.6% relative genre classification error reduction and up to an 11% absolute increase of the coefficient of determination in the emotion estimation task.

Index terms: Music genre classification, music emotion estimation, Gaussian processes.

Estimation of Spatially Correlated Random Fields in Heterogeneous Wireless Sensor Networks

Nevat

Peters

Septier

et al. 2015

IEEE Trans. Signal Process.

We develop new algorithms for spatial field reconstruction, exceedance level estimation and classification in heterogeneous (mixed analog & digital sensors) Wireless Sensor Networks (WSNs). We consider spatial physical phenomena which are observed by a heterogeneous WSN, meaning that it consists partially of sparsely deployed high-quality sensors and partially of low-quality sensors. The high-quality sensors transmit their (continuous) noisy observations to the Fusion Centre (FC), while the low-quality sensors first perform a simple thresholding operation and then transmit their binary values over imperfect wireless channels to the FC. The resulting observations are mixed continuous and discrete (1-bit decisions) observations, and are combined in the FC to solve the inference problems. We first formulate the problem of spatial field reconstruction, exceedance level estimation and classification in such heterogeneous networks. We show that the resulting posterior predictive distribution, which is key in fusing such disparate observations, involves intractable integrals. To overcome this problem, we develop an algorithm that is based on a multivariate series expansion approach resulting in a Saddle-point type approximation. We then present a comprehensive study of the performance gain that can be obtained by augmenting the high-quality sensors with low-quality sensors, using real data from an insurance storm surge database known as the Extreme Wind Storms Catalogue.
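
The two sensor types described in the abstract above can be illustrated with a small simulation of what each one sends to the Fusion Centre. This is only a sketch under stated assumptions: additive Gaussian noise and a channel that flips each bit with a fixed probability are modelling choices made here, and the function and parameter names are hypothetical rather than taken from the paper.

```python
import random


def high_quality_reading(field_value, noise_std, rng):
    """Continuous noisy observation sent as-is (Gaussian noise is assumed)."""
    return field_value + rng.gauss(0.0, noise_std)


def low_quality_reading(field_value, noise_std, threshold, flip_prob, rng):
    """1-bit observation: threshold the noisy value, then pass it through
    an imperfect channel, modelled here (as an assumption) by a bit flip
    that occurs with probability flip_prob."""
    noisy = field_value + rng.gauss(0.0, noise_std)
    bit = 1 if noisy > threshold else 0
    if rng.random() < flip_prob:
        bit = 1 - bit
    return bit


if __name__ == "__main__":
    rng = random.Random(0)
    true_value = 2.5  # hypothetical field intensity at one location
    print(high_quality_reading(true_value, 0.3, rng))
    print(low_quality_reading(true_value, 0.8, 2.0, 0.05, rng))
```

The Fusion Centre then has to combine real-valued and binary observations of the same field, which is the mixed continuous and discrete inference problem the abstract refers to.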

Out-of-Domain Utterance Detection Using Classification Confidences of Multiple Topics

Lane

Kawahara

Matsui

et al. 2007

IEEE Trans. Audio Speech Lang. Process.

One significant problem for spoken language systems is how to cope with users' out-of-domain (OOD) utterances which cannot be handled by the back-end application system. In this paper, we propose a novel OOD detection framework, which makes use of the classification confidence scores of multiple topics and applies a linear discriminant model to perform in-domain verification. The verification model is trained using a combination of deleted interpolation of the in-domain data and minimum-classification-error training, and does not require actual OOD data during the training process, thus realizing high portability. When applied to the "phrasebook" system, a single utterance read-style speech task, the proposed approach achieves an absolute reduction in OOD detection errors of up to 8.1 points (40% relative) compared to a baseline method based on the maximum topic classification score. Furthermore, the proposed approach realizes comparable performance to an equivalent system trained on both in-domain and OOD data, while requiring no OOD data during training. We also apply this framework to the "machine-aided-dialogue" corpus, a spontaneous dialogue speech task, and extend the framework in two manners. First, we introduce topic clustering which enables reliable topic confidence scores to be generated even for indistinct utterances, and second, we implement methods to effectively incorporate dialogue context. Integration of these two methods into the proposed framework significantly improves OOD detection performance, achieving a further reduction in equal error rate (EER) of 7.9 points.
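
The verification step described in the abstract above, a linear discriminant applied to per-topic classification confidences, can be sketched next to the maximum-score baseline it is compared against. The weights, bias, and thresholds below are hypothetical placeholders; in the paper they come from deleted interpolation and minimum-classification-error training, which this sketch does not attempt to reproduce.

```python
def in_domain_score(confidences, weights, bias=0.0):
    """Linear combination of the per-topic classification confidences."""
    return sum(w * c for w, c in zip(weights, confidences)) + bias


def is_in_domain(confidences, weights, bias=0.0, threshold=0.0):
    """Accept the utterance as in-domain when the linear score clears the threshold."""
    return in_domain_score(confidences, weights, bias) >= threshold


def baseline_is_in_domain(confidences, threshold):
    """Baseline from the abstract: use only the maximum topic confidence."""
    return max(confidences) >= threshold


if __name__ == "__main__":
    # Hypothetical confidences for three topics and hand-picked weights.
    utterance = [0.9, 0.1, 0.05]
    weights = [1.0, 1.0, 1.0]
    print(is_in_domain(utterance, weights, bias=-0.5))
    print(baseline_is_in_domain(utterance, threshold=0.6))
```

The linear model can take evidence from every topic into account, whereas the baseline discards everything except the single highest confidence.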

A Kernel for Time Series Based on Global Alignments
