only the acoustic content of speech to trying to exploit high-level information. It has been reported in several studies that gains in speaker recognition accuracy are possible by exploiting such high-level information sources (see, e.g., [17]). The most examined high-level information sources for speaker verification are prosody [19, 1], phonetic information [15, 14, 13, 11], and idiolectal word and phone usage [7, 2, 5, 12]. All these approaches reported encouraging results and were found to provide features complementary to short-term acoustic features. However, most of them are based on phonetic transcriptions that are error-prone and expensive to create. Besides this, the transcribed databases also need to be updated with new data sets in order to match potentially new specifications (channel, microphones, context of use, ...) of the verification data. An alternative approach that solves these two problems is to use data-driven phone-like units derived directly from untranscribed speech. This way the availability of corpora is much less of an issue, and the training corpus can be chosen to match the working conditions as closely as possible.

Recently, various studies have shown that high-level features, such as linguistic content, pronunciation and idiolectal word usage, convey more speaker information and can be added to the low-level features in order to increase the robustness of the system. Usually these features are extracted by analyzing streams produced by phonetic speech recognition systems. Two of the major problems that arise when phone-based systems are being developed are the possible mismatches between the development and evaluation data and the lack of transcribed databases. We propose in this paper to replace the phone-based approaches by data-driven segmentation methodologies. Our data-driven high-level systems do not use transcribed data and can easily be applied to development data, minimizing the mismatches. These systems were fused with a state-of-the-art acoustic Gaussian Mixture Model (GMM) system. Results obtained on the NIST 2006 Speaker Recognition Evaluation data show that the data-driven features provide complementary information and that the resulting fused system reduced the error rate in comparison to the GMM baseline system.

This paper is the continuation of previous attempts to model high-level information using data-driven approaches [9, 8]. The focus here is on the fusion of different systems that exploit data-driven high-level source of in-