Our goal is to automatically identify faces in TV content without a pre-defined dictionary of identities. Most methods are based on identity detection (from OCR and ASR) and require a propagation strategy based on visual clustering. In TV content, people appear with many variations, making the clustering very difficult. In this case, identifying speakers can provide a reliable link for identifying faces. In this work, we propose to combine reliable unsupervised face and speaker identification systems through talking-face detection in order to improve face identification results. First, OCR and ASR results are combined to extract identities locally. Then, reliable visual associations are used to propagate those identities locally. The reliably identified faces are used as unsupervised models to identify similar faces. Finally, speaker identities are propagated to faces when lip activity is detected. Experiments performed on the REPERE database show an improvement in recall of +5% compared to the baseline, without degrading precision.
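The final propagation step can be sketched as follows: a speaker identity is transferred to a temporally overlapping face track only when lip activity suggests the face is talking. The data structures, field names, and threshold below are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch: propagate speaker identities to co-occurring face tracks
# when lip activity indicates the face is talking. All names and the
# threshold value are illustrative assumptions.

def propagate_speaker_ids(face_tracks, speaker_turns, lip_threshold=0.5):
    """face_tracks: list of dicts {"start", "end", "lip_activity", "identity"}
    speaker_turns: list of dicts {"start", "end", "identity"}"""
    for face in face_tracks:
        if face["identity"] is not None:
            continue  # keep identities already assigned by OCR/ASR propagation
        if face["lip_activity"] < lip_threshold:
            continue  # no evidence this face is the current speaker
        # assign the identity of the speaker turn with the largest temporal overlap
        best, best_overlap = None, 0.0
        for turn in speaker_turns:
            overlap = min(face["end"], turn["end"]) - max(face["start"], turn["start"])
            if overlap > best_overlap:
                best, best_overlap = turn, overlap
        if best is not None:
            face["identity"] = best["identity"]
    return face_tracks
```

This keeps the step unsupervised: no face model is trained, and identities flow only along reliable audio-visual links.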
This paper is concerned with the speaker diarization task in the specific context of meeting room recordings. First, different technical improvements of an E-HMM based system are proposed and evaluated in the framework of the NIST RT'06S evaluation campaign. Related experiments show absolute gains in overall speaker diarization error rate (DER) of 6.4% and 12.9% on the development and evaluation corpora respectively. Second, this paper presents an original strategy to deal with overlapping speech. Indeed, speech overlaps between speakers are frequent in meetings due to the spontaneous nature of this kind of data, and they degrade the performance of the speaker diarization system if they are not dealt with. Experiments, also conducted in the framework of the NIST RT'06S evaluation, show the ability of the strategy to detect overlapping speech (a decrease of the missed speaker error rate), even if an overall gain in speaker diarization performance has not yet been achieved.
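For reference, the DER metric used above is the standard NIST scoring measure: the sum of missed speech, false-alarm speech, and speaker-confusion time, normalized by the total scored speech time. A minimal computation:

```python
def diarization_error_rate(missed, false_alarm, speaker_error, total_speech):
    """Standard NIST DER: missed speech + false-alarm speech +
    speaker-confusion time, as a fraction of total scored speech time
    (all arguments in seconds)."""
    return (missed + false_alarm + speaker_error) / total_speech
```

Missed speech is where overlapping segments hurt most: a system outputting one speaker at a time necessarily misses the second simultaneous speaker, which is why the overlap-detection strategy targets the missed speaker error rate.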
The detection and characterization, in audiovisual documents, of speech utterances where person names are pronounced is an important cue for spoken content analysis. This paper addresses the problem of retrieving spoken person names in the 1-Best ASR outputs of broadcast TV shows. Our assumption is that a person name is a latent variable produced by the lexical context it appears in. Thus, a spoken name can be derived from ASR outputs even if it has not been proposed by the speech recognition system. A new context model is proposed in order to capture the lexical and structural information surrounding a spoken name. The fundamental hypothesis of this study has been validated on broadcast TV documents available in the context of the REPERE challenge.
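The latent-variable intuition can be illustrated with a toy context scorer: positions in a transcript are scored by how often their surrounding words appear next to names in training data, so a name slot can be hypothesized even when the recognizer never emitted the name itself. This is a simplified illustration under assumed data structures, not the paper's actual model.

```python
# Hedged sketch: score transcript positions as likely name slots by the
# frequency of their surrounding words in name contexts. Simplified
# illustration; the paper's context model is richer than raw counts.
from collections import Counter

def train_context_counts(transcripts_with_names, window=2):
    """transcripts_with_names: list of (tokens, name_positions) pairs."""
    ctx = Counter()
    for tokens, positions in transcripts_with_names:
        for i in positions:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            ctx.update(left + right)
    return ctx

def name_slot_score(tokens, i, ctx, window=2):
    """Higher score = context around position i looks like a name context."""
    left = tokens[max(0, i - window):i]
    right = tokens[i + 1:i + 1 + window]
    return sum(ctx[w] for w in left + right)
```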
This paper presents a semantic confidence measure that aims to predict the relevance of automatic transcripts for a Spoken Document Retrieval (SDR) task. The proposed prediction method relies on the combination of an Automatic Speech Recognition (ASR) confidence measure and a Semantic Compacity Index (SCI), which estimates the relevance of words given the semantic context in which they occur. Experiments are conducted on the French broadcast news corpus ESTER, by simulating a classical SDR usage scenario: users submit text queries to a search engine that is expected to return the most relevant documents for the query. Results demonstrate the interest of using semantic-level information to predict transcription indexability.
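One simple way to combine the two signals described above is linear interpolation of the per-word ASR confidence and the SCI. The function name and interpolation weight are illustrative assumptions; the paper's actual combination scheme may differ.

```python
# Hedged sketch: fuse ASR confidence and the Semantic Compacity Index (SCI)
# into one relevance estimate. The weight alpha is an assumed free parameter
# that would be tuned on development data.

def combined_confidence(asr_conf, sci, alpha=0.5):
    """Linear interpolation of ASR confidence and SCI, both in [0, 1]."""
    return alpha * asr_conf + (1 - alpha) * sci
```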