Jane W. Chang scite author profile

Jane W. Chang

5 Publications

36 Citation Statements Received

96 Citation Statements Given

Affiliations

University of Cincinnati, Massachusetts Institute of Technology

Publications

A probabilistic framework for feature-based speech recognition

Glass¹, Chang², McCandless³

Most current speech recognizers use an observation space that is based on a temporal sequence of "frames" (e.g., Mel-cepstra). There is another class of recognizer which further processes these frames to produce a segment-based network, and represents each segment by fixed-dimensional "features." In such feature-based recognizers the observation space takes the form of a temporal network of feature vectors, so that a single segmentation of an utterance will use a subset of all possible feature vectors. In this work we examine a maximum a posteriori decoding strategy for feature-based recognizers and develop a normalization criterion useful for a segment-based Viterbi or A* search. We report experimental results for the task of phonetic recognition on the TIMIT corpus, where we achieved context-independent and context-dependent (using diphones) results on the core test set of 64.1% and 69.5%, respectively.

A study of speech recognition system robustness to microphone variations: experiments in phonetic classification

Chang¹, Zue²

1994

Video Scene Detection Using Transformer Encoding Linker Network (TELNet)

Tseng, Yeh, et al. 2023

Sensors

This paper introduces a transformer encoding linker network (TELNet) for automatically identifying scene boundaries in videos without prior knowledge of their structure. Videos consist of sequences of semantically related shots or chapters, and recognizing scene boundaries is crucial for various video processing tasks, including video summarization. TELNet utilizes a rolling window to scan through video shots, encoding their features extracted from a fine-tuned 3D CNN model (transformer encoder). By establishing links between video shots based on these encoded features (linker), TELNet efficiently identifies scene boundaries where consecutive shots lack links. TELNet was trained on multiple video scene detection datasets and demonstrated results comparable to other state-of-the-art models in standard settings. Notably, in cross-dataset evaluations, TELNet demonstrated significantly improved results (F-score). Furthermore, TELNet’s computational complexity grows linearly with the number of shots, making it highly efficient in processing long videos.

Segmentation and modeling in segment-based recognition

Chang, Glass²

1997

A study of speech recognition system robustness to microphone variations

Chang¹, Zue²

1995
