Chen-Ming Pan scite author profile

Chen-Ming Pan

5 Publications

21 Citation Statements Received

54 Citation Statements Given

Affiliations

Chunghwa Telecom (Taiwan), National Taiwan University

Publications

Semantic-event based analysis and segmentation of wedding ceremony videos

Cheng, Chuang, Chen, et al. 2007

A wedding is one of the most important ceremonies in our lives. It symbolizes the birth and creation of a new family. In this paper, we present a system for automatically segmenting a wedding ceremony video into a sequence of recognized wedding events, e.g., the couple's wedding kiss. Our goal is to develop an automatic tool that lets users efficiently organize, search, and retrieve their treasured wedding memories. Furthermore, the event descriptions could benefit and complement current research in semantic video understanding. Technically, three kinds of event features, i.e., the speech/music discriminator, flashlight detector, and bride indicator, are exploited to build statistical models for each wedding event. Events are then recognized by a hidden Markov model, which takes into account both the fitness of observed features and the temporal rationality of event ordering to improve the segmentation accuracy. We conducted experiments on a rich set of wedding videos, and the results demonstrate the effectiveness of our approach.

NTU TRECVID-2007 fast rushes summarization system

Pan, Chuang, Hsu. 2007

Rushes are the raw materials used to produce a video. They often contain redundant and repetitive content. Rushes summarization aims to provide a quick overview of a rushes video. As part of TRECVID 2007, NIST initiated a rushes summarization task. This paper reports on the design of the NTU rushes summarization system for this task. Our system consists of three components: shot segmentation, redundant shot detection, and summary creation. To tackle the bulky rushes, we focus on efficient but effective feature representations (local color histograms and compressed-domain motion vectors) and summarization methods. In addition, we propose a novel approach to detecting clapper shots, which are not only relevant to concise summaries but also essential for indexing the numerous camera takes in the rushes. While practically efficient, requiring only 40% of the video time for computation, the proposed system achieved satisfactory results in the TRECVID 2007 rushes summarization task.
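
As an illustration of the shot segmentation component, the sketch below marks a shot boundary wherever the local color histograms of two consecutive frames differ strongly. It is a simplified stand-in rather than the NTU system: the frame representation (rows of RGB tuples), the 2x2 grid, the 4 bins per channel, the 0.5 threshold, and all function names are assumptions made for this example. It also assumes all frames share the same size and are at least as large as the grid.

```python
from collections import Counter


def local_histograms(frame, grid=2, bins=4):
    """Build one quantized color histogram per grid block.

    frame: list of rows, each row a list of (r, g, b) tuples in 0..255.
    """
    height, width = len(frame), len(frame[0])
    hists = [Counter() for _ in range(grid * grid)]
    for y, row in enumerate(frame):
        for x, (r, g, b) in enumerate(row):
            # Index of the grid block this pixel falls into.
            block = (y * grid // height) * grid + (x * grid // width)
            # Quantize each channel into `bins` levels.
            key = (r * bins // 256, g * bins // 256, b * bins // 256)
            hists[block][key] += 1
    return hists


def frame_distance(hists_a, hists_b):
    """Mean normalized L1 distance between per-block histograms, in [0, 1]."""
    total = 0.0
    for ha, hb in zip(hists_a, hists_b):
        pixels = sum(ha.values())
        diff = sum(abs(ha[k] - hb[k]) for k in set(ha) | set(hb))
        total += diff / (2 * pixels)
    return total / len(hists_a)


def shot_boundaries(frames, threshold=0.5):
    """Return indices i where frame i starts a new shot."""
    hists = [local_histograms(f) for f in frames]
    return [
        i
        for i in range(1, len(hists))
        if frame_distance(hists[i - 1], hists[i]) > threshold
    ]
```

Computing histograms per block rather than per frame keeps some spatial layout, so two frames with the same overall colors in different positions still register as different. A fixed threshold is the simplest possible decision rule and would likely need tuning on real footage.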

An investigation on linguistic features for Mandarin prosody generation

Hung, Yeh, Liao, et al. 2014

Punctuation-generation-inspired linguistic features for Mandarin prosody generation

Chiang, Hung, Yeh, et al. 2019

J AUDIO SPEECH MUSIC PROC.

This paper proposes two novel linguistic features extracted from text input for prosody generation in a Mandarin text-to-speech system. The first feature is the punctuation confidence (PC), which measures the likelihood that a major punctuation mark (MPM) can be inserted at a word boundary. The second feature is the quotation confidence (QC), which measures the likelihood that a word string is quoted as a meaningful or emphasized unit. The proposed PC and QC features are influenced by the properties of automatic Chinese punctuation generation and the linguistic characteristics of the Chinese punctuation system. Because MPMs are highly correlated with prosodic-acoustic features and quoted word strings serve crucial roles in human language understanding, the two features could potentially provide useful information for prosody generation. This idea was realized by employing conditional random-field-based models for predicting MPMs, quoted word string locations, and their associated confidences (that is, PC and QC) for each word boundary. The predicted punctuations and their confidences were then combined with traditional linguistic features to predict prosodic-acoustic features for performing speech synthesis using multilayer perceptrons. Both objective and subjective tests demonstrated that the prosody generated with the proposed linguistic features was superior to that generated without the proposed features. Therefore, the proposed PC and QC are identified as promising features for Mandarin prosody generation.
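
One common way to obtain a per-boundary confidence from a linear-chain model such as a conditional random field is the posterior marginal of the label, computed with the forward-backward algorithm. The sketch below shows that computation for two labels (0 = no MPM, 1 = MPM). The abstract does not say whether PC is defined exactly this way, so this is an assumed reading; the function names and the log-potential inputs are hypothetical, and no trained model is involved.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def boundary_confidences(unary, pairwise):
    """Posterior probability of label 1 at each word boundary.

    unary[i][y]:    log-potential of label y at boundary i.
    pairwise[a][b]: log-potential of label a followed by label b.
    """
    n = len(unary)
    labels = range(len(pairwise))

    # Forward pass: log-score of all label prefixes ending in each label.
    fwd = [list(unary[0])]
    for i in range(1, n):
        fwd.append([
            unary[i][b]
            + logsumexp([fwd[i - 1][a] + pairwise[a][b] for a in labels])
            for b in labels
        ])

    # Backward pass: log-score of all label suffixes following each label.
    bwd = [[0.0 for _ in labels] for _ in range(n)]
    for i in range(n - 2, -1, -1):
        bwd[i] = [
            logsumexp([
                pairwise[a][b] + unary[i + 1][b] + bwd[i + 1][b]
                for b in labels
            ])
            for a in labels
        ]

    log_z = logsumexp(fwd[-1])  # log of the partition function
    return [math.exp(fwd[i][1] + bwd[i][1] - log_z) for i in range(n)]
```

Unlike the single best label sequence, the marginal gives a graded value between 0 and 1 for every boundary, which is the kind of soft feature that can be passed on to a downstream prosody predictor.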

Personalized Taiwanese Speech Synthesis using Cascaded ASR and TTS Framework

Liao, Hsu, Pan, et al. 2022
