Speech Recognizing Comparisons Between Web Speech API and FPT.AI API

Chung, Tran Duc; Nguyen, Duc Long; Ha, Hong Son; Hassan, Mohd Fadzil

doi:10.1007/978-981-16-2406-3_64

Lecture Notes in Electrical Engineering

2021


Abstract: People now use speech recognition services for many everyday purposes, such as learning foreign languages and communicating, so they need to decide which service to use. A speech recognition service with high accuracy and a short processing time makes work more efficient, because it reduces both the time spent rechecking output and the delay between recognition tasks. For Vietnamese speech recognition, Web Speech API and FPT.AI API are popular. Web Speech API supports multiple languages, while FP…


Cited by 1 publication

References 24 publications


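Generating Video Subtitles Based on Speech Recognition: Experiments on Several Programs at VTV (original title: Tạo Phụ Đề Video Dựa Trên Kỹ Thuật Nhận Dạng Giọng Nói: Thử Nghiệm Cho Một Số Chương Trình Tại VTV)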

Nguyễn; Võ; Trần

2022

JTE


This paper presents the results of testing a Speech-To-Text (STT) speech recognition tool on VOD (Video On Demand) content on the VTVgo system of Vietnam Television (Đài THVN). To evaluate the accuracy of the STT tool, we use the word error rate (WER), a metric for measuring the performance of automatic speech recognition and machine translation systems. The experiments covered 10 different television program genres with 1065 hours of video. The lowest WER, 2.8% to 4.3%, was obtained for several news and current-affairs programs and weather forecasts, where most speakers and presenters (MCs) speak with a standard accent in the studio, the speech comes from a single speaker, and there is little interference from outside noise. In addition, to illustrate the video subtitle application, we ran a trial on the VTVgo system, integrating an optional subtitle display tool into the VTVgo app. The test platforms were Android SmartTV and SmartPhone, to demonstrate the applicability of video subtitles on an OTT (Over The Top) digital content distribution platform.
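
The abstract above names WER as its accuracy metric but does not show how it is computed. As a rough illustration only, the sketch below uses the standard definition, WER = (substitutions + deletions + insertions) / number of reference words, obtained from a word-level edit distance. The function name `word_error_rate` and the plain whitespace tokenization are assumptions made for this example; the abstract does not state how the authors tokenized or normalized their transcripts.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are separated by whitespace and compared exactly
    (no case folding or punctuation stripping).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)
```

For instance, `word_error_rate("a b c d", "a x c")` counts one substitution and one deletion against four reference words, giving 0.5. Note that WER can exceed 1.0 when the hypothesis contains many inserted words.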

