2022
DOI: 10.1007/s11042-022-11994-1

Speech and music separation approaches - a survey

Cited by 5 publications (5 citation statements)
References 77 publications
“…The possible speed and correct computing of big data, careful control of data propagation, and proper monitoring of information diffusion in various large complex networks for modern text, visible image, and acoustic data types naturally increase the time complexity and memory usage for big text data [78], big image data [6], big acoustic data [69], and the possible combination of exclusive video and audio for multimedia applications in uncertain and high-risk environments [14], different topology streaming [15], and various channel utilization [16] for automatic decision-making. This potential problem routinely requires modern definitions and economic modeling for the subsequent definition of big image data generation in a reliable form suitable for various online interactions of intelligent multimedia applications, from camera imaging to knowledge extraction of sequential images [17,5].…”
Section: Big Data Oceans (mentioning)
confidence: 99%
“…Big image data streams have become ubiquitous because a considerable number of online multimedia applications naturally generate massive amounts of various types of data at an incredible velocity in 2D and 3D forms. Multimedia applications combine different data types in text, speech, sound, music, image, and video formats [5]. This work has to be directly managed by new devices in big data streams because of the built-in dynamic characteristics of different types of data, with an incredible speed of presented mining tools, applied technologies, designed methods, heterogeneous hardware, and hybrid techniques from starting data construction to ending useful information production for reasonable speed of knowledge extraction in decision-making at various data stream network levels [6,7].…”
Section: Introduction (mentioning)
confidence: 99%
“…The recent separation techniques, however, fall well short of the capabilities of human hearing. It is challenging to resolve the existing SVS because of the instruments utilized and the spectral overlap between the speech and background music [11, 18, 19, 20, 21]. In daily life, human listeners generally have the remarkable ability to distinguish sound streams from a mixture of sounds, but this continues to be a difficult task for machines, particularly in the monaural case because it lacks the spatial cues that can be learned when two or more microphones are used.…”
Section: Introduction (mentioning)
confidence: 99%
“…Separation of speech, music and environmental sounds is an important task for many speech applications and automatic machine hearing, such as the automatic speech recognition and music applications in edge devices [1]. Its quality has been significantly improved with the introduction of deep learning.…”
Section: Introduction (mentioning)
confidence: 99%