We propose an end-to-end trainable approach to single-channel speech separation with an unknown number of speakers. Our approach extends the MulCat source-separation backbone with additional output heads: a count head that infers the number of speakers, and decoder heads that reconstruct the original signals. Beyond the model, we also propose a metric for evaluating source separation with a variable number of speakers. Specifically, we clarify how to assess separation quality when the ground truth contains more or fewer speakers than the model predicts. We evaluate our approach on the WSJ0-mix datasets, with mixtures of up to five speakers, and demonstrate that it outperforms the state of the art in counting the number of speakers while remaining competitive in the quality of the reconstructed signals.
Stream processors are widely used in multimedia processing because of the high performance they gain from parallelism. To achieve higher parallelism, a stream processor employs a wide very long instruction word (VLIW) structure, in which multiple parallelizable instructions are packed into a single VLIW. Because the VLIW width is fixed, a large number of empty operations (NOPs) must be inserted, which causes a serious code-size expansion problem. To address this issue, horizontal and vertical code-compression methods are applied to the stream processor's VLIW. First, the VLIW is divided into several subfields according to the logical characteristics of the VLIW instruction; a horizontal compression scheme based on Huffman coding is then applied to each subfield, achieving approximately 78% code-size reduction on average. However, the long time required to decode a compressed VLIW before instruction execution may incur a system performance penalty. To reduce this decompression overhead, a vertical compression scheme is proposed, which reduces code size by nearly 70% by deleting the NOPs of the VLIW in the vertical direction. Furthermore, a vertically compressed VLIW can be executed directly, without any decompression step, by using a banked instruction memory. In summary, vertical compression significantly reduces stream-processor VLIW code size without any negative impact on performance.
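To make the vertical-compression idea concrete, the following is a minimal illustrative sketch (not the paper's actual encoding): each fixed-width VLIW word is modeled as a tuple of slot operations with NOP slots marked as `None`; compression stores a per-slot presence bitmask plus only the occupied slots, so the original word can be reassembled (or, in hardware, dispatched bank-by-bank) without a separate decompression pass. The function names, the `width` parameter, and the word representation are all assumptions made for illustration.

```python
# Illustrative sketch of vertical VLIW compression: delete NOP slots and
# keep a bitmask recording which slots of each word were occupied.
# (Hypothetical representation; not the paper's hardware encoding.)

NOP = None  # an empty slot in a VLIW word

def vertical_compress(vliw_words):
    """Compress fixed-width VLIW words into (mask, ops) pairs without NOPs."""
    compressed = []
    for word in vliw_words:
        mask = 0
        ops = []
        for i, op in enumerate(word):
            if op is not NOP:
                mask |= 1 << i   # mark slot i as occupied
                ops.append(op)   # store only real operations
        compressed.append((mask, ops))
    return compressed

def vertical_decompress(compressed, width):
    """Rebuild the original fixed-width words from (mask, ops) pairs."""
    words = []
    for mask, ops in compressed:
        it = iter(ops)
        words.append(tuple(next(it) if (mask >> i) & 1 else NOP
                           for i in range(width)))
    return words

# Example: a 4-slot VLIW program where most slots are empty.
program = [
    ("add", NOP, NOP, "ld"),
    (NOP, "mul", NOP, NOP),
]
packed = vertical_compress(program)
assert vertical_decompress(packed, 4) == program  # lossless round trip
```

In this toy program, 8 slots shrink to 3 stored operations plus two small masks, mirroring how deleting NOPs in the vertical direction yields large code-size savings; the mask plays the role that banked instruction memory plays in hardware, letting each surviving operation reach its slot without a decompression stage.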