“…Index Terms: multi-talker speech separation, permutation invariant training, latency-controlled BLSTM, speaker tracing

I. INTRODUCTION

Much progress has been made in monaural multi-talker speech separation [1], [2], [3], [4], [5], [6], [7], [8], [9], known as the cocktail party problem [10], which is important to many practical applications, such as human-machine interaction and automatic meeting transcription. With the development of deep learning [11], many approaches have been proposed, such as deep clustering [3], [4], the deep attractor network [5], the time-domain audio separation network [6], [9], and permutation invariant training (PIT) [7], [8].…”