“…Diarization is the task of automatically identifying sections of spoken audio and correctly labeling them with their characteristics, for example, speech, non-speech, male-speech, female-speech, music, noise. Although speaker identification played a role in early segmentation approaches, e.g., [300], determination of the identity of the speaker, called speaker identification, or confirmation of a presumed speaker identity, called speaker verification, does not fall into the scope of the diarization task.…”