“…For constructing the pre-extracted dictionary P(f | q, c, s), the IEEE DCASE Event Detection training dataset was used [7, 1]. The dataset contains isolated sounds recorded in an office environment at Queen Mary University of London, and covers 16 event classes (s ∈ {1, ..., 16}): alert, clearing throat, cough, door slam, drawer, keyboard click, keys, door knock, laughter, mouse click, page turn, pen drop, phone, printer, speech, and switch.…”
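
The 16 classes listed in the excerpt can be written down as a lookup table for the class index s. This is only an illustrative sketch: the names `EVENT_CLASSES`, `INDEX_TO_CLASS`, and `CLASS_TO_INDEX` are hypothetical, and assigning indices in the order the classes are listed is an assumption, since the excerpt does not state which index corresponds to which class.

```python
# Event classes in the order the excerpt lists them.
# ASSUMPTION: the excerpt does not say that s follows this order.
EVENT_CLASSES = [
    "alert", "clearing throat", "cough", "door slam", "drawer",
    "keyboard click", "keys", "door knock", "laughter", "mouse click",
    "page turn", "pen drop", "phone", "printer", "speech", "switch",
]

# 1-based index s -> class label, matching s in {1, ..., 16}.
INDEX_TO_CLASS = {s: name for s, name in enumerate(EVENT_CLASSES, start=1)}

# Reverse lookup: class label -> index s.
CLASS_TO_INDEX = {name: s for s, name in INDEX_TO_CLASS.items()}
```

With this table, `INDEX_TO_CLASS[s]` gives the label for a class index and `CLASS_TO_INDEX[label]` gives the index for a label, under the ordering assumption above.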