Speech perception is characterized by categorical perception of phonemes. English listeners perceive certain speech sounds that vary along the continuous acoustic dimension known as voice onset time (VOT) as either the voiced phoneme /b/ or the unvoiced phoneme /p/. A third, prevoiced VOT category, /pʰ/, is used in Thai and is indistinct from the /b/ category in English. Some listeners can learn to perceive speech sounds belonging to this third VOT category with a small amount of training. The cognitive mechanisms underlying the variation in individuals' ability to perceive the prevoiced /pʰ/ phoneme are not well understood. The current experiment investigated the role of attention and working memory in facilitating listeners' ability to learn to perceive the prevoiced /pʰ/ phoneme. No consistent relationship was found between attentional and working-memory measures and prevoiced perceptual learning. Musical ability was a good predictor of performance on a prevoiced phoneme identification task.