SUMMARY

Parallels have been reported between broad organization in the auditory system and optimized artificial neural networks [1–3]. It remains to be seen whether such promising analogies between the auditory system and deep learning models endure at other levels of description. Here, we examined whether artificial neural networks [4,5] could offer a mechanistic account of human behavior in an auditory task. The chosen task promoted the use of binaural cues (across the ears) to help detect a signal in noise [6,7]. In the optimal network, we observed the emergence of specialized computations with prominent similarities to in vivo animal data [8]. Artificial neurons developed a sensitivity to temporal delays that increased hierarchically, and their delay preferences were widely distributed (extending to delays beyond the range permitted by head width). The ensuing dynamics were consistent with a binaural cross-correlation mechanism [9]. Given that the neural mechanisms of binaural detection in humans are contested [9–13], these findings help to resolve this debate. Moreover, this is a first demonstration that deep learning can infer tangible mechanisms underlying auditory perception.
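As an aside, the binaural cross-correlation mechanism referred to above can be illustrated with a minimal sketch: a delayed copy of a signal reaching one ear can be localized by finding the interaural lag that maximizes the correlation between the two ear signals. The signal names, sample rate, and the specific delay below are illustrative assumptions, not details from this work.

```python
import numpy as np

# Illustrative sketch of cross-correlation-based interaural time
# difference (ITD) estimation. All parameters here are assumptions.
fs = 44100                    # sample rate in Hz (assumed)
delay_samples = 22            # ~0.5 ms ITD at 44.1 kHz (assumed)

rng = np.random.default_rng(0)
sig = rng.standard_normal(2048)

left = sig
right = np.roll(sig, delay_samples)  # right ear receives a delayed copy

# Evaluate the (circular) cross-correlation over a range of candidate
# lags and pick the lag with the maximum correlation.
lags = np.arange(-64, 65)
corr = [np.dot(left, np.roll(right, -k)) for k in lags]
estimated_itd = lags[int(np.argmax(corr))]
print(estimated_itd)  # -> 22, recovering the imposed delay
```

For broadband noise, the correlation peaks sharply at the true lag, which is why cross-correlation is a classic model of binaural delay sensitivity.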