The automatic transcription of music recordings, with the objective of deriving a score-like representation from a given audio representation, is a fundamental and challenging task. In particular for polyphonic music recordings with overlapping sound sources, current transcription systems still have difficulty accurately extracting the parameters of individual notes specified by pitch, onset, and duration. In this article, we present a music transcription system that is carefully designed to cope with various facets of music. One main idea of our approach is to consistently employ a mid-level representation that is based on a musically meaningful pitch scale. To achieve the necessary spectral and temporal resolution, we use a multi-resolution Fourier transform enhanced by an instantaneous frequency estimation. Subsequently, having extracted pitch and note onset information from this representation, we employ Hidden Markov Models (HMMs) for determining the note events in a context-sensitive fashion. As another contribution, we evaluate our transcription system on an extensive dataset containing audio recordings of various genres. Here, in contrast to many previous approaches, we do not rely only on synthetic audio material, but evaluate our system on real audio recordings, using MIDI-audio synchronization techniques to automatically generate reference annotations.

PACS no. 43.75.Xz, 43.75.zz

Grosche et al.: Automatic transcription of music. ACTA ACUSTICA UNITED WITH ACUSTICA, Vol. 98 (2012)
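To make the idea of a mid-level representation on a musically meaningful pitch scale more concrete, the following is a minimal sketch, not the system described in this article. It assumes the equal-tempered MIDI pitch scale with A4 = 440 Hz and the piano range 21 to 108, and it simply pools the magnitudes of already computed spectral coefficients into semitone bands. The function names `hz_to_midi` and `pool_to_pitch_scale` are hypothetical, and the multi-resolution Fourier transform and instantaneous frequency estimation mentioned above are not modeled here.

```python
import math


def hz_to_midi(freq_hz, a4_hz=440.0):
    """Map a frequency in Hz to a fractional MIDI pitch (assumes A4 = MIDI 69)."""
    return 69.0 + 12.0 * math.log2(freq_hz / a4_hz)


def pool_to_pitch_scale(freqs_hz, magnitudes, low=21, high=108):
    """Sum spectral magnitudes into semitone bands centred on MIDI pitches.

    freqs_hz and magnitudes are parallel sequences, e.g. the bin centre
    frequencies and magnitudes of one spectral frame. The range low..high
    is an assumption (piano range), not taken from the article.
    """
    bands = {pitch: 0.0 for pitch in range(low, high + 1)}
    for freq, mag in zip(freqs_hz, magnitudes):
        if freq <= 0:
            continue  # log2 is undefined for the DC bin
        pitch = int(round(hz_to_midi(freq)))
        if low <= pitch <= high:
            bands[pitch] += mag
    return bands


# Usage: two components near A4 fall into the same band, one near C4 into another.
frame = pool_to_pitch_scale([440.0, 445.0, 261.63], [1.0, 0.5, 2.0])
```

Pooling by the nearest semitone is the simplest possible choice; a refined frequency estimate per coefficient would assign energy to pitch bands more accurately than the bin centre frequency used in this sketch.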