The aim of this article is to challenge gesture-first hypotheses about the evolution of language by identifying constraints on the emergence of symbol use. Current debates focus on a range of preconditions for the emergence of language, including co-operation and related mentalising capacities, imitation and tool use, episodic memory, and vocal physiology, but pay little specific attention to the ability to learn and understand symbols. It is argued here that such a focus raises new questions about the plausibility of gesture-first hypotheses, and so about the evolution of language in general. After a brief review of the methodology used in the article, it is argued that existing uses of gesture in hominid communities may have prohibited the emergence of symbol use, rather than 'bootstrapped' symbolic capacities as is usually assumed, and that the vocal channel offers other advantages in both learning and using language. If so, the vocal channel offers a more promising platform for the evolution of language than is often assumed.

Many thanks to Kim Sterelny and the philosophy of biology group at ANU for their help throughout this project, to numerous audiences at EvoLang, the AAP and elsewhere for their valuable feedback, to Matteo Colombo and anonymous referees for their comments, and to Sean Roberts for ongoing discussions.

1 This research makes it possible to argue for gesture-first hypotheses without relying on controversial mirror neurons (Cook et al., 2014; Heyes, 2010a, 2010b), and is perhaps best developed by Tomasello and Sterelny.