Humans move their upper limbs for communicative purposes while speaking: they gesture. Such movements interact with speech on multiple levels. In connection with what is said, gestures take meaningful shape through varying means of representation. Yet gestures also have non-representational aspects; they pulse quasi-rhythmically with the prosodic structure of speech. In explaining how modern human gesturing practices emerge in phylogeny or ontogeny, it is undisputed that gestures proliferated because they provide a particularly effective means of referring to absent or distal states of affairs. It is suggested that displaced or deictic reference is gesture's most basic proper function. The upshot is that the non-representational pulsing quality of gesture is completely ignored, both as something that requires an explanation and as something that can elucidate how gesture practices emerged from more basic beginnings shared with other animals. However, recent research provides evidence for a direct biomechanical interaction between pulsing manual movements and respiratory-vocal activity. We argue that this physical link is enacted during infant vocal-motor babbling, well before infants learn to represent manually. Further, we argue that gesture-vocal biomechanics relates directly to the cross-species phenomenon of locomotor-respiratory(-vocal) coupling. Given that gesture-speech biomechanics has its roots in locomotor-respiratory coupling, it can be related to bipedalism and respiratory complexification, i.e., an adaptation for the faculty of speech. We conclude that the physical origins of vocal-entangled gesture run much deeper, and unfolded more gradually, than currently assumed. The entanglement of sound and movement arose out of natural physical coalitions between vocal, respiratory, and limb systems that are forced to interact. We thus invert the current argumentation about how gesture and vocalization must have evolved, and rethink what is foundational to human gesture.
This perspective underlines that a more comprehensive investigation of the physical basis of bodily communication can yield new sources of semiotic significance in human and non-human animals.