Interpersonal musical entrainment—temporal synchronization and coordination between individuals in musical contexts—is a ubiquitous phenomenon related to music’s social functions of promoting group bonding and cohesion. Mechanisms other than sensorimotor synchronization are rarely discussed, while little is known about cultural variability or about how and why entrainment has social effects. In order to close these gaps, we propose a new model that distinguishes between different components of interpersonal entrainment: sensorimotor synchronization—a largely automatic process manifested especially with rhythms based on periodicities in the 100–2000 ms timescale—and coordination, extending over longer timescales and more accessible to conscious control. We review the state of the art in measuring these processes, mostly from the perspective of action production, and in so doing present the first cross-cultural comparisons between interpersonal entrainment in natural musical performances, with an exploratory analysis that identifies factors that may influence interpersonal synchronization in music. Building on this analysis we advance hypotheses regarding the relationship of these features to neurophysiological, social, and cultural processes. We propose a model encompassing both synchronization and coordination processes and the relationship between them, the role of culturally shared knowledge, and of connections between entrainment and social processes.
The measurement and tracking of body movement within musical performances can provide valuable sources of data for studying interpersonal interaction and coordination between musicians. The continued development of tools to extract such data from video recordings will offer new opportunities to research musical movement across a diverse range of settings, including field research and other ecological contexts in which the implementation of complex motion capture (MoCap) systems is not feasible or affordable. Such work might also make use of the multitude of video recordings of musical performances that are already available to researchers. This study made use of such existing data, specifically three video datasets of ensemble performances that differ in genre, setting, and instrumentation (a pop piano duo, three jazz duos, and a string quartet). Three different computer vision techniques (frame differencing, optical flow, and kernelized correlation filters, or KCF) were applied to these video datasets with the aim of quantifying and tracking movements of the individual performers. All three computer vision techniques exhibited high correlations with MoCap data collected from the same musical performances, with median correlation (Pearson's r) values of 0.75 to 0.94. The techniques that track movement in two dimensions (optical flow and KCF) provided more accurate measures of movement than a technique that provides a single estimate of overall movement change by frame for each performer (frame differencing). Measurements of performers' movements were also more accurate when the computer vision techniques were applied to more narrowly defined regions of interest (head) than when the same techniques were applied to larger regions (entire upper body, above the chest, or waist).
Some differences in movement tracking accuracy emerged between the three video datasets, which may have been due to instrument-specific motions that resulted in occlusions of the body part of interest (e.g., a violinist's right hand occluding the head while tracking head movement). These results indicate that computer vision techniques can be effective in quantifying body movement from videos of musical performances, while also highlighting constraints that must be dealt with when applying such techniques in ensemble coordination research.
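To make the simplest of the three techniques concrete, the following is a minimal sketch of frame differencing on grayscale frames, written with NumPy. It is an illustration under stated assumptions, not the implementation used in the study: the function name `frame_difference_motion`, the `(top, bottom, left, right)` region-of-interest format, and the `threshold` parameter are all hypothetical.

```python
import numpy as np

def frame_difference_motion(frames, roi=None, threshold=0.0):
    """Return one overall movement estimate per pair of consecutive frames.

    frames: array of shape (n_frames, height, width) with grayscale intensities.
    roi: optional (top, bottom, left, right) pixel bounds (hypothetical format)
         restricting the estimate to one region, e.g. a performer's head.
    threshold: absolute intensity changes at or below this value are ignored
         (an assumed, simple way to suppress sensor noise).
    """
    frames = np.asarray(frames, dtype=float)
    if roi is not None:
        top, bottom, left, right = roi
        frames = frames[:, top:bottom, left:right]
    # Absolute pixel-wise change between each frame and the next one.
    diffs = np.abs(np.diff(frames, axis=0))
    diffs[diffs <= threshold] = 0.0
    # Sum over the pixels of each difference image: one value per frame pair.
    return diffs.sum(axis=(1, 2))

# Synthetic example: one pixel lights up between frames 0 and 1, then stays.
frames = np.zeros((3, 4, 4))
frames[1:, 1, 1] = 1.0
motion = frame_difference_motion(frames)
print(motion)
```

Because this yields a single number per frame pair, it carries no direction information, which is consistent with the abstract's observation that the two-dimensional trackers were more accurate. A series produced this way could be compared with a reference signal of equal length using `np.corrcoef`.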
Full-body human movement is characterized by fine-grained expressive qualities that humans are easily capable of exhibiting and recognizing in others' movement. In sports (e.g., martial arts) and performing arts (e.g., dance), the same sequence of movements can be performed in a wide range of ways characterized by different qualities, often in terms of subtle (spatial and temporal) perturbations of the movement. Even a non-expert observer can distinguish between a top-level and an average performance by a dancer or martial artist. The difference is not in the performed movements, which are the same in both cases, but in the "quality" of their performance. In this article, we present a computational framework aimed at an automated approximate measure of movement quality in full-body physical activities. Starting from motion capture data, the framework computes low-level (e.g., a limb velocity) and high-level (e.g., synchronization between different limbs) movement features. Then, this vector of features is integrated to compute a value aimed at providing a quantitative assessment of movement quality approximating the evaluation that an external expert observer would give of the same sequence of movements. Next, a system representing a concrete implementation of the framework is proposed. Karate is adopted as a testbed. We selected two different katas (i.e., detailed choreographies of movements in karate) characterized by different overall attitudes and expressions (aggressiveness, meditation), and we asked seven athletes of various ages and levels of experience to perform them. Motion capture data were collected from the performances and were analyzed with the system. The results of the automated analysis were compared with the scores given by 14 karate experts who rated the same performances.
Results show that the movement-quality scores computed by the system and the ratings given by the human observers are highly correlated (Pearson's correlations r = 0.84, p = 0.001 and r = 0.75, p = 0.005).
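The two kinds of features named above, a low-level limb velocity and a high-level synchronization between limbs, can be illustrated with a short sketch. The abstract does not say how either feature is computed, so everything here is an assumption made for illustration: speed is taken as a finite difference of marker positions, and synchronization is approximated by the Pearson correlation between two limbs' speed profiles. The names `limb_speed` and `limb_synchronization` are hypothetical.

```python
import numpy as np

def limb_speed(positions, fps):
    """Low-level feature: speed of one marker over time.

    positions: array of shape (n_frames, n_dims) with marker coordinates.
    fps: capture rate in frames per second.
    Returns an array of n_frames - 1 speeds (position units per second).
    """
    positions = np.asarray(positions, dtype=float)
    velocity = np.diff(positions, axis=0) * fps  # finite-difference velocity
    return np.linalg.norm(velocity, axis=1)

def limb_synchronization(speed_a, speed_b):
    """High-level feature (assumed definition): Pearson correlation between
    the speed profiles of two limbs. Values near 1 mean the limbs speed up
    and slow down together."""
    return float(np.corrcoef(speed_a, speed_b)[0, 1])

# Synthetic one-dimensional trajectories sampled at 1 frame per second.
left_hand = np.array([[0.0], [1.0], [4.0], [9.0]])    # accelerating
right_hand = 2.0 * left_hand                          # same timing, larger range
left_foot = np.array([[0.0], [5.0], [8.0], [9.0]])    # decelerating

print(limb_synchronization(limb_speed(left_hand, 1), limb_speed(right_hand, 1)))
print(limb_synchronization(limb_speed(left_hand, 1), limb_speed(left_foot, 1)))
```

In a full pipeline, several such features would be collected into one vector per performance and combined into a single score, which is the step the abstract describes as integration.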
In this paper we present three studies focusing on the effect of different sound models in interactive sonification of bodily movement. We hypothesized that a sound model characterized by continuous smooth sounds would be associated with different movement characteristics than a model characterized by abrupt variation in amplitude, and that these associations could be reflected in spontaneous movement characteristics. Three consecutive studies were conducted to investigate the relationship between properties of bodily movement and sound: (1) a motion capture experiment involving interactive sonification of a group of children spontaneously moving in a room, (2) an experiment involving perceptual ratings of sonified movement data, and (3) an experiment involving matching between sonified movements and their visualizations in the form of abstract drawings. In (1) we used a system consisting of 17 IR cameras tracking passive reflective markers. The head positions in the horizontal plane of 3–4 children were simultaneously tracked and sonified, producing 3–4 sound sources spatially displayed through an 8-channel loudspeaker system. We analyzed children's spontaneous movement in terms of energy, smoothness, and directness indices. Despite large inter-participant variability and group-specific effects caused by interaction among children when engaging in the spontaneous movement task, we found a small but significant effect of sound model. Results from (2) indicate that different sound models can be rated differently on a set of motion-related perceptual scales (e.g., expressivity and fluidity). The results also imply that audio-only stimuli can evoke stronger perceived properties of movement (e.g., energetic, impulsive) than stimuli involving both audio and video representations. Findings in (3) suggest that sounds portraying bodily movement can be represented using abstract drawings in a meaningful way.
We argue that the results from these studies support the existence of a cross-modal mapping of body motion qualities from bodily movement to sounds. Sound can be translated and understood from bodily motion, conveyed through sound visualizations in the shape of drawings and translated back from sound visualizations to audio. The work underlines the potential of using interactive sonification to communicate high-level features of human movement data.
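The movement indices mentioned for study (1) are not defined in the abstract. As a hedged illustration only, the sketch below uses one common convention: directness as the ratio of straight-line displacement to travelled path length, and energy as the mean squared speed (kinetic energy per unit mass, up to a constant factor). Both definitions and the function names `directness_index` and `energy_index` are assumptions, not the authors' formulas.

```python
import numpy as np

def directness_index(positions):
    """Ratio of straight-line displacement to path length (assumed definition).

    positions: array of shape (n_frames, 2), e.g. head position in the
    horizontal plane. Returns a value in [0, 1]; 1 means a perfectly
    straight trajectory. Returns NaN if there is no movement at all.
    """
    positions = np.asarray(positions, dtype=float)
    steps = np.linalg.norm(np.diff(positions, axis=0), axis=1)
    path_length = steps.sum()
    if path_length == 0:
        return float("nan")
    displacement = np.linalg.norm(positions[-1] - positions[0])
    return float(displacement / path_length)

def energy_index(positions, fps):
    """Mean squared speed over the trajectory (assumed definition)."""
    positions = np.asarray(positions, dtype=float)
    speed = np.linalg.norm(np.diff(positions, axis=0), axis=1) * fps
    return float(np.mean(speed ** 2))

# Synthetic trajectories: a straight walk and an L-shaped detour.
straight = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
detour = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])

print(directness_index(straight), directness_index(detour))
print(energy_index(detour, fps=1))
```

A smoothness index would need a further assumption (for example, one based on jerk or curvature), so it is left out of this sketch.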