“…The principle is then to present two stimuli to participants, one animated using reference data (e.g., motion capture) and the other either a modified version or a newly generated animation, and to ask the participants about the similarity between the two versions. In some user studies [JADJ22, HTMS04, Dur21, MRW*23], the participants were asked to rate the level of similarity between the stimuli using a Likert scale such as “Choose an option: exactly same, very similar, moderately similar, similar, slightly similar, not similar” [Dur21]. In other user studies [HOT98, JHO10, WB04, RSM*23, HRVDP04, TLKS08], the participants were asked to respond to one or more binary questions, such as whether a “pair of two motions were the same or different” [HOT98], “in which clip was the motion of better quality?” [JHO10], “whether the first or second motion of a pair was more natural” [WB04], “which of two presentations of the arm contained a change” [HRVDP04], “judge whether two given postures appear similar or not” [TLKS08], and “which animation was the hybrid?” [RSM*23].…”