Background: Crowdsourcing has become a valuable method for collecting medical research data. This approach, recruiting through open calls on the Web, is particularly useful for assembling large normative datasets. However, it is not known how natural language datasets collected over the Web differ from those collected under controlled laboratory conditions.

Objective: To compare the natural language responses obtained from a crowdsourced sample of participants with responses collected in a conventional laboratory setting from participants recruited according to specific age and gender criteria.

Methods: We collected natural language descriptions of 200 half-minute movie clips from Amazon Mechanical Turk workers (crowdsourced) and 60 participants recruited from the community (lab-sourced). Crowdsourced participants responded to as many clips as they wanted and typed their responses, whereas lab-sourced participants gave spoken responses to 40 clips, and their responses were transcribed. The content of the responses was evaluated using a take-one-out procedure, which compared each response to other responses to the same clip and to responses to other clips, using the average number of shared words.

Results: In contrast to the 13 months of recruiting required to collect normative data from 60 lab-sourced participants (with specific demographic characteristics), only 34 days were needed to collect normative data from 99 crowdsourced participants (contributing a median of 22 responses). The majority of crowdsourced workers were female, and their median age was 35 years, lower than the lab-sourced median of 62 years but similar to the median age of the US population. The responses contributed by crowdsourced participants were longer on average (33 words vs 28 words, P<.001), and they used a less varied vocabulary. However, there was strong similarity between the two datasets in the words used to describe a particular clip, as shown by a cross-dataset count of shared words (P<.001). Within both datasets, responses contained substantial relevant content, with more words in common with responses to the same clip than with responses to other clips (P<.001). There was evidence that responses from female and older crowdsourced participants had more shared words (P=.004 and P=.01, respectively), whereas younger participants had higher numbers of shared words in the lab-sourced population (P=.01).

Conclusions: Crowdsourcing is an effective approach for quickly and economically collecting a large, reliable dataset of normative natural language responses.
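The take-one-out comparison described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' actual pipeline: the function names and toy data are invented, and a real implementation would also handle tokenization details such as punctuation and word frequency.

```python
# Hedged sketch of a take-one-out shared-word comparison: each response is
# scored against the remaining responses to the same clip, and against
# responses to other clips. All names and data are illustrative.

def words(text):
    """Lowercase a response and split it into a set of unique words."""
    return set(text.lower().split())

def mean_shared(response, others):
    """Average count of words shared with each response in `others`."""
    return sum(len(words(response) & words(o)) for o in others) / len(others)

def take_one_out(responses_by_clip):
    """For each response, compare same-clip vs other-clip shared-word scores."""
    results = []
    for clip, responses in responses_by_clip.items():
        other = [r for c, rs in responses_by_clip.items() if c != clip for r in rs]
        for i, resp in enumerate(responses):
            same = mean_shared(resp, responses[:i] + responses[i + 1:])
            results.append((clip, same, mean_shared(resp, other)))
    return results

# Toy dataset: responses cluster around their own clip
data = {
    "clip_a": ["a dog runs in a park", "a dog chases a ball in the park"],
    "clip_b": ["two people cook in a kitchen", "a couple prepares food in a kitchen"],
}
for clip, same, other in take_one_out(data):
    print(clip, same, other)
```

In this toy example, every response shares more words on average with same-clip responses than with other-clip responses, which is the pattern the Results report for both datasets.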
Information acquisition, the gathering and interpretation of sensory information, is a basic function of mobile organisms. We describe a new method for measuring this ability in humans, using free-recall responses to sensory stimuli, which are scored objectively using a “wisdom of crowds” approach. As an example, we demonstrate this metric using perception of video stimuli. Immediately after viewing a 30 s video clip, subjects responded to a prompt to give a short description of the clip in natural language. These responses were scored automatically by comparison to a dataset of responses to the same clip by normally-sighted viewers (the crowd). In this case, the normative dataset consisted of responses to 200 clips by 60 subjects who were stratified by age (range 22 to 85 years) and viewed the clips in the lab, for 2,400 responses, and by 99 crowdsourced participants (age range 20 to 66 years) who viewed clips in their Web browser, for 4,000 responses. We compared different algorithms for computing these similarities and found that a simple count of the words in common had the best performance. It correctly matched 75% of the lab-sourced and 95% of crowdsourced responses to their corresponding clips. We validated the measure by showing that when the amount of information in the clip was degraded using defocus lenses, the shared word score decreased across the five predetermined visual-acuity levels, demonstrating a dose-response effect (N = 15). This approach, of scoring open-ended immediate free recall of the stimulus, is applicable not only to video, but also to other situations where a measure of the information that is successfully acquired is desirable. Information acquired will be affected by stimulus quality, sensory ability, and cognitive processes, so our metric can be used to assess each of these components when the others are controlled.
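The best-performing algorithm above, a simple count of words in common, can be sketched as a clip-matching procedure. The function names and the tiny normative set are hypothetical; the paper's pipeline is not public here, so this only illustrates the idea.

```python
# Illustrative sketch: match a free-recall response to the clip whose
# normative responses share the most words with it. Names are made up.

def tokenize(text):
    """Lowercase and split a response into a set of unique words."""
    return set(text.lower().split())

def shared_word_score(response, normative_responses):
    """Average number of words the response shares with each normative response."""
    words = tokenize(response)
    counts = [len(words & tokenize(r)) for r in normative_responses]
    return sum(counts) / len(counts)

def match_clip(response, norm_by_clip):
    """Assign the response to the clip with the highest shared-word score."""
    return max(norm_by_clip,
               key=lambda clip: shared_word_score(response, norm_by_clip[clip]))

# Toy normative dataset with two clips
norm = {
    "clip_a": ["a dog runs across a park", "a brown dog chases a ball in a park"],
    "clip_b": ["two people talk in a kitchen", "a man and woman cook in a kitchen"],
}
print(match_clip("a dog playing with a ball", norm))
```

A degraded stimulus would be expected to yield a response with fewer shared words, which is the dose-response effect the validation experiment measured.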
Gaze-contingent displays combine a display device with an eyetracking system to rapidly update an image on the basis of the measured eye position. All such systems have a delay, the system latency, between a change in gaze location and the related change in the display. The system latency is the result of the delays contributed by the eyetracker, the display computer, and the display, and it is affected by the properties of each component, which may include variability. We present a direct, simple, and low-cost method to measure the system latency. The technique uses a device to briefly blind the eyetracker system (e.g., for video-based eyetrackers, a device with infrared light-emitting diodes, LEDs), creating an eyetracker event that triggers a change to the display monitor. The time between these two events, as captured by a relatively low-cost consumer camera with high-speed video capability (1,000 Hz), is an accurate measurement of the system latency. With multiple measurements, the distribution of system latencies can be characterized. The same approach can be used to synchronize the eye position time series and a video recording of the visual stimuli that would be displayed in a particular gaze-contingent experiment. We present system latency assessments for several popular types of displays and discuss what values are acceptable for different applications, as well as how system latencies might be improved.
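Once the blinding event and the display change have been located in the high-speed video, converting frame indices to latencies is straightforward. The sketch below assumes those frames have already been identified (manually or by thresholding pixel values); the frame numbers and helper names are illustrative, not from the paper.

```python
# Sketch: compute system latency from frame indices in a 1,000 fps video,
# then characterize the distribution over repeated measurements.
# All frame numbers below are made-up examples.

def latency_ms(flash_frame, change_frame, fps=1000):
    """Latency between the blinding event and the display change, in ms."""
    return (change_frame - flash_frame) / fps * 1000.0

def summarize(latencies):
    """Characterize the latency distribution with min, median, and max."""
    s = sorted(latencies)
    return {"min": s[0], "median": s[len(s) // 2], "max": s[-1]}

# Example: three repeated measurements (flash frame, display-change frame)
measurements = [latency_ms(f, c) for f, c in [(120, 145), (300, 322), (510, 540)]]
print(summarize(measurements))  # {'min': 22.0, 'median': 25.0, 'max': 30.0}
```

At 1,000 fps each frame contributes 1 ms of quantization, so the camera frame rate bounds the resolution of a single measurement; repeated measurements characterize the variability the abstract mentions.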
Gaze-contingent display paradigms play an important role in vision research. The time delay due to data transmission from eye tracker to monitor may lead to a misalignment between the gaze direction and image manipulation during eye movements, and therefore compromise the contingency. We present a method to reduce this misalignment by using a compressed exponential function to model the trajectories of saccadic eye movements. Our algorithm was evaluated using experimental data from 1,212 saccades ranging from 3° to 30°, which were collected with an EyeLink 1000 and a Dual-Purkinje Image (DPI) eye tracker. The model fits eye displacement with a high agreement (R² > 0.96). When assuming a 10-millisecond time delay, prediction of 2D saccade trajectories using our model could reduce the misalignment by 30% to 60% with the EyeLink tracker and 20% to 40% with the DPI tracker for saccades larger than 8°. Because a certain number of samples are required for model fitting, the prediction did not offer improvement for most small saccades and the early stages of large saccades. Evaluation was also performed for a simulated 100-Hz gaze-contingent display using the prerecorded saccade data. With prediction, the percentage of misalignment larger than 2° dropped from 45% to 20% for EyeLink and 42% to 26% for DPI data. These results suggest that the saccade-prediction algorithm may help create more accurate gaze-contingent displays.
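One plausible form of a compressed exponential displacement model, with a look-ahead prediction to compensate for a fixed system delay, is sketched below. The exact parameterization (amplitude `A`, time constant `tau`, exponent `beta`) and the toy parameter values are assumptions for illustration, not the authors' published equation, and the fitting step (which the paper performs on partial saccade data) is omitted.

```python
import math

# Hypothetical compressed exponential model of saccadic eye displacement.
# Parameterization is assumed for illustration only.

def displacement(t, amplitude, tau, beta):
    """Eye displacement (deg) at time t (ms) under a compressed exponential."""
    return amplitude * (1.0 - math.exp(-((t / tau) ** beta)))

def predict_ahead(t_now, delay, amplitude, tau, beta):
    """Predict displacement one system delay into the future (e.g. 10 ms),
    so the display can be updated for where the eye will be, not where it was."""
    return displacement(t_now + delay, amplitude, tau, beta)

# Toy parameters for a ~10 deg saccade, sampled 30 ms after onset
A, TAU, BETA = 10.0, 20.0, 2.0
now = displacement(30.0, A, TAU, BETA)
future = predict_ahead(30.0, 10.0, A, TAU, BETA)
print(round(now, 2), round(future, 2))  # 8.95 9.82
```

The gap between `now` and `future` is the misalignment a non-predictive display would incur during the saccade; updating to the predicted position is what reduces it. As the abstract notes, fitting such a model requires enough samples, which is why prediction does not help for small saccades or early in large ones.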
The presence of information in a visual display does not guarantee its use by the visual system. Studies of inversion effects in both face recognition and biological-motion perception have shown that the same information may be used by observers when it is presented in an upright display but not used when the display is inverted. In our study, we tested the inversion effect in scrambled biological-motion displays to investigate mechanisms that validate information contained in the local motion of a point-light walker. Using novel biological-motion stimuli that contained no configural cues to the direction in which a walker was facing, we found that manipulating the relative vertical location of the walker's feet significantly affected observers' performance on a direction-discrimination task. Our data demonstrate that, by themselves, local cues can almost unambiguously indicate the facing direction of the agent in biological-motion stimuli. Additionally, we document a noteworthy interaction between local and global information and offer a new explanation for the effect of local inversion in biological-motion perception.
Directional information can be retrieved from a point-light walker (PLW) in two different ways: either from recovering the global shape of the articulated body or from signals in the local motion of individual dots. Here, we introduce a voluntary eye movement task to assess how the direction of a centrally presented, task-irrelevant PLW affects the onset latency and accuracy of saccades to peripheral targets. We then use this paradigm to design experiments to study which aspects of biological motion (the global form mediated by the motion of the walker, or the local movements of critical features) drive the observed attentional effects. Putting the two cues into conflict, we show that saccade latency and accuracy were affected by the local motion of the dots representing the walker's feet, but only if they retain their familiar, predictable location within the display.
People working together on a task must often represent the goals and salient items of their partner. The aim of the present study was to examine the influence of joint task representations in an interference task in which the congruency relies on semantic identity. If task representations are shared between partners in a joint Stroop task (the co-representation account), we hypothesized that items in the response set of one partner might influence the performance of the other. In Experiment 1, pairs of participants sat side by side. Each participant was instructed to press one of two buttons to indicate which of two colors assigned to them was present, ignoring the text and responding only to the pixel color. There were three types of incongruent distractor words: names of colors from their own response set, names of colors from the other partner's response set, and neutral words for colors not used as font colors. The results of Experiment 1 showed that when people were doing this task together, distractor words from the partner's response set interfered more than neutral words and just as much as words from their own response set. However, in three follow-up experiments (Experiments 2a, 2b, and 2c), we found elevated interference for the other response-set words even though no co-actor was present. The overall pattern of results across our study suggests that an alternative response set, regardless of whether it belonged to a co-actor or to a non-social no-go condition, evoked interference comparable to that of the participant's own response set. Our findings are in line with a theory of common coding, in which all events, irrespective of their social nature, are represented and can influence behavior.
Humans can perceive many properties of a creature in motion from the movement of the major joints alone. However, it is likely that some regions of the body are more informative than others, depending on the task. We recorded eye movements while participants performed two tasks with point-light walkers: determining the direction of walking, or determining the walker's gender. To vary task difficulty, walkers were displayed from different view angles and with different degrees of expressed gender. The effects on eye movements were evaluated by generating fixation maps and by analyzing the number of fixations in regions of interest representing the shoulders, pelvis, and feet. In both tasks, participants frequently fixated the pelvis region, but there were relatively more fixations at the shoulders in the gender task and more fixations at the feet in the direction task. Increasing the difficulty of the direction task increased the focus on the foot region. An individual's task performance could not be predicted by their distribution of fixations. However, by showing where observers seek information, the study supports previous findings that the feet play an important part in the perception of walking direction, and that the shoulders and hips are particularly important for the perception of gender.
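The region-of-interest analysis described above might be sketched as below: classify each fixation by the rectangular region it lands in and tally the counts. The ROI coordinates and all names are made up for the demo; the study's actual regions would be defined from the walker's joint positions.

```python
# Illustrative sketch: count fixations falling in rectangular ROIs
# (shoulders, pelvis, feet). Coordinates are normalized (0-1) and invented.

ROIS = {
    "shoulders": (0.3, 0.1, 0.7, 0.25),  # (x0, y0, x1, y1)
    "pelvis": (0.35, 0.45, 0.65, 0.6),
    "feet": (0.25, 0.85, 0.75, 1.0),
}

def roi_of(x, y, rois=ROIS):
    """Return the name of the ROI containing the fixation, or None."""
    for name, (x0, y0, x1, y1) in rois.items():
        if x0 <= x <= x1 and y0 <= y <= y1:
            return name
    return None

def count_fixations(fixations, rois=ROIS):
    """Tally fixations per ROI; fixations outside every ROI are ignored."""
    counts = {name: 0 for name in rois}
    for x, y in fixations:
        name = roi_of(x, y, rois)
        if name:
            counts[name] += 1
    return counts

# Toy fixation sequence: pelvis, feet, shoulders, pelvis
print(count_fixations([(0.5, 0.5), (0.5, 0.9), (0.5, 0.2), (0.5, 0.5)]))
```

Comparing such counts between the direction and gender tasks is what reveals the task-dependent shift toward the feet or the shoulders.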