Stimulus-driven or reactive control refers to the modulation of attention poststimulus onset via retrieval of learned control settings associated with task stimuli. The present study asked which stimulus-driven control setting "wins" the competition when more than 1 is available to guide attention. Utilizing an item-specific proportion congruence manipulation in a picture-word Stroop task, 7 experiments examined competition between item-level and category-level control settings. In Experiment 1, category-level control dominated as evidenced by transfer of control to unique 50% congruent items (exemplars) from biased (33% or 67% congruent) animal categories. In Experiment 2, the dominance persisted: transfer was observed even for inconsistent transfer items (e.g., 83% congruent bird from a 33% congruent bird category). Recategorization of the exemplars prior to the Stroop task (Experiment 3a) successfully shifted the dominance to item-level control as did changing the Stroop task goal (Experiment 4a); however, exposure to the exemplars (Experiment 3b) and individuation training prior to the Stroop task did not (Experiments 3c and 4b). These novel findings suggest category-level control dominates in guiding attention poststimulus onset, but this dominance is dependent on contextual features (i.e., mutable). We propose a salience account of dominance and discuss implications for item-based computational models. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Prominent models of control assume that conflict and the probability of conflict are signals used by control processes that regulate attention. For example, when conflict is frequent across preceding trials (i.e., high probability of conflict), control processes bias attention toward goal-relevant information on subsequent trials. An important but underspecified question regards the metacontrol property of timescale, that is, how far back does the control system "look" to determine the probability of conflict? To address this question, Aben, Verguts, and Van den Bussche (2017) developed a statistical model quantifying the timescale of control. In a flanker task, they observed short timescales for lists with a low probability of conflict (which induce reactive control) and long timescales for lists with a high probability of conflict (which induce proactive control). To investigate the domain generality of these timescales, we applied their model to two additional conflict tasks that manipulated the list-wide probability of conflict. Our findings replicated Aben et al., suggesting that metacontrol may be task general with respect to timescales operating on the list level. We subsequently modified their model to examine timescale differences for items in the same list that differed in their probability of conflict but not the type of control engaged. We failed to detect a difference in timescales between items. Collectively, the findings demonstrate that differences in the timescale of control are task general and suggest that timescale differences are driven by the type of control engaged and not by the probability of conflict per se.
Much research has shown that humans can allocate attentional control differentially to multiple locations based on the amount of conflict historically associated with a given location. Additionally, once established, these control settings can transfer to nearby locations that themselves have no conflict bias. Here we examined whether these control settings also extend to nearby locations that are presented outside of the original frame of reference of biased stimuli. During training, participants first responded to biased flanker stimuli that were likely high conflict in one location and low conflict in another location. Then they were exposed to two sets of unbiased stimuli presented in novel transfer locations outside of the established reference frame of biased stimuli. Across three experiments, attentional control settings transferred beyond the reference frame, including when there was a visual border (Experiment 2) or meaningful categorical distinction (Experiment 3) delineating training and transfer locations. These novel findings further support the idea that stimulus-driven attentional control can be flexibly allocated, perhaps in a categorical manner.
In this work, we explore some of the challenges in recognizing children's speech with automatic speech recognition (ASR) systems developed using adults' speech. In such mismatched ASR tasks, recognition performance degrades severely because of the gross mismatch in acoustic attributes between the two groups of speakers. Among the various sources of mismatch, we focus on the large differences in average pitch between adult and child speakers. Earlier studies have shown that the Mel-filterbank employed in feature extraction does not smooth out the pitch harmonics sufficiently, particularly for high-pitched child speakers. As a result, the acoustic features derived for adult and child speakers turn out to be significantly mismatched. To address this problem, we propose a simple technique based on adaptive liftering for deriving pitch-robust features. This reduces the sensitivity of the acoustic features to gross variations in pitch across speakers. The proposed features are found to improve performance in the context of a deep neural network based ASR system. Further, with the use of existing feature normalization techniques, additional gains are noted.
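The abstract does not spell out the adaptive liftering procedure, so the following is only a minimal sketch of the general idea under stated assumptions: given the real cepstrum of a frame and a pitch estimate, keep the low-quefrency coefficients below a fraction of the pitch period and zero the rest, so that the pitch peak (near quefrency `sample_rate / f0` samples) is excluded. The function name `adaptive_lifter` and the `margin` parameter are hypothetical and are not taken from the paper.

```python
def adaptive_lifter(cepstrum, sample_rate, f0, margin=0.75):
    """Low-time lifter whose cutoff follows the frame's pitch period.

    Assumption (not from the paper): the cutoff is margin * pitch period,
    in samples. `cepstrum` is a symmetric real cepstrum of length N, so
    both the leading coefficients and their mirrored tail are kept.
    """
    n = len(cepstrum)
    if f0 <= 0:
        # Unvoiced frame: no pitch peak to remove, return a copy unchanged.
        return list(cepstrum)
    pitch_period = sample_rate / f0          # quefrency of the pitch peak
    cutoff = max(1, int(margin * pitch_period))
    return [
        c if (i < cutoff or n - i < cutoff) else 0.0
        for i, c in enumerate(cepstrum)
    ]


# Illustration with a flat dummy cepstrum: a higher pitch gives a shorter
# pitch period, so fewer coefficients survive the lifter.
flat = [1.0] * 64
child = adaptive_lifter(flat, 8000, 400.0)   # pitch period 20 samples
adult = adaptive_lifter(flat, 8000, 200.0)   # pitch period 40 samples
```

Tying the cutoff to the per-frame pitch, rather than fixing it, is what makes the lifter "adaptive" in this sketch; a fixed cutoff tuned for adult pitch would leave the pitch peak of a high-pitched child speaker inside the retained region.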
The performance of speech recognition systems degrades severely in noisy environments. In this work, we present a method to improve the performance of a Mizo digit recognition system in different noisy conditions using data augmentation and tonal information. Mizo is a tonal language, and each digit in Mizo is spoken with one of the four tones present in the language. The tone therefore carries information about the spoken digit. Tone is related to the excitation source, and excitation source information is more robust to noise than vocal tract information. The normalized cross-correlation function, pitch, and pitch dynamics are used as additional features to represent the tonal information, and they improve on Mel frequency cepstral coefficient (MFCC) based baseline systems in noisy conditions. Data augmentation is another technique used in the literature for robust speech recognition. Using data augmentation further improves the performance of the Mizo digit recognition system.
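As an illustration of the tonal features named above, the sketch below computes a normalized cross-correlation function (NCCF) for one frame and derives a pitch estimate from its peak. It uses the standard textbook definition of the NCCF; the function names, frame sizes, and lag range are assumptions chosen for the example, and the abstract does not say how the authors computed these features.

```python
import math


def nccf(samples, window, max_lag):
    """NCCF of samples[:window] against copies shifted by 0..max_lag."""
    if len(samples) < window + max_lag:
        raise ValueError("need at least window + max_lag samples")
    ref = samples[:window]
    ref_energy = sum(v * v for v in ref)
    scores = []
    for lag in range(max_lag + 1):
        seg = samples[lag:lag + window]
        seg_energy = sum(v * v for v in seg)
        denom = math.sqrt(ref_energy * seg_energy)
        cross = sum(a * b for a, b in zip(ref, seg))
        # Silent frames have zero energy; report zero correlation.
        scores.append(cross / denom if denom > 0 else 0.0)
    return scores


def estimate_pitch(samples, sample_rate, window, min_lag, max_lag):
    """Return (pitch in Hz, NCCF peak value); min_lag must be at least 1."""
    scores = nccf(samples, window, max_lag)
    best_lag = max(range(min_lag, max_lag + 1), key=lambda k: scores[k])
    return sample_rate / best_lag, scores[best_lag]


# Synthetic 200 Hz tone sampled at 8 kHz (period of 40 samples).
tone = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(400)]
pitch_hz, peak = estimate_pitch(tone, 8000, window=200, min_lag=20, max_lag=60)
```

In a feature pipeline of this kind, the per-frame pitch, its frame-to-frame change (pitch dynamics), and the NCCF peak value would be appended to the MFCC vector; the peak value also serves as a rough voicing indicator.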