Theories of rhythmic perception propose that perceptual sampling operates periodically, with alternating moments of high and low responsiveness to sensory input. This rhythmic sampling is linked to neural oscillations and is thought to produce fluctuations in behavioural outcomes. Previous studies have revealed theta- and alpha-band behavioural oscillations in low-level visual tasks and object categorization. However, less is known about fluctuations in face perception, for which the human brain has developed a highly specialized network. To investigate this, we ran an online study (N = 179) combining the dense sampling technique with a dual-target rapid serial visual presentation (RSVP) paradigm. In each trial, a stream of object images was presented at 30 Hz and participants were tasked with detecting whether a face image appeared in the sequence. On some trials, one or two (identical) face images (the target) were embedded in the stream. On dual-target trials, the targets were separated by an interstimulus interval (ISI) that varied from 0 to 633 ms. The task was to indicate the presence of the target and, if present, its gender. Performance varied as a function of ISI, with a significant behavioural oscillation in the face detection task at 7.5 Hz, driven mainly by the male target faces. This finding is consistent with a high-theta-band fluctuation in visual processing. Such fluctuations might reflect rhythmic attentional sampling or, alternatively, feedback loops involved in updating top-down predictions.

The brain is confronted with a constant influx of sensory input and yet is able to form a stable, ongoing visual experience. Despite this seemingly continuous operation, evidence has shown that sensory sampling works periodically, with moments of high and low responsiveness to external stimulation interleaved with each other.
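To make the dense sampling logic concrete, the following is a minimal sketch of how a behavioural oscillation could be recovered from accuracy measured at each ISI. It is not the analysis pipeline of this study. It assumes that ISIs were sampled at every frame of the 30 Hz stream (20 levels, 0 to about 633 ms), and the function names, the simulated accuracies, and the detrending and windowing choices are all illustrative assumptions. Under that assumption the spectrum has a resolution of 30 / 20 = 1.5 Hz and a Nyquist limit of 15 Hz, so 7.5 Hz falls exactly on a frequency bin.

```python
import numpy as np

FRAME_RATE_HZ = 30.0  # stream presentation rate stated in the text
N_ISI = 20            # assumption: ISIs of 0..19 frames, i.e. 0 to ~633 ms


def isi_values_ms(n_isi=N_ISI, frame_rate_hz=FRAME_RATE_HZ):
    """ISI levels in ms, assuming one level per frame of the stream."""
    return np.arange(n_isi) * 1000.0 / frame_rate_hz


def amplitude_spectrum(accuracy, frame_rate_hz=FRAME_RATE_HZ):
    """Amplitude spectrum of an accuracy-by-ISI series.

    The series is linearly detrended and Hann-windowed before the FFT
    (illustrative preprocessing choices, not taken from the study).
    """
    accuracy = np.asarray(accuracy, dtype=float)
    n = accuracy.size
    t = np.arange(n)
    slope, intercept = np.polyfit(t, accuracy, 1)
    detrended = accuracy - (slope * t + intercept)
    windowed = detrended * np.hanning(n)
    freqs = np.fft.rfftfreq(n, d=1.0 / frame_rate_hz)
    amps = np.abs(np.fft.rfft(windowed)) / n
    return freqs, amps


def peak_frequency(freqs, amps):
    """Frequency of the largest spectral peak, ignoring the 0 Hz bin."""
    return float(freqs[1:][np.argmax(amps[1:])])


# Simulated (not real) data: mean accuracy of 0.7 with a 7.5 Hz modulation.
rng = np.random.default_rng(0)
isi_s = isi_values_ms() / 1000.0
simulated_accuracy = (
    0.7
    + 0.05 * np.cos(2 * np.pi * 7.5 * isi_s)
    + rng.normal(0.0, 0.005, N_ISI)
)

freqs, amps = amplitude_spectrum(simulated_accuracy)
print("frequency resolution (Hz):", freqs[1] - freqs[0])
print("peak frequency (Hz):", peak_frequency(freqs, amps))
```

In practice, the significance of such a peak is usually assessed against a null distribution, for example by shuffling the ISI labels and recomputing the spectrum many times; that step is omitted here.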
This temporal structure of visual processing is reflected in ongoing neural oscillations, where the phase and power of prestimulus theta- (3-7 Hz) and alpha-band (8-12 Hz) activity have been shown to predict perceptual