2020
DOI: 10.1371/journal.pcbi.1008022

Depth in convolutional neural networks solves scene segmentation

Abstract: Feed-forward deep convolutional neural networks (DCNNs) are, under specific conditions, matching and even surpassing human performance in object recognition in natural scenes. This performance suggests that the analysis of a loose collection of image features could support the recognition of natural object categories, without dedicated systems to solve specific visual subtasks. Research in humans however suggests that while feedforward activity may suffice for sparse scenes with isolated objects, additional vi…

Citations: cited by 22 publications (13 citation statements)
References: 52 publications (75 reference statements)
“…Emergence of object category representations can be delayed, for example when objects are occluded or are hard to categorize [44][45][46]. This suggests that object category representations might emerge with a delay also when objects appear on cluttered backgrounds, for example because additional grouping and segmentation operations are necessary that depend on recurrence and hence require additional time [47][48][49].…”
Section: Object Category Representations In Time (mentioning)
confidence: 99%
“…Recently, a multitude of studies have reconciled these seemingly inconsistent findings by indicating that recurrent processes might be employed adaptively, depending on the visual input: while feed-forward activity might suffice for simple scenes with isolated objects, more complex scenes or more challenging conditions (e.g. objects that are occluded or degraded), may need additional visual operations ('routines') requiring recurrent computations (Groen et al, 2018; Tang et al, 2018; Kar et al, 2019; Rajaei et al, 2019; Seijdel et al, 2020). For objects in isolation, or very simple scenes, rapid recognition may thus rely on a coarse and unsegmented feed-forward representation (Crouzet and Serre, 2011), while for more cluttered images recognition may require explicit encoding of spatial relationships between parts.…”
Section: Introduction (mentioning)
confidence: 99%
“…Besides, in the gender group, the males’ formats are mainly located on the low-frequency area, and subsequently, the texture on the males’ spectrogram repeats more irregularly compared to females’. Since the classic convolutional kernel utilized in these products is less effective in generalizing such irregular patterns due to shape mismatch [52], the neural network-based feature extraction is restricted to further unearth the voice identity on these three models [53, 54]. Moreover, similar situations can be observed in the rest research voice biometric models.…”
Section: Results Analysis and Discussion (mentioning)
confidence: 99%