2011
DOI: 10.1117/12.873004

Performance evaluation of Canny edge detection on a tiled multicore architecture

Abstract: In the last few years, a variety of multicore architectures have been used to parallelize image processing applications. In this paper, we focus on assessing the parallel speed-ups of different Canny edge detection parallelization strategies on the Tile64, a tiled multicore architecture developed by the Tilera Corporation. Included in these strategies are different ways Canny edge detection can be parallelized, as well as differences in data management. The two parallelization strategies examined were loop-lev…

Cited by 5 publications (3 citation statements)
References 2 publications
“…Several threshold techniques [9,10,11,12,13] have been implemented to adapt threshold values to produce high quality edges. Some studies discussed edge detection parallelization to increase the execution speedup [14,15,16,17].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…The CED included in the Intel OpenCV Library [3] offers fast edge detection, but does not perform optimally in a noisy environment (see Fig. 1). CED was implemented on a Tilera processor in [10], using loop-level parallelism and domain decomposition. The results of [10] show that domain decomposition offers better scalability compared to loop-level parallelism.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…CED was implemented on a Tilera processor in [10], using loop-level parallelism and domain decomposition. The results of [10] show that domain decomposition offers better scalability compared to loop-level parallelism. However, their implementation is restricted to the Tilera64 architecture, and results are shown for only eight cores out of sixty-four available.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
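
The abstract and the citation statements above contrast loop-level parallelism with domain decomposition. The sketch below illustrates the structural difference on a single gradient stage only. It is not the paper's implementation: the function names, the central-difference gradient, the one-row halo, and the use of Python threads are all assumptions made for illustration, and Python threads will not reproduce the speed-ups reported for the Tile64.

```python
from concurrent.futures import ThreadPoolExecutor


def gradient_row(img, y):
    """Gradient magnitude (|gx| + |gy|) for one row, with clamped borders."""
    h, w = len(img), len(img[0])
    row = []
    for x in range(w):
        gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
        gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
        row.append(abs(gx) + abs(gy))
    return row


def gradient_magnitude(img):
    """Serial reference: process every row in order."""
    return [gradient_row(img, y) for y in range(len(img))]


def gradient_loop_level(img, workers=4):
    """Loop-level parallelism (illustrative): the row loop of one stage is
    shared among workers, and all workers read the same full image."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda y: gradient_row(img, y), range(len(img))))


def split_with_halo(img, n_strips, halo=1):
    """Cut the image into horizontal strips, each padded with `halo`
    neighbouring rows so a worker never needs data owned by another strip."""
    h = len(img)
    tasks = []
    for i in range(n_strips):
        r0, r1 = i * h // n_strips, (i + 1) * h // n_strips
        if r0 == r1:
            continue  # more strips than rows: skip empty strips
        lo, hi = max(r0 - halo, 0), min(r1 + halo, h)
        # (private copy of the strip, first owned row, one past last owned row)
        tasks.append((img[lo:hi], r0 - lo, r1 - lo))
    return tasks


def process_strip(task):
    """Run the stage on a private strip and keep only the rows it owns."""
    strip, start, end = task
    return gradient_magnitude(strip)[start:end]


def gradient_domain_decomposed(img, n_strips=4):
    """Domain decomposition (illustrative): each worker gets its own strip
    plus halo rows, processes it independently, and the results are stitched."""
    with ThreadPoolExecutor(max_workers=n_strips) as pool:
        parts = list(pool.map(process_strip, split_with_halo(img, n_strips)))
    return [row for part in parts for row in part]
```

Both variants return the same result as the serial version; they differ in how work and data are assigned. A full Canny pipeline has further stages (smoothing, non-maximum suppression, hysteresis thresholding) that this sketch does not cover, and a wider filter would need a correspondingly larger halo.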