2019
DOI: 10.1049/joe.2018.9178
GPU computing performance analysis on matrix multiplication

Cited by 7 publications (4 citation statements)
References 16 publications (18 reference statements)
“…The performance of the MMM has also been evaluated on diverse platforms such as GPUs and multi-core processors, in terms of execution time [35,36,37], among other metrics. Regarding reliability enhancement, both hardware and software solutions exist.…”
Section: Related Work
confidence: 99%
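The execution-time comparisons this statement refers to can be illustrated with a minimal micro-benchmark. The sketch below times a NumPy matrix multiply on the CPU; it is a hypothetical stand-in for the surveyed GPU and multi-core measurements, not the setup used in the cited papers.

```python
import time
import numpy as np

def time_matmul(n, repeats=3):
    """Return the best wall-clock time (seconds) over `repeats` runs
    of an n x n matrix multiply, the execution-time metric that
    platform comparisons of this kind typically report."""
    A = np.random.rand(n, n)
    B = np.random.rand(n, n)
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        A @ B  # the kernel under test
        times.append(time.perf_counter() - start)
    return min(times)

print(f"256x256 matmul: {time_matmul(256):.6f} s")
```

Taking the minimum over several repeats reduces noise from caching and scheduling, which matters when comparing platforms.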
“…Here, we represent membrane systems as matrices that can be divided into sub-blocks to balance the number of threads used in GPU thread blocks [35,36]. The objects in the membranes are subsequently assigned to matrix entries (Figure 4), thereby increasing the efficiency with which the matrix allocates the threads in the thread blocks.…”
Section: Proposed Approach
confidence: 99%
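The sub-block decomposition described in this statement, splitting a matrix so each sub-block maps onto one fixed-size GPU thread block, can be sketched on the CPU as follows. The function name and the zero-padding choice are illustrative assumptions, not the cited papers' code.

```python
import numpy as np

def partition_into_subblocks(matrix, block_dim):
    """Split a 2-D matrix into block_dim x block_dim sub-blocks,
    zero-padding the edges so every sub-block is full-sized.
    Each sub-block would be assigned to one GPU thread block."""
    rows, cols = matrix.shape
    pad_r = (-rows) % block_dim
    pad_c = (-cols) % block_dim
    padded = np.pad(matrix, ((0, pad_r), (0, pad_c)))
    n_r = padded.shape[0] // block_dim
    n_c = padded.shape[1] // block_dim
    # Result shape: (grid_rows, grid_cols, block_dim, block_dim)
    return padded.reshape(n_r, block_dim, n_c, block_dim).swapaxes(1, 2)

objects = np.arange(30).reshape(5, 6)      # e.g. membrane objects laid out as matrix entries
blocks = partition_into_subblocks(objects, 4)
print(blocks.shape)  # (2, 2, 4, 4): a 2x2 grid of 4x4 thread-block tiles
```

Fixing every sub-block to the thread-block size is what lets each tile use all of its block's threads, which is the load-balancing effect the statement describes.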
“…It determines whether assigning sub-matrices, including additional membranes, to each thread block will reduce communication between threads and increase GPU occupancy. Matrices are apportioned into sub-blocks to fully utilize the maximum possible number of threads in each thread block [35,36]. This method eliminates shortcomings of previously implemented methods, which applied one of the following two notions: (i) allocating any number of objects in every membrane to every thread block, or (ii) first designating an active membrane system in which the number of objects in every membrane reaches the maximum number of threads in a GPU thread block.…”
Section: Introduction
confidence: 99%
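A conventional way to realize this sub-block scheme for the multiplication itself is a tiled (blocked) matrix multiply, in which each output tile corresponds to one thread block. The following is a CPU sketch of that tiling under stated assumptions, not the cited implementation.

```python
import numpy as np

def tiled_matmul(A, B, tile=16):
    """Blocked C = A @ B. Each (i, j) output tile plays the role of
    one GPU thread block; the inner k-loop steps through tiles of
    A and B the way shared-memory staging would on a GPU."""
    n, m = A.shape
    m2, p = B.shape
    assert m == m2, "inner dimensions must match"
    C = np.zeros((n, p))
    for i in range(0, n, tile):            # grid of thread blocks (rows)
        for j in range(0, p, tile):        # grid of thread blocks (cols)
            for k in range(0, m, tile):    # staged accumulation over k
                C[i:i+tile, j:j+tile] += (
                    A[i:i+tile, k:k+tile] @ B[k:k+tile, j:j+tile]
                )
    return C
```

NumPy slicing clips ragged edges automatically, so matrix dimensions need not be multiples of the tile size; on a real GPU the edge tiles are the under-occupied blocks the statement's balancing scheme targets.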
“…The key is to distribute the computation among multiple threads so that it can be done concurrently. The computation can be divided across multiple memory hierarchies, where sub-tasks use different levels of memory, or across multiple computation resources such as threads or machines [93,94,95].…”
Section: Advances In Matrix Multiplication
confidence: 99%
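The thread-level division of work this statement describes can be imitated on the CPU with a thread pool, each worker computing one band of output rows. This is a sketch only; the cited works target GPU threads and distributed machines, and the function name is an assumption.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def parallel_matmul(A, B, n_workers=4):
    """Compute C = A @ B by splitting the output rows into bands and
    giving each band to its own thread, a CPU stand-in for dividing
    the computation across GPU threads or machines."""
    n = A.shape[0]
    C = np.empty((n, B.shape[1]))
    bands = np.array_split(np.arange(n), n_workers)

    def compute_band(rows):
        # Each worker writes a disjoint row band, so no locking is needed.
        C[rows] = A[rows] @ B

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        list(pool.map(compute_band, bands))
    return C
```

Partitioning by output rows makes the sub-tasks independent, which is the property that lets the same decomposition scale from threads on one device to multiple machines.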