1994
DOI: 10.1006/jcph.1994.1001

GEMMW: A Portable Level 3 BLAS Winograd Variant of Strassen's Matrix-Matrix Multiply Algorithm

Cited by 63 publications (56 citation statements)
References 3 publications
“…The best asymptotic complexity for this computation has been successively improved since then, down to O(n^2.376) in [5] (see [3,4] for a review), but Strassen-Winograd's still remains one of the most practicable. Former studies on how to turn this algorithm into practice can be found in [2,9,10,6] and references therein for numerical computation and in [15,7] for computations over a finite field.…”
Section: Introduction (mentioning)
confidence: 99%
“…For example, the schedule proposed in [DHSS94], with two temporaries only, can be revisited for the sequence seen in §2.2 as follows:…”
Section: S Scheduling (mentioning)
confidence: 99%
“…We compare the internal cubic GP-Pari implementation (Version 2.3.2, with GMP 4.2.1) and a GP-Pari script implementing Strassen-Winograd's ("SW", one level only) scheduled as in [DHSS94] with another scripts using the new proposed sequence described in §2.2 (one recursion level), then one implementing ψ(A) → ψ(A^2) which is the building brick for exponentiation, and a last one using also fig.1 for 2×2 (or 3×3) matrices. The last column gives percentage difference between ψ and non-ψ implementation.…”
Section: T Timings (mentioning)
confidence: 99%
“…These MAs make the algorithm faster, however they make it weakly numerically stable and not unstable [Higham 2002]. As the starting point for our hybrid adaptive algorithm we use Winograd's algorithm (e.g., [Douglas et al 1994]), which requires only 15 MAs. Thus, Winograd's algorithm has, like the original by Strassen, asymptotic operation count O(n^2.81), but it has a smaller constant factor and thus fewer operations than Strassen's algorithm.…”
Section: Introduction (mentioning)
confidence: 99%
“…(3) At every recursive step, we use only 3 temporary matrices, which is the minimum number possible [Douglas et al 1994]. Furthermore, we differ from Douglas et al work in that we do not perform redundant computations for odd-size matrices.…”
Section: Introduction (mentioning)
confidence: 99%
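
Several of the statements above refer to Winograd's variant of Strassen's algorithm, which uses 7 block multiplications and 15 block additions (MAs) per recursion level. As a rough illustration only, the following pure-Python sketch applies one level of a commonly presented form of that variant to square matrices of even order. It is not the GEMMW code and does not reproduce any of the temporary-saving schedules discussed in the citing papers. The function names (`mat_add`, `mat_sub`, `mat_mul`, `split`, `winograd_one_level`) are hypothetical, and sign conventions for the intermediate terms differ between presentations.

```python
def mat_add(X, Y):
    """Elementwise sum of two equally sized matrices (lists of rows)."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def mat_sub(X, Y):
    """Elementwise difference of two equally sized matrices."""
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def mat_mul(X, Y):
    """Classical cubic product, used here for the 7 block products."""
    cols = list(zip(*Y))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in X]


def split(M):
    """Split an even-order square matrix into its four quadrants."""
    h = len(M) // 2
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])


def winograd_one_level(A, B):
    """One level of a Strassen-Winograd scheme: 7 products, 15 additions."""
    n = len(A)
    if n == 0 or n % 2 != 0:
        raise ValueError("this sketch assumes a non-empty even order")
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)

    # 8 pre-additions on the operand blocks.
    S1 = mat_add(A21, A22)
    S2 = mat_sub(S1, A11)
    S3 = mat_sub(A11, A21)
    S4 = mat_sub(A12, S2)
    T1 = mat_sub(B12, B11)
    T2 = mat_sub(B22, T1)
    T3 = mat_sub(B22, B12)
    T4 = mat_sub(T2, B21)

    # 7 block products (a recursive version would recurse here instead).
    P1 = mat_mul(A11, B11)
    P2 = mat_mul(A12, B21)
    P3 = mat_mul(S4, B22)
    P4 = mat_mul(A22, T4)
    P5 = mat_mul(S1, T1)
    P6 = mat_mul(S2, T2)
    P7 = mat_mul(S3, T3)

    # 7 post-additions combining the products into the result blocks.
    C11 = mat_add(P1, P2)
    U2 = mat_add(P1, P6)
    U3 = mat_add(U2, P7)
    U4 = mat_add(U2, P5)
    C12 = mat_add(U4, P3)
    C21 = mat_sub(U3, P4)
    C22 = mat_add(U3, P5)

    # Reassemble the four quadrants into one matrix.
    top = [left + right for left, right in zip(C11, C12)]
    bottom = [left + right for left, right in zip(C21, C22)]
    return top + bottom
```

Calling `winograd_one_level(A, B)` on two 4×4 integer matrices should give the same result as `mat_mul(A, B)`. This sketch holds all intermediate blocks in memory at once, so it says nothing about the two- or three-temporary schedules the citing papers attribute to Douglas et al.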