A parallel interior point algorithm for linear programming on a network of transputers

Bisseling, Rob H.; Doup, T.M.; Loyens, L. D. J. C.

doi:10.1007/bf02024486

Cited by 14 publications

(15 citation statements)

References 27 publications

“…Consequently, cut net n_{8,7} incurs a cost of λ(n_{8,7}) − 1 = 3 − 1 = 2 to the cutsize. Since v_{8,7} Figure 2 processor P_1 is responsible for accumulating the partial nonzero results obtained from the outer-product computations. P_2 will send the partial result c …”

Section: The Results Of Its Local Outer-product Computations and Send

mentioning

confidence: 99%

“…Hence, accumulation of c_{8,7} by P_1 will incur a communication cost of two words. Therefore, we have the equivalence between λ(n_{8,7}) − 1 and the communication volume regarding the accumulation of c_{8,7} in the summation phase. Similarly, since λ(n_{8,4}) − 1 = 1, λ(n_{11,7}) − 1 = 1, and λ(n_{11,4}) − 1 = 1 for the other cut nets, the total cutsize is five.…”

Section: C577

mentioning

confidence: 99%
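
The arithmetic in the excerpt above can be reproduced with a short sketch of the connectivity-minus-one cutsize metric. Only the connectivities (3, 2, 2, 2) come from the quoted text; the function name, the processor ids assigned to each net, and the extra uncut net are hypothetical and chosen purely for illustration.

```python
def connectivity_cutsize(nets):
    """Sum of (lambda(n) - 1) over all nets.

    nets maps a net name to the list of parts (processors) that its
    pins are assigned to; lambda(n) is the number of distinct parts.
    """
    total = 0
    for parts_of_pins in nets.values():
        lam = len(set(parts_of_pins))  # connectivity of this net
        total += lam - 1               # an uncut net (lam == 1) costs 0
    return total


# Hypothetical pin placements that yield the connectivities in the excerpt.
nets = {
    "n_8,7": [1, 2, 3],   # lambda = 3, cost 2
    "n_8,4": [1, 2],      # lambda = 2, cost 1
    "n_11,7": [2, 3],     # lambda = 2, cost 1
    "n_11,4": [1, 3],     # lambda = 2, cost 1
    "uncut": [1, 1],      # lambda = 1, cost 0 (illustrative extra net)
}

cutsize = connectivity_cutsize(nets)
print(cutsize)
```

Under these assumed placements the four cut nets contribute 2 + 1 + 1 + 1, which matches the total of five stated in the excerpt.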

“…Introduction. Sparse matrix-matrix multiplication (SpGEMM) is a kernel operation in a wide variety of scientific applications such as finite element simulations based on domain decomposition [3,22], molecular dynamics (MD) [15,16,17,25,28,29,32,36], and linear programming (LP) [7,8,26], all of which utilize parallel processing technology to reduce execution times. Among these applications, below we exemplify three methods/codes from which we select realistic SpGEMM instances.…”

mentioning

confidence: 99%

Simultaneous Input and Output Matrix Partitioning for Outer-Product--Parallel Sparse Matrix-Matrix Multiplication

Akbudak, Aykanat

2014

SIAM J. Sci. Comput.

Abstract. For outer-product-parallel sparse matrix-matrix multiplication (SpGEMM) of the form C = A×B, we propose three hypergraph models that achieve simultaneous partitioning of input and output matrices without any replication of input data. All three hypergraph models perform conformable one-dimensional (1D) columnwise and 1D rowwise partitioning of the input matrices A and B, respectively. The first hypergraph model performs two-dimensional (2D) nonzero-based partitioning of the output matrix, whereas the second and third models perform 1D rowwise and 1D columnwise partitioning of the output matrix, respectively. This partitioning scheme induces a two-phase parallel SpGEMM algorithm, where communication-free local SpGEMM computations constitute the first phase and the multiple single-node-accumulation operations on the local SpGEMM results constitute the second phase. In these models, the two partitioning constraints defined on weights of vertices encode balancing computational loads of processors during the two separate phases of the parallel SpGEMM algorithm. The partitioning objective of minimizing the cutsize defined over the cut nets encodes minimizing the total volume of communication that will occur during the second phase of the parallel SpGEMM algorithm. An MPI-based parallel SpGEMM library is developed to verify the validity of our models in practice. Parallel runs of the library for a wide range of realistic SpGEMM instances on two large-scale parallel systems JUQUEEN (an IBM BlueGene/Q system) and SuperMUC (an Intel-based cluster) show that the proposed hypergraph models attain high speedup values.

1. Introduction. Sparse matrix-matrix multiplication (SpGEMM) is a kernel operation in a wide variety of scientific applications such as finite element simulations based on domain decomposition [3,22], molecular dynamics (MD) [15,16,17,25,28,29,32,36], and linear programming (LP) [7,8,26], all of which utilize parallel processing technology to reduce execution times. Among these applications, below we exemplify three methods/codes from which we select realistic SpGEMM instances.

In finite element application fields, finite element tearing and interconnecting (FETI) [3,22] type domain decomposition methods are used for numerical solution of engineering problems. In this application, the SpGEMM computation G G^T is performed, where G = R^T B^T, R is the block diagonal basis of the stiffness matrix, and B is the signed matrix with entries −1, 0, 1 describing the subdomain interconnectivity.

In MD application fields, CP2K program [1] performs parallel atomistic and
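
The two-phase scheme described in the abstract (communication-free local outer products, then accumulation of partial results) can be illustrated with a serial toy sketch. This is not the authors' MPI library: the dictionary-based sparse storage, the function names, and the rule that each output entry is accumulated by one of the processors already holding a partial result for it are all assumptions made for illustration.

```python
from collections import defaultdict


def local_outer_products(A_cols, B_rows, ks):
    """Phase 1: partial C from column k of A times row k of B, for k in ks.

    A_cols[k] maps row index i -> a_ik; B_rows[k] maps column index j -> b_kj.
    No data from other processors is needed.
    """
    partial = defaultdict(float)
    for k in ks:
        for i, a in A_cols[k].items():
            for j, b in B_rows[k].items():
                partial[(i, j)] += a * b
    return partial


def outer_product_spgemm(A_cols, B_rows, parts):
    """Simulate both phases; parts[p] lists the k indices owned by processor p.

    Returns C and the number of words sent in the accumulation phase,
    assuming (hypothetically) that one holder of each entry accumulates it.
    """
    partials = [local_outer_products(A_cols, B_rows, ks) for ks in parts]
    C = defaultdict(float)
    holders = defaultdict(set)
    for p, partial in enumerate(partials):
        for ij, value in partial.items():
            C[ij] += value          # Phase 2: accumulate partial results
            holders[ij].add(p)
    words = sum(len(h) - 1 for h in holders.values())
    return dict(C), words


# A = [[1, 2], [0, 3]] stored by columns, B = [[4, 0], [5, 6]] stored by rows.
A_cols = {0: {0: 1.0}, 1: {0: 2.0, 1: 3.0}}
B_rows = {0: {0: 4.0}, 1: {0: 5.0, 1: 6.0}}
C, words = outer_product_spgemm(A_cols, B_rows, parts=[[0], [1]])
print(C, words)
```

In this toy instance only the entry at position (0, 0) receives partial results from both processors, so it is the only one that would require communication during accumulation.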

“…The first parallel interior-point LP solver we are aware of, was developed by Bisseling et al [4] at Shell in the early 1990's. The code was specially written for a network of transputers (distributed memory).…”

Section: Ldrd Final Report On

mentioning

confidence: 99%

“…We restrict our attention to 1-dimensional data decompositions, that is, matrices are partitioned either by rows or columns. There are some indications that 2-dimensional decompositions are better [4,40] and this option should be considered for future versions. We did not attempt 2-dimensional decompositions for the current code because this is more difficult to implement and has not been well tested in Epetra, our underlying parallel matrix library.…”

Section: Data Distribution

mentioning

confidence: 99%
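
The difference between the 1-dimensional and 2-dimensional decompositions mentioned in the excerpt above can be sketched with simple owner functions. The block distributions, the function names, and the toy sparsity pattern (a diagonal plus one dense row) are assumptions for illustration only; they are not taken from parPCx, Epetra, or the transputer code, and a single toy pattern says nothing general about which scheme is better.

```python
from collections import Counter


def owner_1d_rows(i, n_rows, p):
    """1D decomposition: contiguous blocks of rows, one block per processor."""
    block = -(-n_rows // p)  # ceiling division
    return i // block


def owner_2d(i, j, n_rows, n_cols, pr, pc):
    """2D decomposition: blocks of rows and columns on a pr x pc grid."""
    block_r = -(-n_rows // pr)
    block_c = -(-n_cols // pc)
    return (i // block_r, j // block_c)


# Hypothetical 4 x 4 pattern: dense row 0 plus the remaining diagonal entries.
n = 4
nonzeros = [(0, j) for j in range(n)] + [(i, i) for i in range(1, n)]

load_1d = Counter(owner_1d_rows(i, n, 4) for i, _ in nonzeros)
load_2d = Counter(owner_2d(i, j, n, n, 2, 2) for i, j in nonzeros)

print(max(load_1d.values()), max(load_2d.values()))
```

With four processors in both cases, the dense row lands entirely on one processor under the row-wise scheme, while the 2 x 2 grid splits it across two processors.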

LDRD final report on massively-parallel linear programming : the parPCx system.

Parekh, Phillips, Boman

2005

This report summarizes the research and development performed from October 2002 to September 2004 at Sandia National Laboratories under the Laboratory-Directed Research and Development (LDRD) project "Massively-Parallel Linear Programming". We developed a linear programming (LP) solver designed to use a large number of processors. LP is the optimization of a linear objective function subject to linear constraints. Companies and universities have expended huge efforts over decades to produce fast, stable serial LP solvers. Previous parallel codes run on shared-memory systems and have little or no distribution of the constraint matrix. We have seen no reports of general LP solver runs on large numbers of processors.

Our parallel LP code is based on an efficient serial implementation of Mehrotra's interior-point predictor-corrector algorithm (PCx). The computational core of this algorithm is the assembly and solution of a sparse linear system. We have substantially rewritten the PCx code and based it on Trilinos, the parallel linear algebra library developed at Sandia. Our interior-point method can use either direct or iterative solvers for the linear system. To achieve a good parallel data distribution of the constraint matrix, we use a (pre-release) version of a hypergraph partitioner from the Zoltan partitioning library.

We describe the design and implementation of our new LP solver called parPCx and give preliminary computational results. We summarize a number of issues related to efficient parallel solution of LPs with interior-point methods including data distribution, numerical stability, and solving the core linear system using both direct and iterative methods. We describe a number of applications of LP specific to US Department of Energy mission areas and we summarize our efforts to integrate parPCx (and parallel LP solvers in general) into Sandia's massively-parallel integer programming solver PICO (Parallel Integer and Combinatorial Optimizer). We conclude with directions for long-term future algorithmic research and for near-term development that could improve the performance of parPCx.

Acknowledgment

We thank Professor Steven Wright (University of Wisconsin) for his collaboration on this LDRD project. He provided useful insight on interior-point methods and the PCx code. We thank Vicki Howle (Sandia) for assistance with preconditioners. Jonathan Eckstein (Rutgers) designed and implemented the core portion of the PICO ramp up. We thank

ICIAM/GAMM 95 Numerical Analysis, Scientific computing Computer Science

1996

Z. angew. Math. Mech.
