Kathleen Knobe scite author profile

Concurrent Collections

We introduce the Concurrent Collections (CnC) programming model. CnC supports flexible combinations of task and data parallelism while retaining determinism. CnC is implicitly parallel, with the user providing high-level operations along with semantic ordering constraints that together form a CnC graph. We formally describe the execution semantics of CnC and prove that the model guarantees deterministic computation. We evaluate the performance of CnC implementations on several applications and show that CnC offers performance and scalability equivalent to or better than that offered by lower-level parallel programming models.

Array SSA form and its use in parallelization

Knobe

Sarkar

1998

110

Static single assignment (SSA) form for scalars has been a significant advance. It has simplified the way we think about scalar variables. It has simplified the design of some optimizations and has made other optimizations more effective. Unfortunately, none of this can be said for SSA form for arrays. The current SSA processing of arrays views an array as a single object. But the kinds of analyses that sophisticated compilers need to perform on arrays, for example those that drive loop parallelization, are at the element level. Current SSA form for arrays is incapable of providing the element-level data flow information required for such analyses. In this paper, we introduce an Array SSA form that captures precise element-level data flow information for array variables in all cases. It is general and simple, and coincides with standard SSA form when applied to scalar variables. It can also be used for structures and other variable types that can be modeled as arrays. An important application of Array SSA form is in automatic parallelization. We show how Array SSA form can enable parallelization of any loop that is free of loop-carried true data dependences. This includes loops with loop-carried anti and output dependences, unanalyzable subscript expressions, and arbitrary control flow within an iteration. Array SSA form achieves this level of generality by making manifest its φ functions as runtime computations in cases that are not amenable to compile-time analysis.
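
The idea of a φ function evaluated at run time can be illustrated with a small sketch. It assumes (this detail is not stated in the abstract) that every array definition carries a parallel array of timestamps recording when each element was last written, and that the φ picks, element by element, the value with the later timestamp. The names `phi` and `NEVER` are hypothetical and chosen only for this illustration.

```python
# Illustrative sketch only: an element-level phi evaluated at run time.
# Assumption: each array version has a timestamp array of the same length,
# holding the time of the last write to each element (NEVER if unwritten).

NEVER = -1  # hypothetical sentinel: element was not written by this definition


def phi(new_vals, new_at, old_vals, old_at):
    """Merge two array versions element by element.

    For each index, keep the value whose timestamp is later; on a tie the
    newer definition wins.
    """
    merged_vals, merged_at = [], []
    for nv, nt, ov, ot in zip(new_vals, new_at, old_vals, old_at):
        if nt >= ot:
            merged_vals.append(nv)
            merged_at.append(nt)
        else:
            merged_vals.append(ov)
            merged_at.append(ot)
    return merged_vals, merged_at


# Version 0: the whole array is initialized at time 0.
a0, at0 = [10, 20, 30], [0, 0, 0]
# Version 1: only element 1 is overwritten, at time 2.
a1, at1 = [None, 99, None], [NEVER, 2, NEVER]

a2, at2 = phi(a1, at1, a0, at0)
print(a2, at2)
```

Because the merge is decided per element from recorded timestamps, it does not need the subscript expressions to be analyzable at compile time, which is the property the abstract attributes to runtime φ functions.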

Data optimization: Allocation of arrays to reduce communication on SIMD machines

Knobe

Lukas

Steele

1990

Journal of Parallel and Distributed Computing

178

Performance evaluation of concurrent collections on high-performance multicore computing systems

Chandramowlishwaran

Knobe

Vuduc

2010

This paper is the first extensive performance study of a recently proposed parallel programming model, called Concurrent Collections (CnC). In CnC, the programmer expresses her computation in terms of application-specific operations, partially-ordered by semantic scheduling constraints. The CnC model is well-suited to expressing asynchronous-parallel algorithms, so we evaluate CnC using two dense linear algebra algorithms in this style for execution on state-of-the-art multicore systems: (i) a recently proposed asynchronous-parallel Cholesky factorization algorithm, (ii) a novel and non-trivial "higher-level" partly-asynchronous generalized eigensolver for dense symmetric matrices. Given a well-tuned sequential BLAS, our implementations match or exceed competing multithreaded vendor-tuned codes by up to 2.6×. Our evaluation compares with alternative models, including ScaLAPACK with a shared memory MPI, OpenMP, Cilk++, and PLASMA 2.0, on Intel Harpertown, Nehalem, and AMD Barcelona systems. Looking forward, we identify new opportunities to improve the CnC language and runtime scheduling and execution.

Unified Analysis of Array and Object References in Strongly Typed Languages

Fink

Knobe

Sarkar

2000

Declarative aspects of memory management in the concurrent collections parallel programming model

Budimlić

Chandramowlishwaran

Knobe

et al. 2009

Concurrent Collections (CnC) [8] is a declarative parallel language that allows the application developer to express their parallel application as a collection of high-level computations called steps that communicate via single-assignment data structures called items. A CnC program is specified in two levels. At the bottom level, an existing imperative language implements the computations within the individual computation steps. At the top level, CnC describes the relationships (ordering constraints) among the steps. The memory management mechanism of the existing imperative language manages data whose lifetime is within a computation step. A key limitation in the use of CnC for long-running programs is the lack of memory management and garbage collection for data items with lifetimes that are longer than a single computation step. Although the goal here is the same as that of classical garbage collection, the nature of the problem, and therefore the nature of the solution, is distinct. The focus of this paper is the memory management problem for these data items in CnC. We introduce a new declarative slicing annotation for CnC that can be transformed into a reference counting procedure for memory management. Preliminary experimental results obtained from a Cholesky example show that our memory management approach can result in space reductions for CnC data items of up to 28× relative to the baseline case of standard CnC without memory management.
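
The combination of single-assignment items and reference counting can be sketched as follows. This is not the paper's slicing annotation or any real CnC API: the class `ItemCollection`, its methods, and the `get_count` parameter are hypothetical, and the sketch simply assumes that the number of steps that will read each item is known when the item is put.

```python
# Illustrative sketch only: a single-assignment item collection whose items
# are freed once every expected reader has consumed them.
# Assumption: the producer knows how many gets each item will receive.


class ItemCollection:
    def __init__(self):
        self._live = {}       # tag -> [value, remaining_gets]
        self._seen = set()    # every tag ever put, to enforce single assignment

    def put(self, tag, value, get_count):
        """Store an item exactly once, with the number of reads it expects."""
        if tag in self._seen:
            raise ValueError(f"item {tag!r} was already assigned")
        if get_count < 1:
            raise ValueError("get_count must be at least 1")
        self._seen.add(tag)
        self._live[tag] = [value, get_count]

    def get(self, tag):
        """Read an item; the last expected read releases its storage."""
        entry = self._live[tag]   # KeyError if never put or already freed
        entry[1] -= 1
        if entry[1] == 0:
            del self._live[tag]
        return entry[0]

    def live_count(self):
        """Number of items currently held in memory."""
        return len(self._live)


items = ItemCollection()
items.put(("tile", 0, 0), 4.0, get_count=2)   # two steps will read this tile
first = items.get(("tile", 0, 0))
live_after_first = items.live_count()
second = items.get(("tile", 0, 0))
live_after_second = items.live_count()
print(first, second, live_after_first, live_after_second)
```

In this sketch the item stays alive across steps until its declared readers are done, which is the kind of lifetime the abstract says ordinary per-step memory management cannot handle.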

Stampede: a cluster programming middleware for interactive stream-oriented applications

Ramachandran

Nikhil

Rehg

et al. 2003

IEEE Trans. Parallel Distrib. Syst.

Emerging application domains such as interactive vision, animation, and multimedia collaboration display dynamic scalable parallelism and high computational requirements, making them good candidates for executing on parallel architectures such as SMPs and clusters of SMPs. Stampede is a programming system that has many of the needed functionalities such as high-level data sharing, dynamic cluster-wide threads and their synchronization, support for task and data parallelism, handling of time-sequenced data items, and automatic buffer management. In this paper, we present an overview of Stampede, the primary data abstractions, the algorithmic basis of garbage collection, and the issues in implementing these abstractions on a cluster of SMPs. We also present a set of micromeasurements along with two multimedia applications implemented on top of Stampede, through which we demonstrate the low overhead of this runtime and its suitability for streaming multimedia applications.

Dead timestamp identification in Stampede

Harel

Mandviwala

Knobe

et al.
