With the increasing popularity of parallel programming environments such as PC clusters, more and more sequential programmers, with little knowledge about parallel architectures and parallel programming, are hoping to write parallel programs. Numerous attempts have been made to develop high-level parallel programming libraries that use abstraction to hide low-level concerns and reduce difficulties in parallel programming. Among them, libraries of parallel skeletons have emerged as a promising way towards this direction. Unfortunately, these libraries are not well accepted by sequential programmers, because of incomplete elimination of lower-level details, ad-hoc selection of library functions, unsatisfactory performance, or lack of convincing application examples. This paper addresses principle of designing skeleton libraries of parallel programming and reports implementation details and practical applications of a skeleton library SkeTo. The SkeTo library is unique in its feature that it has a solid theoretical foundation based on the theory of Constructive Algorithmics, and is practical to be used to describe various parallel computations in a sequential manner.
A method for extracting lexical translations from non-aligned corpora is proposed to cope with the unavailability of large aligned corpus. The assumption that "translations of two co-occurring words in a source language also co-occur in the target language" is adopted and represented in the stochastic matrix formulation. The translation matrix provides the co-occurring information translated from the source into the target. This translated co-occurring information should resemble that of the original in the target when the ambiguity of the translational relation is resolved. An algorithm to obtain the best translation matrix is introduced. Some experiments were performed to evaluate the effectiveness of the ambiguity resolution and the refinement of the dictionary.
Existing work that deals with parallelization of complicated reductions and scans focuses only on formalism and hardly dealt with implementation. To bridge the gap between formalism and implementation, we have integrated parallelization via matrix multiplication into compiler construction. Our framework can deal with complicated loops that existing techniques in compilers cannot parallelize. Moreover, we have sophisticated our framework by developing two sets of techniques. One enhances its capability for parallelization by extracting max-operators automatically, and the other improves the performance of parallelized programs by eliminating redundancy. We have also implemented our framework and techniques as a parallelizer in a compiler. Experiments on examples that existing compilers cannot parallelize have demonstrated the scalability of programs parallelized by our implementation.
Tupling is a well-known transformation tactic to obtain new efficient recursive functions by grouping some recursive functions into a tuple. It may be applied to eliminate multiple traversals over the common data structure. The major difficulty in tupling transformation is to find what functions are to be tupled and how to transform the tupled function into an efficient one. Previous approaches to tupling transformation are essentially based on fold/unfold transformation. Though general, they suffer from the high cost of keeping track of function calls to avoid infinite unfolding, which prevents them from being used in a compiler.To remedy this situation, we propose a new method to expose recursive structures in recursive definitions and show how this structural information can be explored for calculating out efficient programs by means of tupling. Our new tupling calculation algorithm carseliminate most of multiple data traversals and is easy to be implemented.1 Int roduct ion Tupling [Bir84, Chi93] is a well-known transformation tactic to obtain new efficient recursive functions without multiple tmversals over the common data structure (or multiple data traversals for short ), which is achieved by grouping some recursive functions into a tuple. As a typical example, consider the function deepest, which finds a list of leaves that are farthest away from the root of a given tree:deepest (Leaf (a)) =: deepest (Node(l, r-)) =:-. -.-. depth (Leaf (a))~d epth (Node(l, r)) =, [a] deepest (t), depth(l) > depth(r-) deepest(l) ++-deepest(r), depth(l) = depth(r) deepest (r-), otherwise o 1 + maz(depth(l), depth(r))The infix binary function it concatenates two lists and the function mas gives the maximum of the two arguments. Being concise, this definition is quite inefficient because deepest and depth traverse over the same input tree, giving many Permission to make digital/hard copy of part or all this work for personal or classroom use ia granted without fee provided that copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication and its date appear, and notice is given that copying is by permission of ACM, Inc. To copy otherwise, to republish, to post on servars, or to redistribute to lists, requires prior specific permission and/or a fee. ICFP '97 Amsterdam, ND 0 1997 ACM 0-89791 -918 -1/97 /0006 . ..$3.50 Hitachi, Ltd. repeated computations in computing the depth of subtrees. It, however, can be improved with tupling transformation by grouping deepest and depth to a new function (say cM),i.e. dd t = (deepest t, depth t), giving the following efficient program. deepest t -dd (Leaj (a))d d (Node(l, r)) = = --let (u, v) = dd t in u ([a], O) (dpl, 1 + all), dl > dr-(dpi i-t dpr, 1 + all), dl = d.
It has been attracting much attention to make use of list homomorphisms in parallel programming because they ideally suit the divide-and-conquer parallel paradigm. However, they have been usually treated rather informally and ad hoc in the development of efficient parallel programs. What is worse is that some interesting functions, e.g., the maximum segment sum problem, are basically not list homomorphisms. In this article, we propose a systematic and formal way for the construction of a list homomorphism for a given problem so that an efficient parallel program is derived. We show, with several well-known but nontrivial problems, how a straightforward, and "obviously" correct, but quite inefficient solution to the problem can be successfully turned into a semantically equivalent "almost list homomorphism." The derivation is based on two transformations, namely tupling and fusion, which are defined according to the specific recursive structures of list homomorphisms.
No abstract
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.