Hideya Iwasaki scite author profile

With the increasing popularity of parallel programming environments such as PC clusters, more and more sequential programmers, with little knowledge of parallel architectures and parallel programming, are hoping to write parallel programs. Numerous attempts have been made to develop high-level parallel programming libraries that use abstraction to hide low-level concerns and reduce the difficulties of parallel programming. Among them, libraries of parallel skeletons have emerged as a promising step in this direction. Unfortunately, these libraries are not well accepted by sequential programmers, because of incomplete elimination of lower-level details, ad hoc selection of library functions, unsatisfactory performance, or a lack of convincing application examples. This paper addresses the principles of designing skeleton libraries for parallel programming and reports implementation details and practical applications of the skeleton library SkeTo. The SkeTo library is unique in that it has a solid theoretical foundation based on the theory of Constructive Algorithmics, and is practical for describing various parallel computations in a sequential manner.

Characterizing Feasible Pattern Sets with a Minimum Number of Breaks

Miyashiro

Iwasaki

Matsui

2003

Extraction of lexical translations from non-aligned corpora

Tanaka

Iwasaki

1996

A method for extracting lexical translations from non-aligned corpora is proposed to cope with the unavailability of large aligned corpora. The assumption that "translations of two co-occurring words in a source language also co-occur in the target language" is adopted and represented in a stochastic matrix formulation. The translation matrix carries the co-occurrence information of the source language over into the target language. This translated co-occurrence information should resemble the original co-occurrence information of the target language when the ambiguity of the translational relation is resolved. An algorithm to obtain the best translation matrix is introduced. Some experiments were performed to evaluate the effectiveness of the ambiguity resolution and the refinement of the dictionary.
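The idea of comparing translated co-occurrence information with the target's own co-occurrence information can be illustrated with a toy calculation. The sketch below is not taken from the paper: it assumes that the translated co-occurrence is computed as T^T A T (with A the source co-occurrence matrix and T a row-stochastic translation matrix) and that closeness is measured by a sum of absolute differences. All matrices and names are hypothetical.

```python
# Illustrative sketch only. The T^T A T formulation, the distance measure,
# and every matrix below are assumptions, not taken from the paper.

def matmul(x, y):
    """Plain nested-list matrix product."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]

def transpose(x):
    return [list(row) for row in zip(*x)]

def translated_cooccurrence(t, a):
    """Carry the source co-occurrence matrix a into the target vocabulary."""
    return matmul(matmul(transpose(t), a), t)

def distance(x, y):
    """Sum of absolute element-wise differences between two matrices."""
    return sum(abs(p - q) for rx, ry in zip(x, y) for p, q in zip(rx, ry))

# Co-occurrence of two source words s0, s1 (they co-occur with each other).
A = [[0.0, 1.0],
     [1.0, 0.0]]

# Co-occurrence of three target words t0, t1, t2 (only t0 and t1 co-occur).
B = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 0.0],
     [0.0, 0.0, 0.0]]

# Rows are source words, columns are target words; each row sums to 1.
# In the ambiguous matrix, s0 may translate to t0 or t2 with equal weight.
T_ambiguous = [[0.5, 0.0, 0.5],
               [0.0, 1.0, 0.0]]
T_resolved = [[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]]

d_ambiguous = distance(translated_cooccurrence(T_ambiguous, A), B)
d_resolved = distance(translated_cooccurrence(T_resolved, A), B)
```

In this toy setting the resolved matrix reproduces the target co-occurrence exactly, while the ambiguous one does not, which is the kind of signal a search for the best translation matrix could use.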

Automatic parallelization via matrix multiplication

Sato

Iwasaki

2011

SIGPLAN Not.

Existing work on the parallelization of complicated reductions and scans focuses only on formalism and hardly deals with implementation. To bridge the gap between formalism and implementation, we have integrated parallelization via matrix multiplication into compiler construction. Our framework can deal with complicated loops that existing techniques in compilers cannot parallelize. Moreover, we have refined our framework by developing two sets of techniques. One enhances its capability for parallelization by extracting max-operators automatically, and the other improves the performance of parallelized programs by eliminating redundancy. We have also implemented our framework and techniques as a parallelizer in a compiler. Experiments on examples that existing compilers cannot parallelize have demonstrated the scalability of programs parallelized by our implementation.
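As a rough, hand-written illustration of the general idea (not the paper's compiler framework), a loop whose state update uses only addition and max can be rewritten as repeated matrix multiplication over the (max, +) semiring. Because matrix multiplication is associative, the per-iteration matrices can then be combined in a balanced tree rather than strictly left to right. The loop chosen here (maximum prefix sum) and all function names are assumptions made for the example.

```python
NEG_INF = float("-inf")

def maxplus_matmul(x, y):
    """Matrix product over the (max, +) semiring:
    result[i][j] = max over k of (x[i][k] + y[k][j])."""
    return [[max(x[i][k] + y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]

def step_matrix(a):
    """One loop iteration  s = s + a; m = max(m, s)  written as a
    (max, +) matrix acting on the state column vector (s, m)."""
    return [[a, NEG_INF],   # s' = s + a
            [a, 0.0]]       # m' = max(s + a, m + 0)

def max_prefix_sum_loop(xs):
    """The original sequential loop."""
    s, m = 0.0, NEG_INF
    for a in xs:
        s = s + a
        m = max(m, s)
    return m

def tree_reduce(op, items):
    """Balanced pairwise reduction. The combinations within one level are
    independent of each other, so they could run in parallel."""
    items = list(items)
    while len(items) > 1:
        paired = [op(items[i], items[i + 1])
                  for i in range(0, len(items) - 1, 2)]
        if len(items) % 2:
            paired.append(items[-1])
        items = paired
    return items[0]

def max_prefix_sum_matrix(xs):
    """Same result as the loop, computed as a product of step matrices."""
    if not xs:
        return NEG_INF
    mats = [step_matrix(a) for a in xs]
    # A later iteration multiplies on the left of an earlier one.
    total = tree_reduce(lambda earlier, later: maxplus_matmul(later, earlier),
                        mats)
    initial_state = [[0.0], [NEG_INF]]
    return maxplus_matmul(total, initial_state)[1][0]

data = [3, -5, 4, 2, -1]
loop_result = max_prefix_sum_loop(data)
matrix_result = max_prefix_sum_matrix(data)
```

Here the tree reduction runs sequentially; the point of the sketch is only that the matrix form exposes an associative operator that a parallel reduction could exploit.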

Tupling calculation eliminates multiple data traversals

Iwasaki

Takeichi

et al. 1997

Tupling is a well-known transformation tactic to obtain new efficient recursive functions by grouping some recursive functions into a tuple. It may be applied to eliminate multiple traversals over a common data structure. The major difficulty in tupling transformation is to find which functions are to be tupled and how to transform the tupled function into an efficient one. Previous approaches to tupling transformation are essentially based on fold/unfold transformation. Though general, they suffer from the high cost of keeping track of function calls to avoid infinite unfolding, which prevents them from being used in a compiler. To remedy this situation, we propose a new method to expose recursive structures in recursive definitions and show how this structural information can be explored to calculate efficient programs by means of tupling. Our new tupling calculation algorithm can eliminate most multiple data traversals and is easy to implement.

ICFP '97, Amsterdam. © 1997 ACM.

1 Introduction

Tupling [Bir84, Chi93] is a well-known transformation tactic to obtain new efficient recursive functions without multiple traversals over a common data structure (or multiple data traversals for short), which is achieved by grouping some recursive functions into a tuple. As a typical example, consider the function deepest, which finds a list of leaves that are farthest away from the root of a given tree:

    deepest (Leaf(a))    = [a]
    deepest (Node(l, r)) = deepest(l),               depth(l) > depth(r)
                         = deepest(l) ++ deepest(r), depth(l) = depth(r)
                         = deepest(r),               otherwise
    depth (Leaf(a))      = 0
    depth (Node(l, r))   = 1 + max(depth(l), depth(r))

The infix binary function ++ concatenates two lists and the function max gives the maximum of its two arguments.

Though concise, this definition is quite inefficient because deepest and depth traverse the same input tree, giving many repeated computations in computing the depth of subtrees. It can, however, be improved with tupling transformation by grouping deepest and depth into a new function (say dd), i.e. dd t = (deepest t, depth t), giving the following efficient program.

    deepest t       = let (u, v) = dd t in u
    dd (Leaf(a))    = ([a], 0)
    dd (Node(l, r)) = (dpl, 1 + dl),        dl > dr
                    = (dpl ++ dpr, 1 + dl), dl = d…
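The deepest/depth example can be tried out directly. The following is a minimal Python sketch of both versions, assuming a binary tree with values at the leaves; the class names Leaf and Node follow the text, while deepest_tupled is a name introduced here. The final "otherwise" case of dd is cut off in the text above, so the version below completes it by symmetry with the first case, which is an assumption.

```python
from dataclasses import dataclass

@dataclass
class Leaf:
    value: object

@dataclass
class Node:
    left: object
    right: object

def depth(t):
    """Depth of a tree; a leaf has depth 0."""
    if isinstance(t, Leaf):
        return 0
    return 1 + max(depth(t.left), depth(t.right))

def deepest(t):
    """Naive version: depth is recomputed for the subtrees at every node,
    so the same subtrees are traversed many times."""
    if isinstance(t, Leaf):
        return [t.value]
    dl, dr = depth(t.left), depth(t.right)
    if dl > dr:
        return deepest(t.left)
    if dl == dr:
        return deepest(t.left) + deepest(t.right)
    return deepest(t.right)

def dd(t):
    """Tupled version: returns (deepest t, depth t) in one traversal."""
    if isinstance(t, Leaf):
        return ([t.value], 0)
    dpl, dl = dd(t.left)
    dpr, dr = dd(t.right)
    if dl > dr:
        return (dpl, 1 + dl)
    if dl == dr:
        return (dpl + dpr, 1 + dl)
    # Assumed completion of the truncated last case.
    return (dpr, 1 + dr)

def deepest_tupled(t):
    leaves, _ = dd(t)
    return leaves

tree = Node(Node(Leaf(1), Leaf(2)),
            Node(Leaf(3), Node(Leaf(4), Leaf(5))))
naive_result = deepest(tree)
tupled_result = deepest_tupled(tree)
```

Both functions return the same leaves; the difference is that dd visits each node once, whereas the naive deepest calls depth on both subtrees at every level.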

Formal derivation of efficient parallel programs by construction of list homomorphisms

Iwasaki

Takeichi

1997

ACM Trans. Program. Lang. Syst.

The use of list homomorphisms in parallel programming has been attracting much attention because they ideally suit the divide-and-conquer parallel paradigm. However, they have usually been treated rather informally and in an ad hoc manner in the development of efficient parallel programs. What is worse, some interesting functions, e.g., the maximum segment sum problem, are basically not list homomorphisms. In this article, we propose a systematic and formal way to construct a list homomorphism for a given problem so that an efficient parallel program is derived. We show, with several well-known but nontrivial problems, how a straightforward and "obviously" correct, but quite inefficient, solution to the problem can be successfully turned into a semantically equivalent "almost list homomorphism." The derivation is based on two transformations, namely tupling and fusion, which are defined according to the specific recursive structures of list homomorphisms.
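For the maximum segment sum problem mentioned above, a commonly shown way to obtain an "almost list homomorphism" is to tuple the result with three auxiliary values so that two partial results can be combined associatively, and then project out the first component. The sketch below is a hand-written illustration of that end result, not the derivation from the article; it assumes the empty segment (sum 0) is allowed, and all function names are introduced here.

```python
from functools import reduce

def mss_naive(xs):
    """Straightforward specification: the largest sum over all contiguous
    segments, where the empty segment counts with sum 0."""
    return max(sum(xs[i:j])
               for i in range(len(xs) + 1)
               for j in range(i, len(xs) + 1))

def unit(a):
    """Tuple for the singleton list [a]:
    (max segment sum, max prefix sum, max suffix sum, total sum)."""
    best = max(a, 0)
    return (best, best, best, a)

def combine(left, right):
    """Associative combination of the tuples of two adjacent sublists."""
    m1, p1, s1, t1 = left
    m2, p2, s2, t2 = right
    return (max(m1, m2, s1 + p2),   # best segment may straddle the split
            max(p1, t1 + p2),       # best prefix
            max(s2, s1 + t2),       # best suffix
            t1 + t2)                # total

IDENTITY = (0, 0, 0, 0)  # tuple for the empty list

def mss_hom(xs):
    """Homomorphism on tuples followed by a projection."""
    return reduce(combine, map(unit, xs), IDENTITY)[0]

def mss_chunked(xs, chunks=4):
    """Divide-and-conquer shape: reduce each chunk independently (this is
    the part that could run in parallel), then combine the partial tuples."""
    size = max(1, -(-len(xs) // chunks))  # ceiling division
    parts = [reduce(combine, map(unit, xs[i:i + size]), IDENTITY)
             for i in range(0, len(xs), size)]
    return reduce(combine, parts, IDENTITY)[0]

data = [3, -4, 2, -1, 2, 6, -5, 3, 2]
results = (mss_naive(data), mss_hom(data), mss_chunked(data))
```

The naive version examines every segment, while the tupled version makes a single pass and, because combine is associative, gives the same answer however the list is split into chunks.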

A Calculational Fusion System HYLO

Onoue

Takeichi

et al. 1997

A Skeletal Parallel Framework with Fusion Optimizer for GPGPU Programming

Sato

Iwasaki

2009


A library of constructive skeletons for sequential style of parallel programming
