Gangxiong Wu scite author profile

Gangxiong Wu

5 Publications

83 Citation Statements Received

66 Citation Statements Given

Affiliations

Nantong University, University of Electronic Science and Technology of China, Intel (United States)

Publications

Intel's Array Building Blocks: A retargetable, dynamic compiler and embedded language

Newburn

Liu

et al. 2011

Our ability to create systems with large amounts of hardware parallelism is exceeding the average software developer's ability to effectively program them. This is a problem that plagues our industry. Since the vast majority of the world's software developers are not parallel programming experts, making it easy to write, port, and debug applications with sufficient core and vector parallelism is essential to enabling the use of multi- and many-core processor architectures. However, hardware architectures and vector ISAs are also shifting and diversifying quickly, making it difficult for a single binary to run well on all possible targets. Because of this, retargetability and dynamic compilation are of growing relevance. This paper introduces Intel® Array Building Blocks (ArBB), which is a retargetable dynamic compilation framework. This system focuses on making it easier to write and port programs so that they can harvest data and thread parallelism on both multi-core and heterogeneous many-core architectures, while staying within standard C++. ArBB interoperates with other programming models to help meet the demands we hear from customers for a solution with both greater programmer productivity and good performance. This work makes contributions in language features, compiler architecture, code transformations, and optimizations. It presents performance data from the current beta release of ArBB and quantitatively shows the impact of some key analyses, enabling transformations, and optimizations for a variety of benchmarks that are of interest to our customers.

Dual‐band circularly polarised planar monopole antenna for WLAN/Wi‐Fi/Bluetooth/WiMAX applications

Wei

Ding

et al. 2018

IET Microwaves, Antennas & Propagation

A novel planar monopole antenna with dual‐band circular polarisation (CP) is presented here. The antenna consists of a modified L‐shaped monopole, an inverted‐L strip, and a modified ground. The modified L‐shaped monopole not only generates dual‐band operation but also excites CP radiation in the lower band. To achieve CP in the upper band, a parasitic inverted‐L strip is added on the top plane of the substrate and a modified ground is designed. The proposed antenna has been fabricated and tested; the measured −10 dB reflection coefficient bandwidths are 370 MHz (2.38–2.85 GHz) in the lower band and 2330 MHz (4.05–6.38 GHz) in the upper band. The corresponding measured 3 dB axial ratio (AR) bandwidths are 610 MHz (2.39–3 GHz) and 850 MHz (5.15–6 GHz), respectively. The overlapped −10 dB reflection coefficient and AR bandwidths fully cover the WLAN, Wi‐Fi, and Bluetooth bands and partly cover the WiMAX bands. The proposed antenna has a bidirectional radiation characteristic and reasonable gain in both the lower and upper bands.
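The "overlapped" bandwidth mentioned in this abstract is the frequency range where both the −10 dB reflection coefficient and the 3 dB axial ratio criteria hold at once. A minimal sketch of that arithmetic follows, using only the range endpoints quoted in the abstract, converted to MHz. The function and variable names are illustrative assumptions, not taken from the paper. Note that the quoted 2.38–2.85 GHz range spans 470 MHz, not the stated 370 MHz; the sketch uses the endpoints as printed.

```python
def overlap(band_a, band_b):
    """Return the (low, high) intersection of two bands in MHz, or None if disjoint."""
    low = max(band_a[0], band_b[0])
    high = min(band_a[1], band_b[1])
    return (low, high) if low < high else None


# Measured ranges quoted in the abstract, converted from GHz to MHz.
REFLECTION = {"lower": (2380, 2850), "upper": (4050, 6380)}   # -10 dB reflection coefficient
AXIAL_RATIO = {"lower": (2390, 3000), "upper": (5150, 6000)}  # 3 dB axial ratio

# Usable CP band = range satisfying both criteria simultaneously.
usable = {name: overlap(REFLECTION[name], AXIAL_RATIO[name]) for name in REFLECTION}

for name, (low, high) in usable.items():
    print(f"{name} band: {low}-{high} MHz ({high - low} MHz wide)")
```

With these inputs the overlap is 2390–2850 MHz (460 MHz) in the lower band and 5150–6000 MHz (850 MHz) in the upper band; in both bands the axial ratio range sets at least one edge of the usable window.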

Data and Computation Transformations for Brook Streaming Applications on Multiprocessors

Liao

et al.

Pillar: A Parallel Implementation Language

Anderson

Glew

Guo

et al.

As parallelism in microprocessors becomes mainstream, new programming languages and environments are emerging to meet the challenges of parallel programming. To support research on these languages, we are developing a low-level language infrastructure called Pillar (derived from Parallel Implementation Language). Although Pillar programs are intended to be automatically generated from source programs in each parallel language, Pillar programs can also be written by expert programmers. The language is defined as a small set of extensions to C. As a result, Pillar is familiar to C programmers, but more importantly, it is practical to reuse an existing optimizing compiler like gcc [1] or Open64 [2] to implement a Pillar compiler. Pillar's concurrency features include constructs for threading, synchronization, and explicit data-parallel operations. The threading constructs focus on creating new threads only when hardware resources are idle, and otherwise executing parallel work within existing threads, thus minimizing thread creation overhead. In addition to the usual synchronization constructs, Pillar includes transactional memory. Its sequential features include stack walking, second-class continuations, support for precise garbage collection, tail calls, and seamless integration of Pillar and legacy code. This paper describes the design and implementation of the Pillar software stack, including the language, compiler, runtime, and high-level converters (that translate high-level language programs into Pillar programs). It also reports on early experience with three high-level languages that target Pillar.

... for these languages will have strong similarities. Pillar factors out these similarities and provides a single set of components to ease the implementation and optimization of a compiler and its runtime for any parallel language.
The core idea of Pillar is to define a low-level language and runtime that can be used to express the sequential and concurrency features of higher-level parallel languages. The Pillar infrastructure consists of three main components: the Pillar language, a Pillar compiler, and the Pillar runtime.

A comprehensive study of hardware/software approaches to improve TLB performance for java applications on embedded systems

Peng

Lueh

Wu

et al. 2006

The working set size of Java applications on embedded systems has recently been increasing, causing the Translation Lookaside Buffer (TLB) to become a serious performance bottleneck. From a thorough analysis of the SPECjvm98 benchmark suite executing on a commodity embedded system, we find that TLB misses account for 24% to 50% of the total execution time. We explore and evaluate a wide spectrum of TLB-enhancing techniques with different combinations of software/hardware approaches, namely superpage for reducing TLB miss rates, two-level TLB and TLB prefetching for reducing both TLB miss rates and TLB miss latency, and even a no-TLB design for removing TLB overhead completely. We adapt these approaches and then extend them in a novel way to fit the design space of embedded systems executing Java code. We compare these approaches, discussing their performance behavior, software/hardware complexity and constraints, and especially the design implications for the application, runtime, and OS. We first conclude that even with the aggressive approaches presented, there remains a performance bottleneck with the TLB. Second, in addition to facing very different design considerations and constraints for embedded systems, proven hardware techniques such as TLB prefetching have different performance implications. Third, software-based solutions, namely the no-TLB design and superpaging, appear to be more effective in improving Java application performance on embedded systems. Finally, beyond performance, these approaches have their respective pros and cons; it is left to the system designer to make the appropriate engineering tradeoff.
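To put the 24% to 50% figure in perspective, a rough upper bound on the achievable speedup can be worked out with Amdahl's law: if a fraction f of execution time is spent on TLB misses, removing that time entirely yields at most a 1 / (1 − f) speedup. The sketch below is a back-of-the-envelope illustration under the idealised assumption that all TLB-miss time can be eliminated at no other cost; the function name is hypothetical and the result is not a measurement from the paper.

```python
def max_speedup(tlb_fraction):
    """Amdahl-style upper bound on speedup if all TLB-miss time were removed.

    tlb_fraction: share of total execution time attributed to TLB misses (0 <= f < 1).
    """
    if not 0.0 <= tlb_fraction < 1.0:
        raise ValueError("tlb_fraction must be in the range [0, 1)")
    return 1.0 / (1.0 - tlb_fraction)


# The two endpoints reported in the abstract for SPECjvm98.
for fraction in (0.24, 0.50):
    print(f"TLB share {fraction:.0%}: at most {max_speedup(fraction):.2f}x speedup")
```

Under this assumption the bound ranges from about 1.32x at a 24% TLB share to 2.00x at 50%. Real techniques recover only part of the TLB-miss time, which is consistent with the abstract's conclusion that a bottleneck remains.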
