Summary. For high-dimensional classification, it is well known that naive application of the Fisher discriminant rule performs poorly because of diverging spectra and the accumulation of noise. Researchers have therefore proposed independence rules to circumvent the diverging spectra, and sparse independence rules to mitigate the accumulation of noise. However, in biological applications a group of correlated genes is often responsible for clinical outcomes, and exploiting the covariance information can significantly reduce misclassification rates. In theory, the extent of such error-rate reductions is revealed by comparing the misclassification rates of the Fisher discriminant rule and the independence rule. To realize this gain with finite samples, a regularized optimal affine discriminant (ROAD) is proposed. The ROAD selects an increasing number of features as the regularization relaxes, and further benefits can be achieved when a screening method is employed to narrow the feature pool before the ROAD method is applied. An efficient constrained coordinate descent algorithm is developed to solve the associated optimization problems, and sampling properties of oracle type are established. Simulation studies and real data analysis support the theoretical results and demonstrate the advantages of the new classification procedure under a variety of correlation structures. A delicate result on continuous piecewise-linear solution paths for the ROAD optimization problem at the population level justifies the linear interpolation of the constrained coordinate descent algorithm.
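For concreteness, the optimization problem this abstract describes can be written at the population level as follows. This is a sketch in the paper's spirit; the symbols μ_1, μ_2, Σ and the l1 budget c are the usual notation for the two class means, the common covariance matrix and the regularization level, assumed here rather than quoted from the abstract.

\[
\mathbf{w}_c \;=\; \operatorname*{arg\,min}_{\mathbf{w}^{\top}\boldsymbol{\mu}_d \,=\, 1,\;\; \|\mathbf{w}\|_1 \,\le\, c}\; \mathbf{w}^{\top}\boldsymbol{\Sigma}\,\mathbf{w},
\qquad
\boldsymbol{\mu}_d \;=\; \boldsymbol{\mu}_1 - \boldsymbol{\mu}_2 .
\]

The resulting affine rule assigns a new observation x to class 1 when \(\mathbf{w}_c^{\top}\{\mathbf{x} - (\boldsymbol{\mu}_1 + \boldsymbol{\mu}_2)/2\} \ge 0\). Relaxing the budget c allows more coordinates of \(\mathbf{w}_c\) to become non-zero, which is the sense in which the ROAD selects an increasing number of features as the regularization relaxes.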
An umbrella algorithm and a graphical tool for asymmetric error control in binary classification.
In statistics and machine learning, classification studies how to automatically learn to make good qualitative predictions (i.e., assign class labels) based on past observations. Examples of classification problems include email spam filtering, fraud detection, and market segmentation. Binary classification, in which the class label can take only two values, arguably has the most widely used machine learning applications. Most existing binary classification methods target the minimization of the overall classification risk and may therefore fail to serve real-world applications such as cancer diagnosis, where users are more concerned with the risk of misclassifying one specific class than the other. The Neyman-Pearson (NP) paradigm was introduced in this context as a novel statistical framework for handling asymmetric type I/II error priorities: it seeks classifiers with minimal type II error subject to a type I error constraint at a user-specified level. Though NP classification has the potential to be an important subfield of the classification literature, it has not received much attention in the statistics and machine learning communities. This article surveys the current status of the NP classification literature. To stimulate readers' research interests, the authors also envision a few possible directions for future research in the NP paradigm and its applications.
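To make the asymmetric error control concrete, below is a minimal, self-contained Python sketch of the order-statistic thresholding idea used in NP umbrella-style procedures: a scoring classifier is trained, and the decision threshold is chosen from held-out class-0 scores so that the type I error exceeds the target level alpha with probability at most delta. The logistic-regression scorer, the split sizes, and the helper names are illustrative choices, not a prescription from the surveyed literature.

import numpy as np
from scipy.stats import binom
from sklearn.linear_model import LogisticRegression

def np_rank(n, alpha, delta):
    """Smallest rank k such that thresholding at the k-th smallest of n
    held-out class-0 scores gives P(type I error > alpha) <= delta."""
    for k in range(1, n + 1):
        # P(Binomial(n, 1 - alpha) >= k): probability that the k-th order
        # statistic still lets more than an alpha fraction of class 0 through
        if binom.sf(k - 1, n, 1 - alpha) <= delta:
            return k
    raise ValueError("too few class-0 points to certify this (alpha, delta)")

rng = np.random.default_rng(0)
X0 = rng.normal(0.0, 1.0, size=(400, 5))    # class 0: the error we must control
X1 = rng.normal(1.0, 1.0, size=(400, 5))    # class 1

X0_tr, X0_cal = X0[:200], X0[200:]          # hold out class-0 data for calibration
clf = LogisticRegression().fit(
    np.vstack([X0_tr, X1]), np.r_[np.zeros(200), np.ones(400)]
)                                           # any scoring classifier would do

cal_scores = np.sort(clf.predict_proba(X0_cal)[:, 1])
k = np_rank(len(cal_scores), alpha=0.05, delta=0.05)
threshold = cal_scores[k - 1]               # k-th smallest calibration score

# predict class 1 only when the score clears the calibrated threshold
predict = lambda X_new: (clf.predict_proba(X_new)[:, 1] > threshold).astype(int)

Unlike thresholding at a plain empirical quantile, the binomial tail calculation yields a high-probability guarantee on the population type I error, which is the defining feature of the NP paradigm.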
This work demonstrates that a set of commercial and scale-out applications make significant use of superpages and thus suffer from the fixed and small superpage TLB structures of some modern core designs. Other processors cope better with superpages, but at the expense of power-hungry and slow fully-associative TLBs. We consider alternative designs that allow all page sizes to freely share a single, power-efficient and fast set-associative TLB. We propose a prediction-guided multi-grain TLB design that uses a superpage prediction mechanism to avoid multiple lookups in the common case. In addition, we evaluate the previously proposed skewed TLB [1], which builds on principles similar to those used in skewed-associative caches [2], and we enhance the original skewed TLB design by using page-size prediction to increase its effective associativity. Our prediction-based multi-grain TLB design delivers more hits and is more power-efficient than existing alternatives. The predictor uses a 32-byte prediction table indexed by base-register values.

I. INTRODUCTION

Over the last 50 years, virtual memory has been an intrinsic facility of computer systems, providing each process with the illusion of an equally large and contiguous address space while enforcing isolation and access control. Page tables act as gatekeepers: they maintain the mappings of virtual pages to physical frames (i.e., translations) along with additional information (e.g., access privileges). Except for a reserved part of memory, any code or data structure that currently resides in the computer's physical memory has such a translation. Page tables are usually organized as multi-level, hierarchical tables, with four levels being common for 64-bit systems; multiple sequential memory references are therefore necessary to retrieve the translation of the smallest supported page size. Hardware translation lookaside buffers (TLBs) cache translations that result from accessing the page table (a page walk). A TLB access is on the critical path of every instruction fetch and memory reference, because the translation is needed to complete the tag comparison in physically tagged L1 caches; a short TLB latency is therefore crucial. Several technology trends compound to make TLB performance and energy critical in today's systems. Physical memory sizes and application footprints have been increasing without a commensurate increase in TLB size, and thus coverage. As a result, while TLBs still reap the benefits of spatial and temporal locality thanks to the coarse tracking granularity of their entries, they now fall short of growing workload footprints. The use of superpages (i.e., large contiguous virtual memory regions that map to contiguous physical frames) can extend TLB coverage.
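To illustrate the lookup flow the abstract describes, the toy Python model below probes a single set-associative TLB shared by two page sizes, using a small predictor indexed by bits of the base-register value to pick which size to try first; a correct prediction costs one probe, a misprediction two. All concrete choices here (page sizes, table geometry, the hash, and the replacement policy) are assumptions for illustration, not the paper's exact design.

PAGE_SIZES = {"base": 4 * 1024, "super": 2 * 1024 * 1024}  # 4 KiB / 2 MiB (assumed)
PRED_ENTRIES = 32            # the paper's predictor is 32 bytes; entry layout assumed
TLB_SETS, TLB_WAYS = 64, 4   # assumed geometry

predictor = ["base"] * PRED_ENTRIES        # one predicted size per entry
tlb = [[] for _ in range(TLB_SETS)]        # each set holds (vpn, size) tags

def predict_size(base_reg):
    # index the predictor with low-order bits of the base-register value
    return predictor[(base_reg >> 12) % PRED_ENTRIES]

def lookup(vaddr, base_reg):
    """Probe with the predicted page size first; retry with the other size
    on a miss (the second probe models the misprediction penalty)."""
    first = predict_size(base_reg)
    order = [first] + [s for s in PAGE_SIZES if s != first]
    for probes, size in enumerate(order, start=1):
        vpn = vaddr // PAGE_SIZES[size]
        if (vpn, size) in tlb[vpn % TLB_SETS]:
            return "hit", size, probes
    return "miss", None, len(order)

def fill(vaddr, size, base_reg):
    vpn = vaddr // PAGE_SIZES[size]
    s = vpn % TLB_SETS
    tlb[s] = ([(vpn, size)] + [e for e in tlb[s] if e != (vpn, size)])[:TLB_WAYS]
    predictor[(base_reg >> 12) % PRED_ENTRIES] = size   # train on the true size

fill(0x7F3200001234, "super", base_reg=0x7F3200000000)
print(lookup(0x7F3200005678, base_reg=0x7F3200000000))  # ('hit', 'super', 1)

The predictor matters because the set index depends on the page size (different offset bits are stripped): without a prediction, a shared set-associative TLB would need one probe per supported page size, whereas predicting the size restores the single-probe common case while keeping the power-efficient set-associative structure.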
Abstract. Though more and more researchers have realized the importance of creativity in software development, few empirical studies on this topic have been reported. In this paper we present exploratory empirical research that studies several questions about creativity in software development: which development phases are perceived to involve more creative work; whether UML-based documentation makes developers perceive that more time is devoted to creative work; whether more creative work accelerates software development; and whether developers prefer to do creative work. Based on the analysis of the results, we propose four hypotheses to direct future research in this field and discuss the challenge: 'since developers do not like to participate in quality-improving (quality assurance) activities, how can we maintain and improve software quality effectively and efficiently?'