Alan Chan scite author profile

Alan Chan

5 Publications

97 Citation Statements Received

88 Citation Statements Given


Affiliations

University of Alberta

Publications


Automatic prediction of tumour malignancy in breast cancer with fractal dimension

2016


Breast cancer is one of the most prevalent types of cancer today in women. The main avenue of diagnosis is through manual examination of histopathology tissue slides. Such a process is often subjective and error-ridden, suffering from both inter- and intraobserver variability. Our objective is to develop an automatic algorithm for analysing histopathology slides free of human subjectivity. Here, we calculate the fractal dimension of images of numerous breast cancer slides, at magnifications of 40×, 100×, 200× and 400×. Using machine learning, specifically, the support vector machine (SVM) method, the F1 score for classification accuracy of the 40× slides was found to be 0.979. Multiclass classification on the 40× slides yielded an accuracy of 0.556. A reduction of the size and scope of the SVM training set gave an average F1 score of 0.964. Taken together, these results show great promise in the use of fractal dimension to predict tumour malignancy.
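The abstract does not say how the fractal dimension is estimated. As an illustration only, the sketch below assumes the common box-counting estimator applied to a binarised image, which may differ from the authors' method. The names `box_count` and `box_counting_dimension` are hypothetical, and the classification step (the SVM) is not shown.

```python
import math


def box_count(image, size):
    """Count the size x size boxes that contain at least one foreground pixel.

    `image` is a list of rows of 0/1 values (an assumed binarised slide image).
    """
    rows, cols = len(image), len(image[0])
    count = 0
    for r0 in range(0, rows, size):
        for c0 in range(0, cols, size):
            occupied = any(
                image[r][c]
                for r in range(r0, min(r0 + size, rows))
                for c in range(c0, min(c0 + size, cols))
            )
            if occupied:
                count += 1
    return count


def box_counting_dimension(image, sizes=(1, 2, 4, 8)):
    """Estimate the fractal dimension as the least-squares slope of
    log N(s) against log(1/s), where N(s) is the occupied-box count.

    Assumes the image has at least one foreground pixel, so N(s) > 0.
    """
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(box_count(image, s)) for s in sizes]
    n = len(sizes)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance
```

A completely filled square should give a dimension close to 2 and a single straight line of pixels a dimension close to 1, which is a quick sanity check before using the value as a classifier feature.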


Fixed-Horizon Temporal Difference Methods for Stable Reinforcement Learning

Asis, Chan, Pitis, et al. 2020. AAAI


We explore fixed-horizon temporal difference (TD) methods, reinforcement learning algorithms for a new kind of value function that predicts the sum of rewards over a fixed number of future time steps. To learn the value function for horizon h, these algorithms bootstrap from the value function for horizon h−1, or some shorter horizon. Because no value function bootstraps from itself, fixed-horizon methods are immune to the stability problems that plague other off-policy TD methods using function approximation (also known as “the deadly triad”). Although fixed-horizon methods require the storage of additional value functions, this gives the agent additional predictive power, while the added complexity can be substantially reduced via parallel updates, shared weights, and n-step bootstrapping. We show how to use fixed-horizon value functions to solve reinforcement learning problems competitively with methods such as Q-learning that learn conventional value functions. We also prove convergence of fixed-horizon temporal difference methods with linear and general function approximation. Taken together, our results establish fixed-horizon TD methods as a viable new way of avoiding the stability problems of the deadly triad.
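The bootstrapping scheme described in the abstract can be sketched in the tabular case as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function name `fixed_horizon_td`, the `(state, reward, next_state)` transition format, and the step size `alpha` and discount `gamma` parameters are all chosen here for the example.

```python
def fixed_horizon_td(transitions, n_states, horizon, alpha=0.1, gamma=1.0):
    """Tabular fixed-horizon TD(0) sketch.

    V[h][s] estimates the (discounted) sum of rewards over the next h steps
    from state s. V[0] stays at zero, and V[h] bootstraps only from V[h - 1],
    so no value function ever bootstraps from itself.
    """
    V = [[0.0] * n_states for _ in range(horizon + 1)]
    for s, r, s_next in transitions:
        # Go from the longest horizon down, so each target uses the
        # V[h - 1] estimate from before this transition was processed.
        for h in range(horizon, 0, -1):
            target = r + gamma * V[h - 1][s_next]
            V[h][s] += alpha * (target - V[h][s])
    return V


# Example: a two-state cycle with reward 1 on every step. The h-step
# value of either state should approach h.
transitions = [(0, 1.0, 1), (1, 1.0, 0)] * 200
values = fixed_horizon_td(transitions, n_states=2, horizon=3, alpha=0.5)
```

Storing one table per horizon is the extra memory cost the abstract mentions; the shared-weight and n-step variants it refers to are not shown here.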


On the catalytic hydrolysis of carbonyl sulphide over gamma‐alumina

Chan, Lana. 1978. Can J Chem Eng


Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences

Chan, Silva, Lim, et al. 2021. Preprint


Approximate Policy Iteration (API) algorithms alternate between (approximate) policy evaluation and (approximate) greedification. Many different approaches have been explored for approximate policy evaluation, but less is understood about approximate greedification and what choices guarantee policy improvement. In this work, we investigate approximate greedification when reducing the KL divergence between the parameterized policy and the Boltzmann distribution over action values. In particular, we investigate the difference between the forward and reverse KL divergences, with varying degrees of entropy regularization. We show that the reverse KL has stronger policy improvement guarantees, but that reducing the forward KL can result in a worse policy. We also demonstrate, however, that a large enough reduction of the forward KL can induce improvement under additional assumptions. Empirically, we show on simple continuous-action environments that the forward KL can induce more exploration, but at the cost of a more suboptimal policy. No significant differences were observed in the discrete-action setting or on a suite of benchmark problems. Throughout, we highlight that many policy gradient methods can be seen as an instance of API, with either the forward or reverse KL for the policy update, and discuss next steps for understanding and improving our policy optimization algorithms.
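For a single state with discrete actions, the two objectives compared in the abstract can be sketched as below. This assumes the usual convention that the reverse KL is KL(policy || Boltzmann) and the forward KL is KL(Boltzmann || policy), and it treats the `temperature` parameter as the entropy-regularization strength. The function names are hypothetical and are not taken from the paper.

```python
import math


def boltzmann(q_values, temperature=1.0):
    """Boltzmann (softmax) distribution over action values."""
    max_q = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - max_q) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def kl(p, q):
    """KL(p || q) for discrete distributions; terms with p_i = 0 contribute 0.

    Assumes q_i > 0 wherever p_i > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def greedification_losses(policy, q_values, temperature=1.0):
    """Return (forward KL, reverse KL) between a policy and the Boltzmann target."""
    target = boltzmann(q_values, temperature)
    forward = kl(target, policy)  # KL(Boltzmann || policy)
    reverse = kl(policy, target)  # KL(policy || Boltzmann)
    return forward, reverse


# Example: a uniform policy against action values that favour the last action.
forward, reverse = greedification_losses([1 / 3, 1 / 3, 1 / 3], [1.0, 2.0, 3.0])
```

Both losses are zero only when the policy equals the Boltzmann target. Away from that point they generally differ, which is why reducing one does not automatically reduce the other.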


Efficient decorrelation of features using Gramian in Reinforcement Learning

Mavrin, Graves, Chan. 2019. Preprint

