Final accepted version (with author's formatting). This version is available at: http://eprints.mdx.ac.uk/22276/
ccfong@umac.mo, suashdeb@gmail.com, x.yang@mdx.ac.uk

Abstract. Deep learning (DL) is one of the emerging types of contemporary machine learning techniques that mimic the cognitive patterns of the animal visual cortex to learn new abstract features automatically through deep, hierarchical layers. DL is believed to be a suitable tool for extracting insights from very large volumes of so-called big data. Nevertheless, one of the three "V"s of big data is velocity, which implies that learning has to be incremental because data accumulate rapidly.
DL must therefore be fast and accurate. By design, DL extends the feed-forward artificial neural network with many hidden layers of neurons, forming a deep neural network (DNN). Training a DNN is inefficient because it requires a very long training time. Obtaining the most accurate DNN within a reasonable run-time is a challenge, given the potentially many parameters in the DNN model configuration and the high dimensionality of the feature space in the training dataset. Metaheuristics have a history of optimizing machine learning models successfully. How well metaheuristics can be used to optimize DL in the context of big data analytics is the thematic topic we consider in this paper. As a position paper, we review recent advances in applying metaheuristics to DL, discuss their pros and cons, and point out some feasible research directions for bridging the gaps between metaheuristics and DL.

Keyword...